GraphQL interface to Ethereum node data

Arachnid · March 1, 2019, 5:19am

Good thought. I hadn’t recognised the significance of using ID to other generic graphql tools.

I think your option 2 is better - make IDs opaque to callers, with no guarantees except they’re unique.

Arachnid · March 7, 2019, 10:29pm

I’ve updated the EIP to add a Pending type, and to move account, call and estimateGas to Pending and Block. This also makes it possible now to query transactions in the pending pool, which the schema didn’t previously offer.

Pending has a subset of fields from Block that make sense based on the available data.

I’ve also written a PR updating the geth implementation to reflect these changes.

zyfrank · May 3, 2019, 12:42am

It is a little strange for the query block\miner to have a (block: Long) argument.

I'm not sure whether anyone needs to query one block for another block's miner.

shemnon · May 3, 2019, 7:15pm

Question about output formats. From the current spec

# BigInt is a large integer. Input is accepted as either a JSON number or as a string.
# Strings may be either decimal or 0x-prefixed hexadecimal. Output values are all
# 0x-prefixed hexadecimal.
scalar BigInt
# Long is a 64 bit unsigned integer.
scalar Long

When outputting the JSON for a transaction, some of the fields are Long, such as gasUsed. Should those be output as standard JSON numbers, or as hex strings as though they were BigInt? The lack of formatting instructions leads me to believe the latter.
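A minimal sketch of the BigInt convention quoted above, assuming Python and two hypothetical helper names (parse_bigint, format_bigint) that do not come from the spec. It accepts the three input forms and always emits 0x-prefixed hexadecimal; how Long should be emitted is the open question here and is deliberately left out.

```python
def parse_bigint(value):
    """Parse a BigInt input: a JSON number, a decimal string, or a 0x-hex string."""
    if isinstance(value, bool):
        # bool is a subclass of int in Python, so reject it explicitly.
        raise TypeError("booleans are not valid BigInt inputs")
    if isinstance(value, int):
        return value
    if isinstance(value, str):
        if value.startswith(("0x", "0X")):
            return int(value[2:], 16)
        return int(value, 10)
    raise TypeError(f"unsupported BigInt input: {value!r}")


def format_bigint(n):
    """Format a non-negative integer as 0x-prefixed hexadecimal."""
    if n < 0:
        raise ValueError("this sketch only handles non-negative values")
    return hex(n)
```

All three spellings of the same value, 255, "255" and "0xff", parse to the same integer, and format_bigint turns it back into "0xff".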

Arachnid · May 3, 2019, 11:55pm

Any argument that specifies an account has a block number - so you can specify what block you want to fetch that account at.

Good point, we should specify this. The intention is that longs are read and formatted as numbers; GraphQL only specifies a 31 bit integer type, and it’s useful to be able to use a longer (52 bit, safely in Javascript) numeric type.

shemnon · May 4, 2019, 1:28am

Also, it may be worth referencing the GraphQL website when saying “implement a graphql endpoint” because there are about 3 ways to do it: GET, POST application/json, and POST application/graphql. No need to spell it out in the spec, just reference their page: https://graphql.org/learn/serving-over-http/
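The three transports listed on that page can be sketched as follows. This only builds the request objects and sends nothing; the endpoint URL and the query are placeholders chosen for illustration, not values taken from the EIP.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

ENDPOINT = "http://localhost:8080/graphql"  # hypothetical URL, not from the EIP
QUERY = "{ block { number } }"              # illustrative query


def as_get(query):
    """GET: the query travels URL-encoded in the 'query' parameter."""
    return Request(f"{ENDPOINT}?{urlencode({'query': query})}", method="GET")


def as_post_json(query):
    """POST application/json: the body is a JSON object with a 'query' key."""
    body = json.dumps({"query": query}).encode("utf-8")
    return Request(ENDPOINT, data=body,
                   headers={"Content-Type": "application/json"}, method="POST")


def as_post_graphql(query):
    """POST application/graphql: the body is the raw query text."""
    return Request(ENDPOINT, data=query.encode("utf-8"),
                   headers={"Content-Type": "application/graphql"}, method="POST")
```

Passing any of these to urllib.request.urlopen would send it, assuming a server is actually listening at the placeholder address.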

MicahZoltu · May 24, 2019, 7:17am

Recommend changing the recommended default port to 80, since that is the standard port for HTTP traffic. Most clients who implement this will likely provide a mechanism to override the port selection, but I do not think this standard should recommend against an incredibly well established standard as the default.

Alternatively, remove the port and endpoint path from this standard entirely, as it isn’t something that needs to be standardized. Discovery of a GraphQL endpoint is necessary, and during that discovery process acquiring the port, path, and ip/domain can all be done at the same time. Since you already have to discover the ip/domain, discovery of the port and path along with that is totally reasonable (and likely going to happen anyway).

MicahZoltu · May 24, 2019, 7:25am

Recommend changing from an HTTP recommendation to an HTTPS recommendation. Alternatively, perhaps recommend HTTPS if bound to anything other than 127.0.0.1 and HTTP when bound to 127.0.0.1 (I recognize that SSL certificates that browsers accept are complicated when self-signed).

MicahZoltu · May 24, 2019, 7:41am

Bundling my comments since apparently Discourse doesn’t like multiple comments in a row.

Why 0x-prefixed hex strings for Bytes32, Address, Bytes, and BigInt? Base32, Base58 and Base64 all compress data better and are well standardized so relatively easy to extract data from in any language. The 0x prefix not only is wasted bytes on the wire, but it also makes extracting the data more complicated in most cases as you first have to strip the 0x characters off of the string (exception for JavaScript which is notorious for “guessing” what you mean when processing data, and it will treat a string that starts with a 0x as a hex string, even if it isn’t).
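For a sense of scale, here is a rough sketch comparing encoded lengths using Python's standard base64 module. Base58 is omitted because the standard library has no encoder for it, and the helper name is hypothetical.

```python
import base64


def encoded_lengths(raw: bytes) -> dict:
    """Return the character count of each textual encoding of raw."""
    return {
        "0x-hex": len("0x" + raw.hex()),
        "base32": len(base64.b32encode(raw)),  # includes '=' padding
        "base64": len(base64.b64encode(raw)),  # includes '=' padding
    }


bytes32_sizes = encoded_lengths(bytes(32))  # e.g. a Bytes32 hash
address_sizes = encoded_lengths(bytes(20))  # e.g. a 20-byte address
```

For a 32-byte value this gives 66 characters as 0x-hex, 56 as Base32 and 44 as Base64; for a 20-byte value it gives 42, 32 and 28.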

GraphQL doesn’t specify a wire serialization mechanism, though JSON is certainly the most common. This specification should state what the wire serialization will be. https://graphql.github.io/graphql-spec/draft/#sec-Serialization-Format

If JSON serialization is used, then the Long type may be problematic, pragmatically, as all JavaScript deserializers I’m aware of automatically deserialize any JSON number into a JavaScript number, which cannot hold a 64-bit unsigned integer. Ideally, these would be deserialized into a bigint, but at the moment bigint isn’t supported (per spec) in JSON (de)serializing. Consider putting Longs into encoded strings like BigInt for usability reasons. Alternatively, consider specifying Long as 52 bits, since I believe every language that is used in the Ethereum ecosystem currently supports integers up to 52 bits wide.
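The precision concern can be reproduced outside JavaScript. The sketch below assumes Python and uses json.loads with parse_int=float to imitate a deserializer that stores every JSON number as an IEEE 754 double; the helper names are illustrative only.

```python
import json

U64_MAX = 2**64 - 1  # largest value a 64-bit unsigned Long could carry


def decode_like_double(text):
    """Decode JSON while forcing every integer through a double."""
    return json.loads(text, parse_int=float)


def survives_round_trip(n):
    """True if n is unchanged after being sent as a JSON number."""
    return int(decode_like_double(json.dumps(n))) == n


def string_round_trip(n):
    """Send n as a 0x-hex string instead, as is done for BigInt."""
    return int(json.loads(json.dumps(hex(n))), 16)
```

Under these assumptions 2**53 survives, while 2**53 + 1 and the largest 64-bit value do not; a string-encoded value round-trips exactly regardless of size.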

All arrays should be not-null (!). Currently, a number of array properties are set as nullable (IIUC). In almost all cases, if there is no data an empty array should be returned. The only time a nullable array should be used is if you need to differentiate between the empty set and a sentinel set.

Specify what null means for any nullable property. Why is Block.parent nullable? Is this for the genesis block? The comment should specify what null means. This is one example of the problem, but in general anywhere a property has a sentinel value (such as null), the comment should indicate what that value means.

Recommend using consistent indentation in the GraphQL schema. At the moment, lines are not consistently indented.

The new interface doesn’t appear to support operating against the pending block. I’m not against this (and in fact, I’m generally for it), but the Backward Compatibility section should mention that.

The behavior when both parameters for Query.block(number, hash) are provided should be specified in the comments. The same goes for any function that has mutually exclusive parameters.
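One possible rule, sketched here as an assumption rather than as what the EIP specifies, is to reject a request that supplies both selectors. The function name, the returned tuples, and the fallback to the latest block when neither argument is given are all illustrative choices.

```python
def resolve_block_selector(number=None, block_hash=None):
    """Decide which lookup Query.block(number, hash) should perform."""
    if number is not None and block_hash is not None:
        # Assumed policy: treat the two arguments as mutually exclusive.
        raise ValueError("only one of 'number' and 'hash' may be supplied")
    if number is not None:
        return ("by_number", number)
    if block_hash is not None:
        return ("by_hash", block_hash)
    # Assumed default when neither selector is given.
    return ("latest", None)
```

The alternatives would be to let one argument silently win or to require that both identify the same block; whichever is chosen, the point is that the schema comment should say so.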

Recommend versioning the protocol. This can be done via the recommended default path if that is retained, e.g., /graphql/v1.

Recommend a discovery endpoint or query a user can make that will give details about the current protocol version. This could be used when doing service discovery to find out if graphql is available from the server, what path/port it lives at, and what version(s) of the protocol it supports. While it is possible for the app to just probe a number of different endpoints, it is simpler if there is a single GET request that can be made to a well known path on a server to see if it supports GraphQL and if so, what versions and where.

Add ability to query for Chain ID. This is necessary for replay protection as well as making it easier to properly alert the user when they are communicating with an endpoint that is not delivering data from the correct chain. Also, since signing is extracted out (I’m a fan), we need a way to ensure that the signer is speaking to the same chain as the GraphQL provider.

Consider adding support for fetching the output data of an on-chain transaction. Since light clients may not have access to this, especially for ancient transactions, it could be a nullable field. Alternatively, can someone champion an EIP to get return data into the receipts please? Even if it was constrained to 32 bytes, this would be a huge boon to dapp development. It should be noted that at the moment dapps that need return data write an event log with their transaction that contains the result; this wastes more resources than if return data was simply included in the receipt directly.

It appears that the Log filter has been ported forward basically as is. I feel like we can probably do better with the log filter query language than the current 2 dimensional array. I am not familiar enough with GraphQL’s query language to be able to assert that we can do better, but it feels like we could (for example by supporting logical AND and NOT).
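For reference, the 2 dimensional topic filter inherited from JSON-RPC can be sketched as a matching function: entries are ANDed by position, and the alternatives inside one entry are ORed. This is a simplified model, assuming that None or an empty list is a wildcard and that a filter longer than the log's topic list never matches; the function name is hypothetical.

```python
def matches_topics(log_topics, topic_filter):
    """Return True if a log's topics satisfy a positional topic filter."""
    if len(topic_filter) > len(log_topics):
        return False  # assumed: the filter constrains positions the log lacks
    for position, wanted in enumerate(topic_filter):
        if not wanted:
            continue  # None or [] acts as a wildcard for this position
        alternatives = wanted if isinstance(wanted, list) else [wanted]
        if log_topics[position] not in alternatives:
            return False  # positions are ANDed, alternatives are ORed
    return True
```

There is no way to express "topic 1 is not X" or an OR that spans two positions in this shape, which is the expressiveness gap raised here.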

shemnon · May 26, 2019, 5:56am

This conforms to the JSON-RPC conventions. If we are worried about wire compression and HTTP GZip isn’t sufficient then we shouldn’t be using a JSON result but RLP.

The fields using Long won’t reasonably exceed 52 bits (gas, indexes, status, counts, protocol versions, block numbers). Bringing that restriction in will never matter in practice, and it would leak the JSON encoding into the spec.

There is a root level Pending object that you can check account details, estimate gas, etc. That looks to be how such queries are done. What constitutes the “pending” query would be nice to have specified.

This would be useful not just for GraphQL but all standard endpoints: JSON-RPC, WebSockets, Discovery, etc.

This is a good idea, but we will have to wait and see which ChainID EIP wins. If only the current ChainID is exposed, it could be under Query; if not, it would need to be on Block, since the ChainID can change over the chain’s history.

It should be noted that storing receipts is one of the biggest sections of unused data in a blockchain sync, so adding more data to the receipt will meet with some pushback.

There are no good GraphQL facilities for this unless a query structure is encoded into the schema. The 2 dimensional array has proven to work with a minimum of overhead.

MicahZoltu · May 26, 2019, 6:56am

This new endpoint gives us an opportunity to fix mistakes of the past. I am of the opinion that 0x prefixed hex encoding was a mistake in the Ethereum JSON-RPC API, and I would like to see us not port that mistake forward.

IIUC, you are arguing that the specification actually is 52-bits, not 64-bits, but in a way that is undocumented. If a future version of this specification includes a new variable that is > 52-bits, it would be valid but it would cause a bunch of problems. If the intent is that a Long is never greater than 52-bits then the specification should indicate that, rather than just having it be tribal knowledge.

This is not a full replacement for the Ethereum JSON-RPC API. For example, you can do an eth_call against a pending block via the Ethereum JSON-RPC, but I don’t believe you can do that here. Again, I think this is good, but it should be mentioned in the Backward Compatibility section.

As a dapp developer I can assert that this is not the case. Building dapps against the log filter system is a PITA: not only is it difficult to understand, but even after you understand it, it is difficult to read and comprehend. It also is not as expressive as is often desired, and dapps I have built have had to work around its lack of expressiveness. Overall I think that the current 2d array solution results in a poor developer experience and an inability to execute certain queries.

shemnon · May 26, 2019, 2:07pm

You use the pending object and do a call, just like with the query object:

{
  pending {
    call(data: {from: "0xa94f5374fce5edbc8e2a8697c15331677e6ebf0b", to: "0x6295ee1b4f6dd65047762f924ecd367c17eabf8f", data: "0x12a7b914"}) {
      data
      status
    }
  }
}

Arachnid · May 26, 2019, 8:33pm

Fair enough.

I generally agree that Ethereum’s use of 0x prefixed hex everywhere is problematic. I’m open to consider converting some of these types to base64.

Address should definitely remain 0x-prefixed hex, as it’s the canonical textual representation of an Ethereum address.

BigInt currently supports numbers, decimal strings, and 0x-prefixed hexadecimal; that’s consistent and I don’t think it should be changed.

The filter format is restricted by what can be efficiently filtered for in a bloom filter. You’re welcome to propose an alternate filter syntax, though!

rjl493456442 · July 24, 2019, 3:13am

Can we consider adding some light client friendly queries? For example, add an additional query method named header(number: Long, hash: Bytes32): Header.

Currently we can retrieve header information via JSON-RPC or GraphQL. But essentially these RPCs need to retrieve the block first and then return a header instance.

On the Geth side, we just merged a PR https://github.com/ethereum/go-ethereum/pull/19669 to implement the GetHeaderBy* API so that we can retrieve the header directly, which is super friendly to light clients.

What’s your opinion about it, @Arachnid?

Arachnid · July 24, 2019, 9:15pm

Presently the geth implementation fetches the header, and only fetches the full block when required, so a special API shouldn’t be required.

shemnon · July 24, 2019, 9:40pm

I agree with @Arachnid, this sounds like an implementation optimization that doesn’t need to bleed into the API. This would also involve introducing a Header type where the only real difference would be the inability to enumerate ommers and transactions. Ommers (which are just headers) are already modeled as blocks.

adamschmideg · July 30, 2019, 11:22am

I’m interested in moving this EIP forward. I think a few things are missing:

Merge the schema of the proposal and the EthQL implementation. Update the proposal in some cases and update EthQL in the rest.
Agree on a Pagination concept.
Define error codes, maybe in accordance with JSON RPC Error Codes Improvement Proposal. See the GraphQL spec on errors.
Testing and test cases, hopefully aligned with JSON RPC test cases.
A champion who is motivated in getting this EIP accepted, has an in-depth knowledge of API design, GraphQL, and hopefully of the implementation details of one or more clients.
A commitment from some client teams that they would implement it. It’s already merged in go-ethereum and released in v1.9.0. Trinity has an open issue for it. I don’t know of other clients.
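As background for the error-code item, the GraphQL spec returns failures in a top-level "errors" list whose entries carry a "message" and may carry an "extensions" map. The sketch below shows how a client might surface them; the exception class, the function name, and the idea of placing a code under extensions are assumptions for illustration, not part of the proposal.

```python
import json


class GraphQLRequestError(Exception):
    """Raised when a GraphQL response contains an 'errors' list."""

    def __init__(self, errors):
        messages = "; ".join(e.get("message", "<no message>") for e in errors)
        super().__init__(messages)
        self.errors = errors


def unwrap_response(body: str):
    """Return the 'data' member of a response, or raise if it reports errors."""
    payload = json.loads(body)
    errors = payload.get("errors")
    if errors:
        # Simplification: any partial 'data' sent alongside errors is discarded.
        raise GraphQLRequestError(errors)
    return payload.get("data")
```

A standardized set of codes would give clients something stable to switch on instead of parsing message strings.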

shemnon · July 30, 2019, 3:52pm

I think these goals would fit better in a new “GraphQL revision 2” EIP. As a first revision, EIP-1767 is implemented and operational; there is no need to make the perfect the enemy of the good by going back and changing what is out there.

Pantheon also shipped support for GraphQL, so there are at least 2 clients with fully operational support deployed.

adamschmideg · July 31, 2019, 2:44pm

Let’s go one by one.

Unify EthQL and EIP schemas → I think there are minor changes already proposed in this thread, like changing Long to BlockNumber. But it’s not a showstopper, I agree.
Pagination → Not a showstopper.
Error codes → Not a showstopper, but definitely a big win over the current JSON RPC.
Test cases → EthQL and Pantheon already have them. A pointer to Pantheon test cases may suffice.
Champion → I think we need one
Clients → yep, 2 clients supporting it and 1 WIP sounds nice.

adamschmideg · July 31, 2019, 3:00pm

This is a proposal for how to unify the top level queries in the EthQL and the current EIP schema. It includes some changes to the EIP schema.

| EthQL | Current EIP 1767 | Proposed EIP 1767 | Update in EthQL | Update in EIP | Note |
| --- | --- | --- | --- | --- | --- |
| _: String | | - | - | - | Only in EthQL |
| account(address: Address!): Account | | account(address: Address!): Account | - | Add it | |
| block(number: BlockNumber, hash: Bytes32, tag: BlockTag): Block | block(number: Long, hash: Bytes32): Block | block(number: BlockNumber, hash: Hash): Block | Change Bytes32 to Hash for hash | Change Long to BlockNumber for number, Bytes32 to Hash for hash | |
| blockOffset(number: BlockNumber, hash: Bytes32, tag: BlockTag, offset: Int!): Block | | - | - | - | Only in EthQL |
| blocks(numbers: [BlockNumber], hashes: [Bytes32]): [Block] | blocks(from: Long!, to: Long): [Block!]! | blocks(numbers: [BlockNumber], hashes: [Hash]): [Block] | Change Bytes32 to Hash for hash | Support arbitrary numbers and hashes. Drop to and from args (see blocksRange for that functionality) | |
| blocksRange(numberRange: [BlockNumber], hashRange: [Bytes32]): [Block] | | blocksRange(numberRange: [BlockNumber], hashRange: [Bytes32]): [Block] | - | Add it | |
| | gasPrice: BigInt! | gasPrice: BigInt! | Add it | - | |
| health: String! | | - | - | - | Only in EthQL |
| | logs(filter: FilterCriteria!): [Log!]! | logs(filter: FilterCriteria!): [Log!]! | Add it | - | |
| | pending: Pending! | pending: Pending! | Add it | - | |
| | protocolVersion: Int! | protocolVersion: Int! | Add it | - | |
| | syncing: SyncState | syncing: SyncState | Add it | - | |
| transaction(hash: Bytes32): Transaction | transaction(hash: Bytes32!): Transaction | transaction(hash: Hash!): Transaction | Change Bytes32 to Hash for hash | Change Bytes32 to Hash for hash | |