Querying Substrate Storage via RPC

In this post, we will investigate how you can interact with the Substrate RPC endpoint in order to read storage items from your Substrate runtime.

Most of the posts I have written about Substrate so far have shown you how easy it is to build custom blockchains with this next-generation framework. However, an entire parallel track of development and tooling is needed to let users easily interact with these new blockchain systems.

Our ultimate goal in this post is to query the balance of a Substrate user using the Substrate RPC. Along the way, we will paint a better picture of how Substrate interacts with the outside world by investigating storage structures, hashing algorithms, encoding schemes, public endpoints, metadata, and more!

Substrate RPC Methods

Substrate provides a set of RPC methods by default which allow you to interact with, query, and submit data to a running node. The available RPC methods that Substrate exposes are documented as part of the Polkadot-JS docs.

There are four categories of RPC methods:

To query the balance of a Substrate user, we will need to read into the runtime storage of the Balances module. This is done by calling the getStorage method in state:

getStorage(key: StorageKey, block?: Hash): StorageData

summary: Retrieves the storage for a key

Note that specifying a block here is optional. By default, it will query the latest block.

The actual RPC method name is generated by combining the category with the documented function name, joined by an underscore. For example, getStorage in the state category becomes state_getStorage.

However, to start simple, we will first query the Metadata endpoint for our Substrate node, which requires only knowledge of the method name: state_getMetadata.
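As a minimal sketch of that naming rule, the snippet below builds a JSON-RPC request body for a documented method. The helper names rpcMethodName and buildRpcRequest are hypothetical, chosen only for illustration; the category_method pattern and the request fields are the ones used by the curl examples in this post. Sending an empty params array for a method that takes none is an assumption based on the JSON-RPC 2.0 format.

```javascript
// Hypothetical helper: joins a category and a documented function name
// into the RPC method name, e.g. ("state", "getStorage") -> "state_getStorage".
function rpcMethodName(category, name) {
  return `${category}_${name}`;
}

// Hypothetical helper: builds a JSON-RPC 2.0 request body.
// `params` defaults to an empty list for methods that take no arguments.
function buildRpcRequest(category, name, params = [], id = 1) {
  return { id, jsonrpc: "2.0", method: rpcMethodName(category, name), params };
}

// Metadata needs no parameters.
console.log(JSON.stringify(buildRpcRequest("state", "getMetadata")));

// getStorage takes a storage key and, optionally, a block hash as a second entry.
const storageKey = "0x..."; // placeholder, not a real storage key
console.log(JSON.stringify(buildRpcRequest("state", "getStorage", [storageKey])));
```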

Substrate RPC Endpoint

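To actually call these methods, you need access to a Substrate RPC endpoint. When you start a local Substrate node, two endpoints are made available to you: an HTTP endpoint (http://localhost:9933/ by default) and a WebSocket endpoint (ws://localhost:9944/ by default).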

Most of the Substrate front-end libraries and tools use the more powerful WebSocket endpoint to interact with the blockchain. Through WebSockets, you can subscribe to various items, like events, and receive push notifications whenever changes in your blockchain occur.

For the purposes of this post, we will continue to keep things simple and use the HTTP endpoint to make JSON-RPC queries to our blockchain.
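If you would rather make these HTTP calls from JavaScript than from the command line, something like the following sketch should work. The function name sendRpc and the injectable fetchImpl parameter are my own choices for this example, and the URL in the usage comment assumes a local node listening on port 9933.

```javascript
// Sketch only: posts a JSON-RPC request over HTTP and returns the `result` field.
// `fetchImpl` is injectable so the function can be exercised without a node;
// it defaults to the global fetch available in browsers and Node.js 18+.
async function sendRpc(endpoint, method, params = [], fetchImpl = globalThis.fetch) {
  const response = await fetchImpl(endpoint, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ id: 1, jsonrpc: "2.0", method, params }),
  });
  if (!response.ok) {
    throw new Error(`HTTP error ${response.status}`);
  }
  const payload = await response.json();
  // JSON-RPC reports failures in an `error` member instead of `result`.
  if (payload.error) {
    throw new Error(`RPC error: ${JSON.stringify(payload.error)}`);
  }
  return payload.result;
}

// Example usage (assumes a node is listening on this URL):
// sendRpc("http://localhost:9933/", "state_getMetadata").then(console.log);
```

Passing fetch in as a parameter is a small design choice that keeps the function easy to try out with a stub before pointing it at a real node.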

Public --dev Node

If you do not want to set up a local node just to test storage queries, there exists a Substrate --dev node that exposes a JSON-RPC endpoint at https://dev-node.substrate.dev:9933/ that accepts POST requests over SSL.

So let’s use this to call the Metadata endpoint:

$ curl -H "Content-Type: application/json" -d '{"id":1, "jsonrpc":"2.0", "method": "state_getMetadata"}' https://dev-node.substrate.dev:9933/

> {"jsonrpc":"2.0","result":"0x6d65746107481853797374656d011853797374656d3c304163636f756e744e6f6e636501010130543a3a4163636f756e74496420543a3a496e64657800200000000000000000047c2045787472696e73696373206e6f6e636520666f72206163636f756e74732e3845787472696e736963436f756e...

Yay! A basic RPC call to get the metadata from Substrate is successful! However, you will notice the result is a large hex value, which really isn’t that helpful…

There is more to the story.

Substrate Encoding

What we haven’t touched on yet are the various encoding mechanisms used by Substrate to both optimize the serialization of data and provide safety guarantees to the blockchain system.

SCALE Codec

If we try to naively decode the hex returned from the metadata endpoint using JavaScript, we get something like:

// From StackOverflow question 3745666
function hex_to_string(metadata) {
  return metadata.match(/.{1,2}/g).map(function(v){
    return String.fromCharCode(parseInt(v, 16));
  }).join('');
}

hex_to_string("0x6d65746107481853797374656d011853797374656d3c304163636f756e744e6f6e636501010130543a3a4163636f756e74496420543a3a496e64657800200000000000000000047c2045787472696e73696373206e6f6e636520666f72206163636f756e74732e3845787472696e736963436f756e...")

> "\u0000meta\u0007H\u0018System\u0001\u0018System<0AccountNonce\u0001\u0001\u00010T::AccountId T::Index\u0000 \u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0004| Extrinsics nonce for accounts.8ExtrinsicCoun..."

There is real data in there! However, it is not well formed.
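One quirk of the naive decoder is worth pointing out: the leading \u0000 is not part of the metadata. The first two-character chunk is the 0x prefix, which parseInt cannot read as a hex byte, so it turns into a null character. A slightly more careful sketch strips the prefix and shows unprintable bytes as dots. The function name hexToPrintable is made up for this example, and this is still only a way to peek at the bytes, not a SCALE decoder.

```javascript
// Sketch: render a hex string as text, replacing non-printable bytes with ".".
function hexToPrintable(hex) {
  // Drop the "0x" prefix so it is not mistaken for a data byte.
  const digits = hex.startsWith("0x") ? hex.slice(2) : hex;
  // Split into two-character byte pairs (a trailing odd digit is ignored).
  const pairs = digits.match(/.{2}/g) || [];
  return pairs
    .map((pair) => {
      const code = parseInt(pair, 16);
      // Keep printable ASCII (space through tilde), mask everything else.
      return code >= 0x20 && code <= 0x7e ? String.fromCharCode(code) : ".";
    })
    .join("");
}

// First few bytes of the metadata response shown above.
console.log(hexToPrintable("0x6d65746107481853797374656d"));
// prints "meta.H.System"
```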

To correctly parse the metadata, you will need to become familiar with Parity’s SCALE codec:

SCALE is a light-weight format which allows encoding (and decoding) which makes it highly suitable for resource-constrained execution environments like blockchain runtimes and low-power, low-memory devices.

Parity uses SCALE for a number of reasons. Gav mentioned that:

Using the SCALE codec and parsing the Substrate metadata could be its own blog post, so I will not go much deeper here; I just wanted to point out the main encoding scheme used by Substrate, which shows up in the examples we have done so far.

Storage Keys

For our goal, what we really want to learn is how to generate the storage keys for our various runtime storage items.

Substrate has a single key-value database for powering the entire blockchain framework. From this minimal data structure, additional abstractions can be constructed such as a Merkle Patricia tree (“trie”) that is used throughout Substrate.

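At a base level, to gain access to any runtime storage item, you simply need to know its storage key for the core key-value database. To prevent key collisions, a special schema is used to generate keys for Runtime module storage items:

Storage Value: xxhash128("ModuleName" + " " + "StorageItem")
Storage Map: blake2_256("ModuleName" + " " + "StorageItem" + encoded(key))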

This may not make a lot of sense right now, but we will do some practical examples below to hopefully clarify.
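Judging from the examples later in this post, both kinds of key start from the UTF-8 bytes of the module name and the storage item name joined by a space, and map keys append the bytes of the map key before hashing. Here is a small sketch of building those hash inputs. It deliberately stops short of hashing, because XXHash and Blake2 are not built into JavaScript, and the function names are hypothetical.

```javascript
const encoder = new TextEncoder();

// Hash input for a storage value: the bytes of "<ModuleName> <StorageItem>".
function storageValuePreimage(moduleName, itemName) {
  return encoder.encode(`${moduleName} ${itemName}`);
}

// Hash input for a storage map entry: the same prefix followed by the
// already-encoded map key (for example, the 32 raw bytes of an AccountId).
function storageMapPreimage(moduleName, itemName, encodedKey) {
  const prefix = storageValuePreimage(moduleName, itemName);
  const preimage = new Uint8Array(prefix.length + encodedKey.length);
  preimage.set(prefix, 0);
  preimage.set(encodedKey, prefix.length);
  return preimage;
}

// "Sudo Key" is 8 bytes; this is what would be fed to 128-bit XXHash.
console.log(storageValuePreimage("Sudo", "Key").length);

// 20 bytes of prefix plus a 32-byte key; this would be fed to 256-bit Blake2.
const placeholderAccountId = new Uint8Array(32); // all zeros, not a real account
console.log(storageMapPreimage("Balances", "FreeBalance", placeholderAccountId).length);
```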

Historical Info: Note that for storage values we use XXHash (a non-cryptographic hash algorithm), whereas for storage maps we use Blake2 256. It used to be that XXHash was used in both situations. However, there were concerns about attacks where external users could manipulate storage maps to generate storage keys that collide with one another. The same issue does not arise for storage values because the seed used in the hash cannot be manipulated by external parties. XXHash is an order of magnitude faster in real world situations, so we continue to use it when possible, but for added cryptographic security guarantees, we need to use Blake2 256.

Querying Runtime Storage

We are almost to the finish line. Now that you know the different storage key encoding patterns, we can try to construct and query the runtime storage for a Substrate chain. Since you will need to use some cryptographic hash functions to try this yourself, I have loaded them for you on this blog post.

Open your browser console, and you will find utility functions under util.*, util_crypto.*, and keyring.*. These come from polkadot-js/common and give you access to hash functions like util_crypto.xxhashAsHex and util_crypto.blake2AsHex.

Storage Value Query

Let’s start with a simple storage value, for instance getting the Sudo user for a Substrate chain. The module name is Sudo and the storage item which holds the AccountId is named Key.

Thus we would do the following:

util_crypto.xxhashAsHex(util.stringToU8a("Sudo Key"), 128)

> "0x50a63a871aced22e88ee6466fe5aa5d9"

Note: We specified the 128-bit version of XXHash.

Now we can form an RPC request using this value as the params when calling the state_getStorage endpoint:

$ curl -H "Content-Type: application/json" -d '{"id":1, "jsonrpc":"2.0", "method": "state_getStorage", "params": ["0x50a63a871aced22e88ee6466fe5aa5d9"]}' https://dev-node.substrate.dev:9933/

> {"jsonrpc":"2.0","result":"0xd43593c715fdd31c61141abd04a99fd6822c8558854ccde39a5684e7a56da27d","id":1}

Success! The result here is the SCALE encoded AccountId of the Sudo user:

keyring.encodeAddress("0xd43593c715fdd31c61141abd04a99fd6822c8558854ccde39a5684e7a56da27d")

> "5GrwvaEF5zXb26Fz9rcQpDWS57CtERHpNehXCPcNoHGKutQY"

This is the familiar Alice account which we would expect on a --dev chain, and also matches what we get using the Polkadot-JS UI:

[Image: Sudo Key for the Substrate --dev node]

Storage Map Query

As a final challenge, we will look to query a storage map like the balance of an account. The module name is Balances and the storage item we are interested in is named FreeBalance. The mapping for this storage item is from AccountId -> Balance, so the storage item key we want to use is an AccountId.

Remember we need to use Blake2 256 and a slightly different pattern for generating the key for these kinds of storage items:

util_crypto.blake2AsHex([...util.stringToU8a("Balances FreeBalance"), ...util.hexToU8a("0xd43593c715fdd31c61141abd04a99fd6822c8558854ccde39a5684e7a56da27d")], 256)

> "0x7f864e18e3dd8b58386310d2fe0919eef27c6e558564b7f67f22d99d20f587bb"

Just like before, we can form an RPC request using this value as the params:

$ curl -H "Content-Type: application/json" -d '{"id":1, "jsonrpc":"2.0", "method": "state_getStorage", "params": ["0x7f864e18e3dd8b58386310d2fe0919eef27c6e558564b7f67f22d99d20f587bb"]}' https://dev-node.substrate.dev:9933/

> {"jsonrpc":"2.0","result":"0x0000a0dec5adc9353600000000000000","id":1}

The result here is now a SCALE encoded version of the Balance type, which is a u128 (16 bytes) and thus trivially decodable once you know that SCALE encodes integers as little endian:

util.hexToBn("0x0000a0dec5adc9353600000000000000", { isLe: true }).toString()

> "1000000000000000000000"
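If you do not have the util helpers at hand, the same little-endian decoding can be sketched with the BigInt type built into modern JavaScript. The function name leHexToBigInt is hypothetical.

```javascript
// Sketch: decode a little-endian hex string into a BigInt.
function leHexToBigInt(hex) {
  const digits = hex.startsWith("0x") ? hex.slice(2) : hex;
  if (digits.length % 2 !== 0) {
    throw new Error("expected an even number of hex digits");
  }
  // Split into bytes, then reverse so the most significant byte comes first.
  const bytes = digits.match(/../g) || [];
  const bigEndian = bytes.reverse().join("");
  return bigEndian.length === 0 ? 0n : BigInt("0x" + bigEndian);
}

console.log(leHexToBigInt("0x0000a0dec5adc9353600000000000000").toString());
// prints "1000000000000000000000"
```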

Woohoo!

Next Steps

If you made it this far, you probably have come to the same conclusion as me, which is that interacting with the Substrate RPC is not trivial. Substrate is optimized for performance, bandwidth, and execution, which leaves tasks like encoding and decoding of transactions, storage, metadata, etc… to the outside world.

That being said, once you are able to walk through these examples step by step, I think it becomes easier to understand what is going on, and even reproduce this logic on other platforms and languages. Certainly this is needed for the future Substrate ecosystem.

I have started a project called Substrate RPC Examples:

https://github.com/shawntabrizi/substrate-rpc-examples

The idea of this project is to provide some easy to read, “minimal library magic” examples of interacting with the Substrate RPC. So far, I have only used the tools available in util, util_crypto, and keyring, and ideally this can be reduced by introducing a few hand written functions.

The two samples I have described in this blog post (getting metadata, querying storage) are implemented. I hope to also add to it an example of a balance transfer, which will show how to sign a message. If you have any good ideas or examples that you would want to share with the world, feel free to open a PR.

I think the next follow up from this post should be a deep dive into the SCALE codec and how you can turn the Metadata you receive from a Substrate node into valid JSON.

As always, if you are enjoying the content I have produced, take a look at my donations page to see how you can continue to support me.

GitHub

This blog post has an associated GitHub project: substrate-rpc-examples