forked from cerc-io/ipld-eth-server
Merge pull request #11 from vulcanize/storage-transformers-readme
Add storage transformers readme
This commit is contained in:
commit
df521eea7f
167
pkg/transformers/factories/storage/EXAMPLE.md
Normal file
@ -0,0 +1,167 @@
# Storage Transformer Example

In the Storage Transformer README, we went over code that needs to be written to add a new storage transformer to VulcanizeDB.
In this document, we'll go over an example contract and discuss how one would go about watching its storage.

## Example Contract

For the purposes of this document, we'll be assuming that we're interested in watching the following contract:

```solidity
pragma solidity ^0.5.1;

contract Contract {
    uint256 public num_addresses;
    mapping(address => uint) public addresses;

    event AddressAdded(
        address addr,
        uint256 num_addrs
    );

    constructor() public {
        addresses[msg.sender] = 1;
        num_addresses = 1;
    }

    function add_address(address addr) public {
        bool exists = addresses[addr] > 0;
        addresses[addr] = addresses[addr] + 1;
        if (!exists) {
            emit AddressAdded(addr, ++num_addresses);
        }
    }
}
```

Disclaimer: this contract has not been audited and is not intended to be modeled or used in production. :)

This contract persists two values in its storage:

1. `num_addresses`: the total number of unique addresses known to the contract.
2. `addresses`: a mapping that records the number of times an address has been added to the contract.

It also emits an event each time a new address is added into the contract's storage.

## Custom Code

In order to monitor the state of this smart contract, we'd need to implement: an event transformer, a mappings namespace, and a repository.
We will go through each of these in turn.

### Event Transformer

Given that the contract's storage includes a mapping, `addresses`, we will need to be able to identify the keys to that mapping that exist in the system so that we can recognize contract storage keys that correspond to non-zero values in that mapping.

The simplest way to be aware of keys used in a contract's mapping is to listen for contract events that emit the keys that are used in its mapping(s).
Since this contract includes an event, `AddressAdded`, that is emitted each time a new address is added to the `addresses` mapping, we will want to listen for those events and cache the addresses that map to non-zero values.

Please see the event transformer README for detailed instructions about developing this code.
In short, it should be feasible to recognize `AddressAdded` events on the blockchain and parse them to keep a record of addresses that have been added to the system.

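As a rough illustration of the caching idea, the sketch below keeps an in-memory set of addresses taken from already-decoded `AddressAdded` events. The `AddressAddedEvent` and `AddressCache` types are hypothetical and not part of VulcanizeDB; a real event transformer would persist these addresses to the database as described in the event transformer README.

```go
package main

import (
	"fmt"
	"strings"
)

// AddressAddedEvent is a hypothetical decoded form of the contract's
// AddressAdded(address addr, uint256 num_addrs) event.
type AddressAddedEvent struct {
	Addr     string // hex-encoded address from the event payload
	NumAddrs uint64 // simplified: the contract emits a uint256
}

// AddressCache records every address seen in an AddressAdded event.
type AddressCache struct {
	seen map[string]bool
}

// NewAddressCache returns an empty cache.
func NewAddressCache() *AddressCache {
	return &AddressCache{seen: make(map[string]bool)}
}

// Add stores the event's address, lowercased so that checksummed and
// non-checksummed spellings of the same address compare equal.
func (c *AddressCache) Add(e AddressAddedEvent) {
	c.seen[strings.ToLower(e.Addr)] = true
}

// Contains reports whether the address has appeared in an event.
func (c *AddressCache) Contains(addr string) bool {
	return c.seen[strings.ToLower(addr)]
}

func main() {
	cache := NewAddressCache()
	cache.Add(AddressAddedEvent{Addr: "0xde0B295669a9FD93d5F28D9Ec85E40f4cb697BAe", NumAddrs: 2})
	fmt.Println(cache.Contains("0xde0b295669a9fd93d5f28d9ec85e40f4cb697bae"))
}
```

Every address in such a cache is a candidate key for the `addresses` mapping, which is what the mappings code needs in order to derive storage keys.
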
### Mappings

If we point an ethereum node at a blockchain hosting this contract and our node is equipped to write out storage changes happening on this contract, we will expect such changes to appear each time `add_address` (which modifies the `addresses` mapping) is called.

For those changes, which include raw hex versions of storage keys and storage values, to be useful to us, we need to know how to recognize and parse them.
Our mappings file should assist us with both of these tasks: the `Lookup` function should recognize raw storage keys and return known metadata about the storage value.

In order to perform this lookup, the mappings file should maintain its own mapping of known storage keys to the corresponding storage value metadata.
This internal mapping should contain the storage key for `num_addresses` as well as a storage key for each `addresses` key known to be associated with a non-zero value.

#### num_addresses

`num_addresses` is the first variable declared on the contract, and it is a simple (non-array, non-mapping) type.
Therefore, we know that its storage key is `0000000000000000000000000000000000000000000000000000000000000000`.
The storage key for non-array and non-mapping variables is (usually*) the index of the variable on the contract's storage.
If we see a storage diff being emitted from this contract with this storage key, we know that the `num_addresses` variable has been modified.

In this case, we would expect that the call `mappings.Lookup("0000000000000000000000000000000000000000000000000000000000000000")` would return metadata corresponding to the `num_addresses` variable.
This metadata would probably look something like:

```golang
shared.StorageValueMetadata{
	Name: "num_addresses",
	Keys: nil,
	Type: shared.Uint256,
}
```

<sup>*</sup> Occasionally, multiple variables may be packed into one storage slot, which complicates a direct translation of the index of the variable on the contract to its storage key.

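To make the lookup concrete, here is a minimal, self-contained sketch of a `Lookup` backed by a map of known keys. It uses a simplified stand-in for `shared.StorageValueMetadata` and string keys; `SlotKey`, `staticKeys`, and `ErrKeyNotFound` are hypothetical names chosen for illustration, and variable packing is ignored.

```go
package main

import (
	"errors"
	"fmt"
)

// StorageValueMetadata is a simplified stand-in for shared.StorageValueMetadata.
type StorageValueMetadata struct {
	Name string
	Keys map[string]string
	Type string
}

// ErrKeyNotFound is a hypothetical error for unrecognized storage keys.
var ErrKeyNotFound = errors.New("storage key not found")

// SlotKey renders a storage slot index as a 32-byte, zero-padded hex string.
func SlotKey(index uint64) string {
	return fmt.Sprintf("%064x", index)
}

// staticKeys maps the storage keys of simple variables to their metadata.
var staticKeys = map[string]StorageValueMetadata{
	SlotKey(0): {Name: "num_addresses", Keys: nil, Type: "uint256"},
}

// Lookup returns the metadata for a known storage key, or ErrKeyNotFound.
func Lookup(key string) (StorageValueMetadata, error) {
	metadata, ok := staticKeys[key]
	if !ok {
		return StorageValueMetadata{}, ErrKeyNotFound
	}
	return metadata, nil
}

func main() {
	metadata, err := Lookup(SlotKey(0))
	fmt.Println(metadata.Name, err)
}
```

Returning an error for unknown keys lets the caller decide whether an unrecognized diff should be skipped or reported.
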
#### addresses

`addresses` is the second variable declared on the contract, but it is a mapping.
Since it is a mapping, the storage key is more complex than `0000000000000000000000000000000000000000000000000000000000000001` (which would be the key for the variable if it were not an array or mapping).
Having a single storage slot for an entire mapping would not work, since there can be an arbitrary number of entries in a mapping, and a single storage value slot is constrained to 32 bytes.

The way that smart contract mappings are maintained in storage (in Solidity) is by creating a new storage key/value pair for each entry in the mapping, where the storage key is a hash of the occupied slot's key concatenated with the mapping's index on the contract.
Given an occupied slot's key, `k`, and a mapping's index on the contract, `i`, we can generate the storage key with the following code:

```golang
// GetMappingStorageKey hashes the mapping key k concatenated with the mapping's slot index i.
func GetMappingStorageKey(k, i string) string {
	return common.BytesToHash(crypto.Keccak256(common.FromHex(k + i))).Hex()
}
```

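Solidity's documented storage layout pads value-type mapping keys (such as an `address`) to 32 bytes before concatenating them with the slot index and hashing. The source does not say whether callers of `GetMappingStorageKey` are expected to pad `k` themselves, so the following is only a sketch of how the two inputs could be normalized; `PadTo32Bytes` is a hypothetical helper, not part of VulcanizeDB.

```go
package main

import (
	"fmt"
	"strings"
)

// PadTo32Bytes strips an optional 0x prefix and left-pads the hex string
// with zeros to 64 hex characters (32 bytes). Longer inputs are returned
// unchanged apart from the prefix.
func PadTo32Bytes(hexValue string) string {
	trimmed := strings.TrimPrefix(hexValue, "0x")
	if len(trimmed) >= 64 {
		return trimmed
	}
	return strings.Repeat("0", 64-len(trimmed)) + trimmed
}

func main() {
	k := PadTo32Bytes("0xde0B295669a9FD93d5F28D9Ec85E40f4cb697BAe")
	i := PadTo32Bytes("1")
	// k + i is a 64-byte (128 hex character) preimage for the keccak256 hash.
	fmt.Println(len(k+i), k+i)
}
```
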
If we were to call the contract's `add_address` function with `0xde0B295669a9FD93d5F28D9Ec85E40f4cb697BAe`, we would expect to see an `AddressAdded` event emitted, with `0xde0B295669a9FD93d5F28D9Ec85E40f4cb697BAe` in its payload.
From that event, we would know that there exists in the contract's storage a storage key of:

```golang
GetMappingStorageKey("0xde0B295669a9FD93d5F28D9Ec85E40f4cb697BAe", "0000000000000000000000000000000000000000000000000000000000000001")
```

Executing the above code results in: `0x0f96a1133cfd5b94c329aa0526b5962bd791dbbfc481ca82f7d4a439e1e9bc40`.

Therefore, the first time `add_address` was called for this address, we would also expect to see a storage diff with a key of `0x0f96a1133cfd5b94c329aa0526b5962bd791dbbfc481ca82f7d4a439e1e9bc40` and a value of `0000000000000000000000000000000000000000000000000000000000000001`.
This would be the indication that in contract storage, the address `0xde0B295669a9FD93d5F28D9Ec85E40f4cb697BAe` maps to the value 1.

Given that we knew this address was a key in the mapping from our event transformer, we would expect a call to `mappings.Lookup("0x0f96a1133cfd5b94c329aa0526b5962bd791dbbfc481ca82f7d4a439e1e9bc40")` to return metadata corresponding to _this slot_ in the addresses mapping:

```golang
shared.StorageValueMetadata{
	Name: "addresses",
	Keys: map[Key]string{Address: "0xde0B295669a9FD93d5F28D9Ec85E40f4cb697BAe"},
	Type: shared.Uint256,
}
```

### Repository

Once we have recognized a storage diff, we can decode the storage value to the data's known type.
Since the metadata tells us that the above values are `uint256`, we can decode a value like `0000000000000000000000000000000000000000000000000000000000000001` to `1`.

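The decoding step can be sketched with the standard library alone. The function below is an illustration under the assumption that the raw value arrives as a hex string; `DecodeUint256` is a hypothetical name and is not the `shared.Decode` function used by the storage transformer.

```go
package main

import (
	"fmt"
	"math/big"
	"strings"
)

// DecodeUint256 parses a hex-encoded storage value into a big integer.
// An optional 0x prefix is accepted.
func DecodeUint256(raw string) (*big.Int, error) {
	trimmed := strings.TrimPrefix(raw, "0x")
	value, ok := new(big.Int).SetString(trimmed, 16)
	if !ok {
		return nil, fmt.Errorf("invalid hex storage value: %q", raw)
	}
	return value, nil
}

func main() {
	value, err := DecodeUint256("0000000000000000000000000000000000000000000000000000000000000001")
	fmt.Println(value, err)
}
```

A `*big.Int` is used because a `uint256` can exceed the range of Go's built-in integer types.
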
The purpose of the contract-specific repository is to write that value to the database in a way that makes it useful for future queries.
Typically, this involves writing the block hash, block number, decoded value, and any keys in the metadata to a table.

The current repository interface has a generalized `Create` function that can accept any arbitrary storage row along with its metadata.
This is deliberate, to facilitate shared use of the common storage transformer.
An implication of this decision is that the `Create` function typically includes a `switch` statement that selects which table to write to, as well as what data to include, based on the name of the variable as defined in the metadata.

An example implementation of `Create` for our example contract above might look like:

```golang
func (repository AddressStorageRepository) Create(blockNumber int, blockHash string, metadata shared.StorageValueMetadata, value interface{}) error {
	switch metadata.Name {
	case "num_addresses":
		_, err := repository.db.Exec(`INSERT INTO storage.num_addresses (block_hash, block_number, n) VALUES ($1, $2, $3)`,
			blockHash, blockNumber, value)
		return err
	case "addresses":
		_, err := repository.db.Exec(`INSERT INTO storage.addresses (block_hash, block_number, address, n) VALUES ($1, $2, $3, $4)`,
			blockHash, blockNumber, metadata.Keys[Address], value)
		return err
	default:
		panic(fmt.Sprintf("unrecognized contract storage name: %s", metadata.Name))
	}
}
```

## Summary

With our very simple address-storing contract, we would be able to read its storage diffs by implementing an event transformer, a mappings namespace, and a repository.

The mappings would be able to look up storage keys reflecting `num_addresses` or any slot in `addresses`, using addresses derived from watching the `AddressAdded` event for the latter.

The repository would be able to persist the value of `num_addresses` or any slot in `addresses`, using metadata returned from the mappings.

The mappings and repository could be plugged into the common storage transformer, enabling us to know the contract's state as it is changing.

124
pkg/transformers/factories/storage/README.md
Normal file
@ -0,0 +1,124 @@
# Watching Contract Storage

One approach VulcanizeDB takes to caching and indexing smart contracts is to ingest raw contract storage values.
Assuming that you are running an ethereum node that is writing contract storage changes to a CSV file, VulcanizeDB can parse them and persist the results to postgres.

## Assumptions

The current approach for caching smart contract storage diffs assumes that you are running a node that is writing contract storage diffs to a CSV file.
The CSV file is expected to have 5 columns: contract address, block hash, block number, storage key, storage value.

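As an illustration of that five-column layout, the sketch below parses one CSV record into a struct. The `DiffRow` type and `ParseDiffRow` function are hypothetical simplifications (VulcanizeDB's own row type is `shared.StorageDiffRow`), and the sample line uses made-up placeholder values rather than real chain data.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"strconv"
	"strings"
)

// DiffRow is a hypothetical, simplified view of one storage diff row.
type DiffRow struct {
	Contract     string
	BlockHash    string
	BlockHeight  int
	StorageKey   string
	StorageValue string
}

// ParseDiffRow converts one CSV record into a DiffRow, expecting the columns
// contract address, block hash, block number, storage key, storage value.
func ParseDiffRow(record []string) (DiffRow, error) {
	if len(record) != 5 {
		return DiffRow{}, fmt.Errorf("expected 5 columns, got %d", len(record))
	}
	height, err := strconv.Atoi(record[2])
	if err != nil {
		return DiffRow{}, fmt.Errorf("invalid block number %q: %w", record[2], err)
	}
	return DiffRow{
		Contract:     record[0],
		BlockHash:    record[1],
		BlockHeight:  height,
		StorageKey:   record[3],
		StorageValue: record[4],
	}, nil
}

func main() {
	// Placeholder values for illustration only.
	line := "0xcontract,0xblockhash,42,0xkey,0xvalue"
	records, err := csv.NewReader(strings.NewReader(line)).ReadAll()
	if err != nil {
		panic(err)
	}
	for _, record := range records {
		row, parseErr := ParseDiffRow(record)
		fmt.Println(row.BlockHeight, row.StorageKey, parseErr)
	}
}
```
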
We have [a branch on vulcanize/parity-ethereum](https://github.com/vulcanize/parity-ethereum/tree/watch-storage-diffs) that enables running a node that writes storage diffs this way.
We also have [sample data](https://github.com/8thlight/maker-vulcanizedb/pull/132/files) that comes from running that node against Kovan through block 9796184.

Looking forward, we would like to isolate this assumption as much as possible.
We may end up needing to read CSV data that is formatted differently, or reading data from a non-CSV source, and we do not want resulting changes to cascade throughout the codebase.

## Shared Code

VulcanizeDB has shared code for continuously reading from the CSV file written by the ethereum node and writing a parsed version of each row to postgres.

### Storage Watcher

The storage watcher is responsible for continuously delegating CSV rows to the appropriate transformer as they are being written by the ethereum node.
It maintains a mapping of contract addresses to transformers, and will ignore storage diff rows for contract addresses that do not have a corresponding transformer.

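The delegation described above can be sketched as a map lookup followed by a call to the matching transformer. All names here (`Watcher`, `Handle`, `DiffRow`, `countingTransformer`) are hypothetical and simplified; this is not the actual storage watcher implementation.

```go
package main

import "fmt"

// DiffRow is a hypothetical, simplified storage diff row.
type DiffRow struct {
	Contract     string
	StorageKey   string
	StorageValue string
}

// Transformer is a minimal stand-in for a storage transformer.
type Transformer interface {
	Execute(row DiffRow) error
}

// Watcher keeps a mapping of contract addresses to transformers.
type Watcher struct {
	transformers map[string]Transformer
}

// NewWatcher returns a watcher with no registered transformers.
func NewWatcher() *Watcher {
	return &Watcher{transformers: make(map[string]Transformer)}
}

// AddTransformer registers a transformer for a contract address.
func (w *Watcher) AddTransformer(address string, t Transformer) {
	w.transformers[address] = t
}

// Handle delegates the row to the transformer for its contract. It returns
// false, without error, when no transformer watches that contract.
func (w *Watcher) Handle(row DiffRow) (bool, error) {
	t, ok := w.transformers[row.Contract]
	if !ok {
		return false, nil
	}
	return true, t.Execute(row)
}

// countingTransformer counts the rows it receives, for demonstration.
type countingTransformer struct{ count int }

func (c *countingTransformer) Execute(row DiffRow) error {
	c.count++
	return nil
}

func main() {
	watcher := NewWatcher()
	counter := &countingTransformer{}
	watcher.AddTransformer("0xwatched", counter)
	for _, row := range []DiffRow{{Contract: "0xwatched"}, {Contract: "0xother"}} {
		handled, err := watcher.Handle(row)
		fmt.Println(row.Contract, handled, err)
	}
	fmt.Println(counter.count)
}
```
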
The storage watcher is currently initialized from the `parseStorageDiffs` command, which also adds transformers that the watcher should know about in its mapping of addresses to transformers.

### Storage Transformer

The storage transformer is responsible for converting raw contract storage hex values into useful data and writing them to postgres.
The storage transformer depends on contract-specific implementations of code capable of recognizing storage keys and writing the matching (decoded) storage value to disk.

```golang
func (transformer Transformer) Execute(row shared.StorageDiffRow) error {
	metadata, lookupErr := transformer.Mappings.Lookup(row.StorageKey)
	if lookupErr != nil {
		return lookupErr
	}
	value, decodeErr := shared.Decode(row, metadata)
	if decodeErr != nil {
		return decodeErr
	}
	return transformer.Repository.Create(row.BlockHeight, row.BlockHash.Hex(), metadata, value)
}
```

## Custom Code

In order to watch an additional smart contract, a developer must create three things:

1. Mappings - specify how to identify keys in the contract's storage trie.
1. Repository - specify how to persist a parsed version of the storage value matching the recognized storage key.
1. Instance - create an instance of the storage transformer that uses your mappings and repository.

### Mappings

```golang
type Mappings interface {
	Lookup(key common.Hash) (shared.StorageValueMetadata, error)
	SetDB(db *postgres.DB)
}
```

A contract-specific implementation of the mappings interface enables the storage transformer to fetch metadata associated with a storage key.

Storage metadata contains: the name of the variable matching the storage key, a raw version of any keys associated with the variable (if the variable is a mapping), and the variable's type.

```golang
type StorageValueMetadata struct {
	Name string
	Keys map[Key]string
	Type ValueType
}
```

Keys are only relevant if the variable is a mapping. For example, in the following Solidity code:

```solidity
pragma solidity ^0.4.0;

contract Contract {
    uint x;
    mapping(address => uint) y;
}
```

The metadata for variable `x` would not have any associated keys, but the metadata for a storage key associated with `y` would include the address used to specify that key's index in the mapping.

The `SetDB` function is required for the mappings to connect to the database.
A database connection may be desired when keys in a mapping variable need to be read from log events (e.g. to look up which addresses may exist in `y`, above).

### Repository

```golang
type Repository interface {
	Create(blockNumber int, blockHash string, metadata shared.StorageValueMetadata, value interface{}) error
	SetDB(db *postgres.DB)
}
```

A contract-specific implementation of the repository interface enables the transformer to write the decoded storage value to the appropriate table in postgres.

The `Create` function is expected to recognize and persist a given storage value by the variable's name, as indicated on the row's metadata.

The `SetDB` function is required for the repository to connect to the database.

### Instance

```golang
type Transformer struct {
	Address    common.Address
	Mappings   storage_diffs.Mappings
	Repository storage_diffs.Repository
}
```

A new instance of the storage transformer is initialized with the contract-specific mappings and repository, as well as the contract's address.
The contract's address is included so that the watcher can query that value from the transformer in order to build up its mapping of addresses to transformers.

## Summary

To begin watching an additional smart contract, create a new mappings file for looking up storage keys on that contract, create a repository for writing storage values from the contract, and initialize a new storage transformer instance with the mappings, repository, and contract address.

The new instance, wrapped in an initializer that calls `SetDB` on the mappings and repository, should be passed to the `AddTransformers` function on the storage watcher.