Merge branch 'staging' into databaseConfig

ana0 2019-04-11 14:19:57 -04:00 committed by GitHub
commit 964fc8bd22
9 changed files with 878 additions and 881 deletions

444
README.md

@ -1,10 +1,27 @@
# Vulcanize DB
[![Build Status](https://travis-ci.org/vulcanize/vulcanizedb.svg?branch=master)](https://travis-ci.org/vulcanize/vulcanizedb)
[![Go Report Card](https://goreportcard.com/badge/github.com/vulcanize/vulcanizedb)](https://goreportcard.com/report/github.com/vulcanize/vulcanizedb)
## About
> Vulcanize DB is a set of tools that make it easier for developers to write application-specific indexes and caches for dapps built on Ethereum.
## Table of Contents
1. [Background](../staging/README.md#background)
1. [Dependencies](../staging/README.md#dependencies)
1. [Install](../staging/README.md#install)
1. [Usage](../staging/README.md#usage)
1. [Tests](../staging/README.md#tests)
1. [API](../staging/README.md#API)
1. [Contributing](../staging/README.md#contributing)
1. [License](../staging/README.md#license)
## Background
The same data structures and encodings that make Ethereum an effective and trust-less distributed virtual machine
complicate data accessibility and usability for dApp developers. VulcanizeDB improves Ethereum data accessibility by
providing a suite of tools to ease the extraction and transformation of data into a more useful state.
Vulcanize DB is a set of tools that make it easier for developers to write application-specific indexes and caches for dapps built on Ethereum.
## Dependencies
- Go 1.11+
@ -13,34 +30,32 @@ Vulcanize DB is a set of tools that make it easier for developers to write appli
- [Go Ethereum](https://ethereum.github.io/go-ethereum/downloads/) (1.8.23+)
- [Parity 1.8.11+](https://github.com/paritytech/parity/releases)
## Project Setup
Using Vulcanize for the first time requires several setup steps. The following instructions walk through each of them:
## Install
1. [Building the project](../staging/README.md#building-the-project)
1. [Setting up the database](../staging/README.md#setting-up-the-database)
1. [Configuring a synced Ethereum node](../staging/README.md#configuring-a-synced-ethereum-node)
1. Fetching the project
2. Installing dependencies
3. Configuring shell environment
4. Database setup
5. Configuring synced Ethereum node integration
6. Data syncing
### Installation
In order to fetch the project codebase for local use or modification, install it to your `GOPATH` via:
### Building the project
Download the codebase to your local `GOPATH` via:
`go get github.com/vulcanize/vulcanizedb`
Once fetched, dependencies can be installed via `go get` or (the preferred method) at specific versions via `golang/dep`, the prototype golang package manager. Installation instructions are [here](https://golang.github.io/dep/docs/installation.html).
Move to the project directory and use [golang/dep](https://github.com/golang/dep) to install the dependencies:
In order to install packages with `dep`, ensure you are in the project directory now within your `GOPATH` (default location is `~/go/src/github.com/vulcanize/vulcanizedb/`) and run:
`cd $GOPATH/src/github.com/vulcanize/vulcanizedb`
`dep ensure`
After `dep` finishes, dependencies should be installed within your `GOPATH` at the versions specified in `Gopkg.toml`.
Once the dependencies have been successfully installed, build the executable with:
Lastly, ensure that `GOPATH` is defined in your shell. If necessary, `GOPATH` can be set in `~/.bashrc` or `~/.bash_profile`, depending upon your system. It can be additionally helpful to add `$GOPATH/bin` to your shell's `$PATH`.
`make build`
### Setting up the Database
If you are running into issues at this stage, ensure that `GOPATH` is defined in your shell.
If necessary, `GOPATH` can be set in `~/.bashrc` or `~/.bash_profile`, depending upon your system.
It can be additionally helpful to add `$GOPATH/bin` to your shell's `$PATH`.
### Setting up the database
1. Install Postgres
1. Create a superuser for yourself and make sure `psql --list` works without prompting for a password.
1. `createdb vulcanize_public`
@ -52,14 +67,12 @@ Lastly, ensure that `GOPATH` is defined in your shell. If necessary, `GOPATH` ca
- To see status of migrations: `make migration_status NAME=vulcanize_public`
* See below for configuring additional environments
In some cases (such as recent Ubuntu systems), it may be necessary to overcome failures of password authentication from localhost. To allow access on Ubuntu, change the localhost connections (via hostname, IPv4, and IPv6) from peer/md5 to trust in `/etc/postgresql/<version>/pg_hba.conf`.
### Create a migration file
1. `make new_migration NAME=add_columnA_to_table1`
- This will create a new timestamped migration file in `db/migrations`
1. Write the migration code in the created file, under the respective `goose` pragma
- Goose automatically runs each migration in a transaction; don't add `BEGIN` and `COMMIT` statements.
Note that trusted auth should only be enabled on systems that hold no sensitive data, such as development and local test databases.
### Configuration
### Configuring a synced Ethereum node
- To use a local Ethereum node, copy `environments/public.toml.example` to
`environments/public.toml` and update the `ipcPath` and `levelDbPath`.
- `ipcPath` should match the local node's IPC filepath:
@ -83,378 +96,41 @@ Lastly, ensure that `GOPATH` is defined in your shell. If necessary, `GOPATH` ca
- Linux: `<full home path>/ethereum/geth/chaindata`
- `levelDbPath` is irrelevant (and `coldImport` is currently unavailable) if only running parity.
- See `environments/infura.toml` to configure commands to run against infura, if a local node is unavailable.
- Copy `environments/local.toml.example` to `environments/local.toml` to configure commands to run against a local node such as [Ganache](https://truffleframework.com/ganache) or [ganache-cli](https://github.com/trufflesuite/ganache-cli).
### Start syncing with postgres
Syncs VulcanizeDB with the configured Ethereum node, populating blocks, transactions, receipts, and logs.
This command is useful when you want to maintain a broad cache of what's happening on the blockchain.
1. Start Ethereum node (**if fast syncing your Ethereum node, wait for initial sync to finish**)
1. In a separate terminal start VulcanizeDB:
- `./vulcanizedb sync --config <config.toml> --starting-block-number <block-number>`
## Usage
Usage is broken up into two processes:
### Alternatively, sync from Geth's underlying LevelDB
Sync VulcanizeDB from the LevelDB underlying a Geth node.
1. Ensure the node is not running and that it has synced to the desired block height.
1. Start vulcanize_db
- `./vulcanizedb coldImport --config <config.toml>`
1. Optional flags:
- `--starting-block-number <block number>`/`-s <block number>`: block number to start syncing from
- `--ending-block-number <block number>`/`-e <block number>`: block number to sync to
- `--all`/`-a`: sync all missing blocks
### Data syncing
To provide data for transformations, raw Ethereum data must first be synced into vDB.
This is accomplished through the use of the `lightSync`, `sync`, or `coldImport` commands.
These commands are described in detail [here](../staging/documentation/sync.md).
### Alternatively, sync in "light" mode
Syncs VulcanizeDB with the configured Ethereum node, populating only block headers.
This command is useful when you want a minimal baseline from which to track targeted data on the blockchain (e.g. individual smart contract storage values).
1. Start Ethereum node
1. In a separate terminal start VulcanizeDB:
- `./vulcanizedb lightSync --config <config.toml> --starting-block-number <block-number>`
### Data transformation
Contract watchers use the raw data that has been synced into Postgres to filter out and apply transformations to specific data of interest.
## Start full environment in docker by single command
There is a built-in `contractWatcher` command which provides generic transformation of most contract data. This command is described in detail [here](../staging/documentation/contractWatcher.md).
### Geth Rinkeby
In many cases a custom transformer or set of transformers will need to be written to provide complete or more comprehensive coverage or to optimize other aspects of the output for a specific end-use.
In this case we have provided the `compose`, `execute`, and `composeAndExecute` commands for running custom transformers from external repositories. This is described in detail [here](../staging/documentation/composeAndExecute.md).
make command | description
------------------- | ----------------
rinkeby_env_up | start geth and postgres and run the migrations; once the migrations are done, start the vulcanizedb container
rinkeby_env_deploy | build and run vulcanizedb container in rinkeby environment
rinkeby_env_migrate | build and run rinkeby env migrations
rinkeby_env_down | stop and remove all rinkeby env containers
A successful run of the VulcanizeDB container requires a full geth state sync.
Attach to the geth console and check the sync state:
```bash
$ docker exec -it rinkeby_vulcanizedb_geth geth --rinkeby attach
...
> eth.syncing
false
```
If you have the full rinkeby chaindata, you can move it to the `rinkeby_vulcanizedb_geth_data` docker volume to skip the long wait for the sync.
## Running the Tests
- Replace the empty `ipcPath` in the `environments/infura.toml` with a path to a full archival node's eth_jsonrpc endpoint (e.g. local geth node ipc path or infura url)
## Tests
- Replace the empty `ipcPath` in the `environments/infura.toml` with a path to a full node's eth_jsonrpc endpoint (e.g. local geth node ipc path or infura url)
- Note: integration tests require configuration with an archival node
- `createdb vulcanize_private` will create the test db
- `make migrate NAME=vulcanize_private` will run the db migrations
- `make test` will run the unit tests and skip the integration tests
- `make integrationtest` will run the just the integration tests
- `make integrationtest` will run just the integration tests
## Deploying
1. Make sure you have an ssh agent running and your ssh key added to it. Instructions are [here](https://developer.github.com/v3/guides/using-ssh-agent-forwarding/#your-key-must-be-available-to-ssh-agent)
1. `go get -u github.com/pressly/sup/cmd/sup`
1. `sup staging deploy`
## Contract Watchers
Contract watchers work with a light or full sync vDB to fetch raw ethereum data and execute a set of transformations over them, persisting the output.
A watcher is composed of at least a fetcher and a transformer or set of transformers, where a fetcher is an interface for retrieving raw Ethereum data from some source (e.g. eth_jsonrpc, IPFS)
and a transformer is an interface for filtering through that raw Ethereum data to extract, process, and persist data for specific contracts or accounts.
## contractWatcher
The `contractWatcher` command is a built-in generic contract watcher. It can watch any and all events for a given contract provided the contract's ABI is available.
It also provides some state variable coverage by automating polling of public methods, with some restrictions:
1. The method must have two or fewer arguments
1. The method's arguments must all be of type address or bytes32 (hash)
1. The method must return a single value
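The three restrictions above can be expressed as a small predicate. The following is only a sketch: the `Method` struct and the `Pollable` function are hypothetical names introduced for illustration (they are not taken from the vulcanizedb codebase), and argument types are assumed to be given as plain Solidity type strings.

```go
package main

import "fmt"

// Method is a hypothetical, simplified view of one ABI method entry.
type Method struct {
	Name    string
	Inputs  []string // Solidity type names of the arguments
	Outputs []string // Solidity type names of the return values
}

// Pollable reports whether m satisfies the three restrictions listed above:
// at most two arguments, all of type address or bytes32, and exactly one
// return value.
func Pollable(m Method) bool {
	if len(m.Inputs) > 2 || len(m.Outputs) != 1 {
		return false
	}
	for _, t := range m.Inputs {
		if t != "address" && t != "bytes32" {
			return false
		}
	}
	return true
}

func main() {
	balanceOf := Method{Name: "balanceOf", Inputs: []string{"address"}, Outputs: []string{"uint256"}}
	transfer := Method{Name: "transfer", Inputs: []string{"address", "uint256"}, Outputs: []string{"bool"}}
	fmt.Println(balanceOf.Name, Pollable(balanceOf))
	fmt.Println(transfer.Name, Pollable(transfer))
}
```

Under these assumptions an ERC20-style `balanceOf(address)` qualifies, while `transfer(address,uint256)` does not because of its `uint256` argument.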
This command operates in two modes, `light` and `full`, which require a light or full-synced vulcanizeDB, respectively.
This command requires the contract ABI be available on Etherscan if it is not provided in the config file by the user.
If method polling is turned on we require an archival node at the ETH ipc endpoint in our config, whether or not we are operating in `light` or `full` mode.
Otherwise, when operating in `light` mode, we only need to connect to a full node to fetch event logs.
This command takes a config of the form:
```toml
[database]
name = "vulcanize_public"
hostname = "localhost"
port = 5432
[client]
ipcPath = "/Users/user/Library/Ethereum/geth.ipc"
[contract]
network = ""
addresses = [
"contractAddress1",
"contractAddress2"
]
[contract.contractAddress1]
abi = 'ABI for contract 1'
startingBlock = 982463
[contract.contractAddress2]
abi = 'ABI for contract 2'
events = [
"event1",
"event2"
]
eventArgs = [
"arg1",
"arg2"
]
methods = [
"method1",
"method2"
]
methodArgs = [
"arg1",
"arg2"
]
startingBlock = 4448566
piping = true
```
- The `contract` section defines which contracts we want to watch and with which conditions.
- `network` is only necessary if the ABIs are not provided and need to be fetched from Etherscan.
- Empty or nil string indicates mainnet
- "ropsten", "kovan", and "rinkeby" indicate their respective networks
- `addresses` lists the contract addresses we are watching and is used to load their individual configuration parameters
- `contract.<contractAddress>` are the sub-mappings which contain the parameters specific to each contract address
- `abi` is the ABI for the contract; if none is provided the application will attempt to fetch one from Etherscan using the provided address and network
- `events` is the list of events to watch
- If this field is omitted or no events are provided then by default ALL events extracted from the ABI will be watched
- If event names are provided then only those events will be watched
- `eventArgs` is the list of arguments to filter events with
- If this field is omitted or no eventArgs are provided then by default watched events are not filtered by their argument values
- If eventArgs are provided then only those events which emit at least one of these values as an argument are watched
- `methods` is the list of methods to poll
- If this is omitted or no methods are provided then by default NO methods are polled
- If method names are provided then those methods will be polled, provided:
1) Method has two or fewer arguments
1) Arguments are all of address or hash types
1) Method returns a single value
- `methodArgs` is the list of arguments to limit polling methods to
- If this field is omitted or no methodArgs are provided then by default methods will be polled with every combination of the appropriately typed values that have been collected from watched events
- If methodArgs are provided then only those values will be used to poll methods
- `startingBlock` is the block we want to begin watching the contract, usually the deployment block of that contract
- `piping` is a boolean flag which indicates whether or not we want to pipe return method values forward as arguments to subsequent method calls
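The `eventArgs` rule described in the list above (no filtering when the list is empty, otherwise keep an event only if it emits at least one of the listed values) can be sketched as below. The `Watched` function and its string-slice arguments are hypothetical simplifications for illustration; the actual filtering code in vulcanizedb may be structured differently.

```go
package main

import "fmt"

// Watched reports whether an event passes the eventArgs filter.
// emitted holds the argument values carried by one event log;
// eventArgs is the list from the contract's config section.
func Watched(emitted, eventArgs []string) bool {
	// An omitted or empty eventArgs list means events are not filtered.
	if len(eventArgs) == 0 {
		return true
	}
	wanted := make(map[string]bool, len(eventArgs))
	for _, a := range eventArgs {
		wanted[a] = true
	}
	// Keep the event if at least one emitted value is in the list.
	for _, v := range emitted {
		if wanted[v] {
			return true
		}
	}
	return false
}

func main() {
	filter := []string{"arg1", "arg2"}
	fmt.Println(Watched([]string{"arg2", "other"}, filter))
	fmt.Println(Watched([]string{"other"}, filter))
	fmt.Println(Watched([]string{"other"}, nil))
}
```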
At the very minimum, for each contract address an ABI and a starting block number need to be provided (or just the starting block if the ABI can be reliably fetched from Etherscan).
With just this information we will be able to watch all events at the contract, but with no additional filters and no method polling.
### contractWatcher output
Transformed events and polled method results are committed to Postgres in schemas and tables generated according to the contract abi.
Schemas are created for each contract using the naming convention `<sync-type>_<lowercase contract-address>`.
Under this schema, tables are generated for watched events as `<lowercase event name>_event` and for polled methods as `<lowercase method name>_method`.
The 'method' and 'event' identifiers are tacked onto the end of the table names to prevent collisions between methods and events of the same lowercase name.
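As a rough illustration of this naming convention, the helpers below build schema and table names from a sync type, a contract address, and an event or method name. The function names `SchemaName`, `EventTable`, and `MethodTable` are hypothetical and exist only for this sketch.

```go
package main

import (
	"fmt"
	"strings"
)

// SchemaName builds `<sync-type>_<lowercase contract-address>`.
func SchemaName(syncType, address string) string {
	return syncType + "_" + strings.ToLower(address)
}

// EventTable builds `<lowercase event name>_event`.
func EventTable(event string) string {
	return strings.ToLower(event) + "_event"
}

// MethodTable builds `<lowercase method name>_method`.
func MethodTable(method string) string {
	return strings.ToLower(method) + "_method"
}

func main() {
	schema := SchemaName("light", "0x8dd5fbCe2F6a956C3022bA3663759011Dd51e73E")
	fmt.Println(schema + "." + EventTable("Transfer"))
	fmt.Println(schema + "." + MethodTable("balanceOf"))
}
```

Because of the suffixes, an event and a method that share a lowercase name end up in distinct tables.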
### contractWatcher example:
Modify `./environments/example.toml` to replace the empty `ipcPath` with a path that points to an eth_jsonrpc endpoint (e.g. a local geth node ipc path or an Infura url).
This endpoint should be for an archival eth node if we want to perform method polling as this configuration is currently set up to do. To work with a non-archival full node,
remove the `balanceOf` method from the `0x8dd5fbce2f6a956c3022ba3663759011dd51e73e` (TrueUSD) contract.
If you are operating a light sync vDB, run:
`./vulcanizedb contractWatcher --config=./environments/example.toml --mode=light`
If instead you are operating a full sync vDB and provided an archival node IPC path, run in full mode:
`./vulcanizedb contractWatcher --config=./environments/example.toml --mode=full`
This runs the contractWatcher, configured to watch the contracts specified in the config file. Note that
by default we operate in `light` mode, but the flag is included here to demonstrate its use.
The example config we link to in this example watches two contracts, the ENS Registry (0x314159265dD8dbb310642f98f50C066173C1259b) and TrueUSD (0x8dd5fbCe2F6a956C3022bA3663759011Dd51e73E).
Because the ENS Registry is configured with only an ABI and a starting block, we will watch all events for this contract and poll none of its methods. Note that the ENS Registry is an example
of a contract which does not have its ABI available over Etherscan and must have it included in the config file.
The TrueUSD contract is configured with two events (`Transfer` and `Mint`) and a single method (`balanceOf`), as such it will watch these two events and use any addresses it collects emitted from them
to poll the `balanceOf` method with those addresses at every block. Note that we do not provide an ABI for TrueUSD as its ABI can be fetched from Etherscan.
For the ENS contract, it produces and populates a schema with four tables:
`light_0x314159265dd8dbb310642f98f50c066173c1259b.newowner_event`
`light_0x314159265dd8dbb310642f98f50c066173c1259b.newresolver_event`
`light_0x314159265dd8dbb310642f98f50c066173c1259b.newttl_event`
`light_0x314159265dd8dbb310642f98f50c066173c1259b.transfer_event`
For the TrueUSD contract, it produces and populates a schema with three tables:
`light_0x8dd5fbce2f6a956c3022ba3663759011dd51e73e.transfer_event`
`light_0x8dd5fbce2f6a956c3022ba3663759011dd51e73e.mint_event`
`light_0x8dd5fbce2f6a956c3022ba3663759011dd51e73e.balanceof_method`
Column ids and types for these tables are generated based on the event and method argument names and types and method return types, resulting in tables such as:
Table "light_0x8dd5fbce2f6a956c3022ba3663759011dd51e73e.transfer_event"
| Column | Type | Collation | Nullable | Default | Storage | Stats target | Description |
|:----------:|:---------------------:|:---------:|:--------:|:-------------------------------------------------------------------------------------------:|:--------:|:------------:|:-----------:|
| id | integer | | not null | nextval('light_0x8dd5fbce2f6a956c3022ba3663759011dd51e73e.transfer_event_id_seq'::regclass) | plain | | |
| header_id | integer | | not null | | plain | | |
| token_name | character varying(66) | | not null | | extended | | |
| raw_log | jsonb | | | | extended | | |
| log_idx | integer | | not null | | plain | | |
| tx_idx | integer | | not null | | plain | | |
| from_ | character varying(66) | | not null | | extended | | |
| to_ | character varying(66) | | not null | | extended | | |
| value_ | numeric | | not null | | main | | |
## API
[Postgraphile](https://www.graphile.org/postgraphile/) is used to expose GraphQL endpoints for our database schemas; this is described in detail [here](../staging/postgraphile/README.md).
Table "light_0x8dd5fbce2f6a956c3022ba3663759011dd51e73e.balanceof_method"
## Contributing
Contributions are welcome! For more on this, please see [here](../staging/documentation/contributing.md).
| Column | Type | Collation | Nullable | Default | Storage | Stats target | Description |
|:----------:|:---------------------:|:---------:|:--------:|:-------------------------------------------------------------------------------------------:|:--------:|:------------:|:-----------:|
| id | integer | | not null | nextval('light_0x8dd5fbce2f6a956c3022ba3663759011dd51e73e.balanceof_method_id_seq'::regclass) | plain | | |
| token_name | character varying(66) | | not null | | extended | | |
| block | integer | | not null | | plain | | |
| who_ | character varying(66) | | not null | | extended | | |
| returned | numeric | | not null | | main | | |
The addition of '_' after column names is to prevent collisions with reserved Postgres words.
Small note: If editing the Readme, please conform to the [standard-readme specification](https://github.com/RichardLitt/standard-readme).
Also notice that the contract address used for the schema name has been down-cased.
## composeAndExecute
The `composeAndExecute` command is used to compose and execute over an arbitrary set of custom transformers.
This is accomplished by generating a Go plugin which allows our `vulcanizedb` binary to link to external transformers, so
long as they abide by our standard [interfaces](../staging/libraries/shared/transformer).
This command requires Go 1.11+, and [Go plugins](https://golang.org/pkg/plugin/) only work on Unix-based systems.
### Writing custom transformers
Storage Transformers
* [Guide](../staging/libraries/shared/factories/storage/README.md)
* [Example](../staging/libraries/shared/factories/storage/EXAMPLE.md)
Event Transformers
* [Guide](../staging/libraries/shared/factories/event/README.md)
* [Example](https://github.com/vulcanize/ens_transformers/tree/working)
### composeAndExecute configuration
A .toml config file is specified when executing the command:
`./vulcanizedb composeAndExecute --config=./environments/config_name.toml`
The config provides information for composing a set of transformers:
```toml
[database]
name = "vulcanize_public"
hostname = "localhost"
user = "vulcanize"
password = "vulcanize"
port = 5432
[client]
ipcPath = "/Users/user/Library/Ethereum/geth.ipc"
[exporter]
home = "github.com/vulcanize/vulcanizedb"
name = "exampleTransformerExporter"
save = false
transformerNames = [
"transformer1",
"transformer2",
"transformer3",
"transformer4",
]
[exporter.transformer1]
path = "path/to/transformer1"
type = "eth_event"
repository = "github.com/account/repo"
migrations = "db/migrations"
rank = "0"
[exporter.transformer2]
path = "path/to/transformer2"
type = "eth_contract"
repository = "github.com/account/repo"
migrations = "db/migrations"
rank = "0"
[exporter.transformer3]
path = "path/to/transformer3"
type = "eth_event"
repository = "github.com/account/repo"
migrations = "db/migrations"
rank = "0"
[exporter.transformer4]
path = "path/to/transformer4"
type = "eth_storage"
repository = "github.com/account2/repo2"
migrations = "to/db/migrations"
rank = "1"
```
- `home` is the name of the package you are building the plugin for, in most cases this is github.com/vulcanize/vulcanizedb
- `name` is the name used for the plugin files (.so and .go)
- `save` indicates whether or not the user wants to save the .go file instead of removing it after .so compilation. Sometimes useful for debugging/trouble-shooting purposes.
- `transformerNames` is the list of the names of the transformers we are composing together, so we know how to access their submaps in the exporter map
- `exporter.<transformerName>`s are the sub-mappings containing config info for the transformers
- `repository` is the path for the repository which contains the transformer and its `TransformerInitializer`
- `path` is the relative path from `repository` to the transformer's `TransformerInitializer` directory (initializer package).
- Transformer repositories need to be cloned into the user's $GOPATH (`go get`)
- `type` is the type of the transformer, indicating which type of watcher it works with (for now, there are three options: `eth_event`, `eth_storage`, and `eth_contract`)
- `eth_storage` indicates the transformer works with the [storage watcher](../staging/libraries/shared/watcher/storage_watcher.go)
that fetches state and storage diffs from an ETH node (instead of, for example, from IPFS)
- `eth_event` indicates the transformer works with the [event watcher](../staging/libraries/shared/watcher/event_watcher.go)
that fetches event logs from an ETH node
- `eth_contract` indicates the transformer works with the [contract watcher](../staging/libraries/shared/watcher/contract_watcher.go)
that is made to work with [contract_watcher pkg](../staging/pkg/contract_watcher)
based transformers which work with either a light or full sync vDB to watch events and poll public methods ([example](https://github.com/vulcanize/ens_transformers/blob/working/transformers/domain_records/transformer.go))
- `migrations` is the relative path from `repository` to the db migrations directory for the transformer
- `rank` determines the order in which migrations are run, with lower ranked migrations running first
- this is to help isolate any potential conflicts between transformer migrations
- start at "0"
- use strings
- don't leave gaps
- transformers with identical migrations/migration paths should share the same rank
- Note: If any of the imported transformers need additional config variables those need to be included as well
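The `rank` rules above (string values, starting at "0", no gaps) can be checked with a small helper before running the command. This is a sketch only: `ValidateRanks` and its map argument (transformer name to rank string) are hypothetical and are not part of the `vulcanizedb` CLI.

```go
package main

import (
	"fmt"
	"strconv"
)

// ValidateRanks checks a mapping of transformer name -> rank string.
// Every rank must parse as a non-negative integer, and the set of
// distinct ranks must be exactly 0, 1, ..., k-1 (start at 0, no gaps).
// Several transformers may share the same rank.
func ValidateRanks(ranks map[string]string) error {
	seen := map[int]bool{}
	for name, r := range ranks {
		n, err := strconv.Atoi(r)
		if err != nil || n < 0 {
			return fmt.Errorf("transformer %s: rank %q is not a non-negative integer", name, r)
		}
		seen[n] = true
	}
	// With k distinct non-negative ranks, all of 0..k-1 must be present.
	for i := 0; i < len(seen); i++ {
		if !seen[i] {
			return fmt.Errorf("rank %d is missing: ranks must start at 0 and leave no gaps", i)
		}
	}
	return nil
}

func main() {
	// Ranks taken from the example config above.
	ranks := map[string]string{
		"transformer1": "0",
		"transformer2": "0",
		"transformer3": "0",
		"transformer4": "1",
	}
	if err := ValidateRanks(ranks); err != nil {
		fmt.Println("invalid ranks:", err)
		return
	}
	fmt.Println("ranks are valid")
}
```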
This information is used to write and build a Go plugin which exports the configured transformers.
These transformers are loaded onto their specified watchers and executed.
Transformers of different types can be run together in the same command using a single config file or in separate instances using different config files.
The general structure of a plugin .go file, and what we would see built with the above config, is shown below:
```go
package main
import (
interface1 "github.com/vulcanize/vulcanizedb/libraries/shared/transformer"
transformer1 "github.com/account/repo/path/to/transformer1"
transformer2 "github.com/account/repo/path/to/transformer2"
transformer3 "github.com/account/repo/path/to/transformer3"
transformer4 "github.com/account2/repo2/path/to/transformer4"
)
type exporter string
var Exporter exporter
// Export returns the configured transformer initializers, grouped by watcher type.
func (e exporter) Export() ([]interface1.EventTransformerInitializer, []interface1.StorageTransformerInitializer, []interface1.ContractTransformerInitializer) {
	return []interface1.EventTransformerInitializer{
			transformer1.TransformerInitializer,
			transformer3.TransformerInitializer,
		}, []interface1.StorageTransformerInitializer{
			transformer4.StorageTransformerInitializer,
		}, []interface1.ContractTransformerInitializer{
			transformer2.TransformerInitializer,
		}
}
```
### Preparing transformers to work as plugins for composeAndExecute
To plug in an external transformer we need to:
* Create a [package](https://github.com/vulcanize/ens_transformers/blob/working/transformers/registry/new_owner/initializer/initializer.go)
that exports a variable `TransformerInitializer`, `StorageTransformerInitializer`, or `ContractTransformerInitializer` that are of type [TransformerInitializer](../staging/libraries/shared/transformer/event_transformer.go#L33)
or [StorageTransformerInitializer](../staging/libraries/shared/transformer/storage_transformer.go#L31),
or [ContractTransformerInitializer](../staging/libraries/shared/transformer/contract_transformer.go#L31), respectively
* Design the transformers to work in the context of their [event](../staging/libraries/shared/watcher/event_watcher.go#L83),
[storage](../staging/libraries/shared/watcher/storage_watcher.go#L53),
or [contract](../staging/libraries/shared/watcher/contract_watcher.go#L68) watcher execution modes
* Create db migrations to run against vulcanizeDB so that we can store the transformer output
* Do not `goose fix` the transformer migrations
* Specify migration locations for each transformer in the config with the `exporter.transformer.migrations` fields
* If the base vDB migrations occupy this path as well, they need to be in their `goose fix`ed form
as they are [here](https://github.com/vulcanize/vulcanizedb/tree/master/db/migrations)
To update a plugin repository with changes to the core vulcanizedb repository, replace the vulcanizedb vendored in the plugin repo (`plugin_repo/vendor/github.com/vulcanize/vulcanizedb`)
with the newly updated version
* The entire vendor lib within the vendored vulcanizedb needs to be deleted (`plugin_repo/vendor/github.com/vulcanize/vulcanizedb/vendor`)
* These complications arise due to this [conflict](https://github.com/golang/go/issues/20481) between `dep` and Go plugins
## License
[AGPL-3.0](../staging/LICENSE) © Vulcanize Inc


@ -2,8 +2,8 @@
-- PostgreSQL database dump
--
-- Dumped from database version 10.5
-- Dumped by pg_dump version 10.4
-- Dumped from database version 10.6
-- Dumped by pg_dump version 10.6
SET statement_timeout = 0;
SET lock_timeout = 0;
@ -523,43 +523,6 @@ CREATE SEQUENCE public.uncles_id_seq
ALTER SEQUENCE public.uncles_id_seq OWNED BY public.uncles.id;
--
-- Name: uncles; Type: TABLE; Schema: public; Owner: -
--
CREATE TABLE public.uncles (
id integer NOT NULL,
hash character varying(66) NOT NULL,
block_id integer NOT NULL,
reward numeric NOT NULL,
miner character varying(42) NOT NULL,
raw jsonb,
block_timestamp numeric,
eth_node_id integer NOT NULL,
eth_node_fingerprint character varying(128)
);
--
-- Name: uncles_id_seq; Type: SEQUENCE; Schema: public; Owner: -
--
CREATE SEQUENCE public.uncles_id_seq
AS integer
START WITH 1
INCREMENT BY 1
NO MINVALUE
NO MAXVALUE
CACHE 1;
--
-- Name: uncles_id_seq; Type: SEQUENCE OWNED BY; Schema: public; Owner: -
--
ALTER SEQUENCE public.uncles_id_seq OWNED BY public.uncles.id;
--
-- Name: watched_contracts; Type: TABLE; Schema: public; Owner: -
--
@ -705,13 +668,6 @@ ALTER TABLE ONLY public.queued_storage ALTER COLUMN id SET DEFAULT nextval('publ
ALTER TABLE ONLY public.uncles ALTER COLUMN id SET DEFAULT nextval('public.uncles_id_seq'::regclass);
--
-- Name: uncles id; Type: DEFAULT; Schema: public; Owner: -
--
ALTER TABLE ONLY public.uncles ALTER COLUMN id SET DEFAULT nextval('public.uncles_id_seq'::regclass);
--
-- Name: watched_contracts contract_id; Type: DEFAULT; Schema: public; Owner: -
--
@ -871,22 +827,6 @@ ALTER TABLE ONLY public.uncles
ADD CONSTRAINT uncles_pkey PRIMARY KEY (id);
--
-- Name: uncles uncles_block_id_hash_key; Type: CONSTRAINT; Schema: public; Owner: -
--
ALTER TABLE ONLY public.uncles
ADD CONSTRAINT uncles_block_id_hash_key UNIQUE (block_id, hash);
--
-- Name: uncles uncles_pkey; Type: CONSTRAINT; Schema: public; Owner: -
--
ALTER TABLE ONLY public.uncles
ADD CONSTRAINT uncles_pkey PRIMARY KEY (id);
--
-- Name: watched_contracts watched_contracts_contract_hash_key; Type: CONSTRAINT; Schema: public; Owner: -
--
@ -1033,22 +973,6 @@ ALTER TABLE ONLY public.uncles
ADD CONSTRAINT uncles_eth_node_id_fkey FOREIGN KEY (eth_node_id) REFERENCES public.eth_nodes(id) ON DELETE CASCADE;
--
-- Name: uncles uncles_block_id_fkey; Type: FK CONSTRAINT; Schema: public; Owner: -
--
ALTER TABLE ONLY public.uncles
ADD CONSTRAINT uncles_block_id_fkey FOREIGN KEY (block_id) REFERENCES public.blocks(id) ON DELETE CASCADE;
--
-- Name: uncles uncles_eth_node_id_fkey; Type: FK CONSTRAINT; Schema: public; Owner: -
--
ALTER TABLE ONLY public.uncles
ADD CONSTRAINT uncles_eth_node_id_fkey FOREIGN KEY (eth_node_id) REFERENCES public.eth_nodes(id) ON DELETE CASCADE;
--
-- PostgreSQL database dump complete
--


@ -0,0 +1,152 @@
# composeAndExecute
The `composeAndExecute` command is used to compose and execute over an arbitrary set of custom transformers.
This is accomplished by generating a Go plugin which allows the `vulcanizedb` binary to link to external transformers, so
long as they abide by one of the standard [interfaces](../staging/libraries/shared/transformer).
This command requires Go 1.11+, and [Go plugins](https://golang.org/pkg/plugin/) only work on Unix-based systems.
## Writing custom transformers
Storage Transformers
* [Guide](../../staging/libraries/shared/factories/storage/README.md)
* [Example](../../staging/libraries/shared/factories/storage/EXAMPLE.md)
Event Transformers
* [Guide](../../staging/libraries/shared/factories/event/README.md)
* [Example 1](https://github.com/vulcanize/ens_transformers/tree/master/transformers/registar)
* [Example 2](https://github.com/vulcanize/ens_transformers/tree/master/transformers/registry)
* [Example 3](https://github.com/vulcanize/ens_transformers/tree/master/transformers/resolver)
Contract Transformers
* [Example 1](https://github.com/vulcanize/account_transformers)
* [Example 2](https://github.com/vulcanize/ens_transformers/tree/master/transformers/domain_records)
## Preparing transformers to work as a plugin for composeAndExecute
To plug in an external transformer we need to:
1. Create a package that exports a variable named `TransformerInitializer`, `StorageTransformerInitializer`, or `ContractTransformerInitializer` of type [TransformerInitializer](../staging/libraries/shared/transformer/event_transformer.go#L33),
[StorageTransformerInitializer](../../staging/libraries/shared/transformer/storage_transformer.go#L31),
or [ContractTransformerInitializer](../../staging/libraries/shared/transformer/contract_transformer.go#L31), respectively
2. Design the transformers to work in the context of their [event](../staging/libraries/shared/watcher/event_watcher.go#L83),
[storage](../../staging/libraries/shared/watcher/storage_watcher.go#L53),
or [contract](../../staging/libraries/shared/watcher/contract_watcher.go#L68) watcher execution modes
3. Create db migrations to run against vulcanizeDB so that we can store the transformer output
* Do not `goose fix` the transformer migrations; this ensures they are always run after the core vulcanizedb migrations, which are kept in their fixed form
* Specify migration locations for each transformer in the config with the `exporter.transformer.migrations` fields
* If the base vDB migrations occupy this path as well, they need to be in their `goose fix`ed form
as they are [here](../../staging/db/migrations)
To update a plugin repository with changes to the core vulcanizedb repository, replace the vulcanizedb vendored in the plugin repo (`plugin_repo/vendor/github.com/vulcanize/vulcanizedb`)
with the newly updated version
* The entire vendor lib within the vendored vulcanizedb needs to be deleted (`plugin_repo/vendor/github.com/vulcanize/vulcanizedb/vendor`)
* These complications arise due to this [conflict](https://github.com/golang/go/issues/20481) between `dep` and Go plugins
## Configuration
A .toml config file is specified when executing the command:
`./vulcanizedb composeAndExecute --config=./environments/config_name.toml`
The config provides information for composing a set of transformers:
```toml
[database]
name = "vulcanize_public"
hostname = "localhost"
user = "vulcanize"
password = "vulcanize"
port = 5432
[client]
ipcPath = "/Users/user/Library/Ethereum/geth.ipc"
[exporter]
home = "github.com/vulcanize/vulcanizedb"
name = "exampleTransformerExporter"
save = false
transformerNames = [
"transformer1",
"transformer2",
"transformer3",
"transformer4",
]
[exporter.transformer1]
path = "path/to/transformer1"
type = "eth_event"
repository = "github.com/account/repo"
migrations = "db/migrations"
rank = "0"
[exporter.transformer2]
path = "path/to/transformer2"
type = "eth_contract"
repository = "github.com/account/repo"
migrations = "db/migrations"
rank = "0"
[exporter.transformer3]
path = "path/to/transformer3"
type = "eth_event"
repository = "github.com/account/repo"
migrations = "db/migrations"
rank = "0"
[exporter.transformer4]
path = "path/to/transformer4"
type = "eth_storage"
repository = "github.com/account2/repo2"
migrations = "to/db/migrations"
rank = "1"
```
- `home` is the name of the package you are building the plugin for, in most cases this is github.com/vulcanize/vulcanizedb
- `name` is the name used for the plugin files (.so and .go)
- `save` indicates whether or not the user wants to save the .go file instead of removing it after .so compilation. Sometimes useful for debugging/trouble-shooting purposes.
- `transformerNames` is the list of the names of the transformers we are composing together, so we know how to access their submaps in the exporter map
- `exporter.<transformerName>`s are the sub-mappings containing config info for the transformers
- `repository` is the path for the repository which contains the transformer and its `TransformerInitializer`
- `path` is the relative path from `repository` to the transformer's `TransformerInitializer` directory (initializer package).
- Transformer repositories need to be cloned into the user's $GOPATH (`go get`)
- `type` is the type of the transformer, indicating which type of watcher it works with (for now, there are three options: `eth_event`, `eth_storage`, and `eth_contract`)
- `eth_storage` indicates the transformer works with the [storage watcher](../../staging/libraries/shared/watcher/storage_watcher.go)
that fetches state and storage diffs from an ETH node (instead of, for example, from IPFS)
- `eth_event` indicates the transformer works with the [event watcher](../../staging/libraries/shared/watcher/event_watcher.go)
that fetches event logs from an ETH node
- `eth_contract` indicates the transformer works with the [contract watcher](../staging/libraries/shared/watcher/contract_watcher.go)
that is made to work with [contract_watcher pkg](../../staging/pkg/contract_watcher)
based transformers which work with either a light or full sync vDB to watch events and poll public methods ([example1](https://github.com/vulcanize/account_transformers/tree/master/transformers/account/light), [example2](https://github.com/vulcanize/ens_transformers/tree/working/transformers/domain_records))
- `migrations` is the relative path from `repository` to the db migrations directory for the transformer
- `rank` determines the order in which migrations are run, with lower ranked migrations running first
- this is to help isolate any potential conflicts between transformer migrations
- start at "0"
- use strings
- don't leave gaps
- transformers with identical migrations/migration paths should share the same rank
- Note: If any of the imported transformers need additional config variables those need to be included as well
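The `rank` rules above (string values, starting at "0", no gaps) can be checked mechanically before a config is used. The following is a minimal sketch in Go; `checkRanks` and its map argument are hypothetical names introduced for illustration, not part of vulcanizedb.

```go
package main

import (
	"fmt"
	"sort"
	"strconv"
)

// checkRanks is a hypothetical helper: it takes a map of transformer name to
// its `rank` string and verifies that every rank is a non-negative integer
// string, that the lowest rank is "0", and that no rank level is skipped.
func checkRanks(ranks map[string]string) error {
	seen := map[int]bool{}
	for name, r := range ranks {
		n, err := strconv.Atoi(r)
		if err != nil || n < 0 {
			return fmt.Errorf("transformer %q: rank %q is not a non-negative integer string", name, r)
		}
		seen[n] = true
	}

	// Collect the distinct rank levels and sort them in ascending order.
	levels := make([]int, 0, len(seen))
	for n := range seen {
		levels = append(levels, n)
	}
	sort.Ints(levels)

	// With no gaps and a start at 0, the i-th distinct level must equal i.
	for i, n := range levels {
		if n != i {
			return fmt.Errorf("ranks must start at 0 with no gaps: expected %d, found %d", i, n)
		}
	}
	return nil
}

func main() {
	// The ranks from the example config above.
	ranks := map[string]string{
		"transformer1": "0",
		"transformer2": "0",
		"transformer3": "0",
		"transformer4": "1",
	}
	if err := checkRanks(ranks); err != nil {
		fmt.Println("invalid ranks:", err)
		return
	}
	fmt.Println("ranks are valid")
}
```

Several transformers may share a rank, as in the example config, so the check only looks at the distinct rank levels.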
This information is used to write and build a Go plugin which exports the configured transformers.
These transformers are loaded onto their specified watchers and executed.
Transformers of different types can be run together in the same command using a single config file, or in separate instances using different config files.
The general structure of a plugin .go file, and what we would see built with the above config, is shown below:
```go
package main
import (
interface1 "github.com/vulcanize/vulcanizedb/libraries/shared/transformer"
transformer1 "github.com/account/repo/path/to/transformer1"
transformer2 "github.com/account/repo/path/to/transformer2"
transformer3 "github.com/account/repo/path/to/transformer3"
transformer4 "github.com/account2/repo2/path/to/transformer4"
)
type exporter string
var Exporter exporter
// Multiple return values must be wrapped in parentheses.
func (e exporter) Export() ([]interface1.TransformerInitializer, []interface1.StorageTransformerInitializer, []interface1.ContractTransformerInitializer) {
	return []interface1.TransformerInitializer{
			transformer1.TransformerInitializer,
			transformer3.TransformerInitializer,
		}, []interface1.StorageTransformerInitializer{
			transformer4.StorageTransformerInitializer,
		}, []interface1.ContractTransformerInitializer{
			transformer2.ContractTransformerInitializer,
		}
}
```
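For orientation, a compiled plugin of this shape can be opened with Go's standard `plugin` package and its exported `Exporter` variable looked up by name. The sketch below only shows that generic mechanism; `lookupExporter` is a hypothetical helper, and how `vulcanizedb` itself loads the plugin is not shown here and may differ.

```go
package main

import (
	"fmt"
	"os"
	"plugin"
)

// lookupExporter is a hypothetical helper that opens a compiled plugin (.so)
// and looks up the exported variable named "Exporter", the symbol name used
// in the generated plugin file.
func lookupExporter(soPath string) (plugin.Symbol, error) {
	p, err := plugin.Open(soPath)
	if err != nil {
		return nil, fmt.Errorf("opening plugin %s: %v", soPath, err)
	}
	sym, err := p.Lookup("Exporter")
	if err != nil {
		return nil, fmt.Errorf("looking up Exporter in %s: %v", soPath, err)
	}
	return sym, nil
}

func main() {
	if len(os.Args) < 2 {
		fmt.Println("usage: loader <path/to/plugin.so>")
		return
	}
	sym, err := lookupExporter(os.Args[1])
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	// For an exported variable, Lookup returns a pointer to that variable.
	// A real loader would type-assert sym to an interface declaring Export()
	// with the transformer initializer types before calling it.
	fmt.Printf("found Exporter symbol of type %T\n", sym)
}
```

The plugin and the loading binary need to be built with the same Go toolchain and the same versions of shared packages, which is related to the vendoring steps described earlier.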


@ -0,0 +1,160 @@
# contractWatcher
The `contractWatcher` command is a built-in generic contract watcher. It can watch any and all events for a given contract provided the contract's ABI is available.
It also provides some state variable coverage by automating polling of public methods, with some restrictions:
1. The method must have two or fewer arguments
1. The method's arguments must all be of type address or bytes32 (hash)
1. The method must return a single value
This command operates in two modes, `light` and `full`, which require a light or full-synced vulcanizeDB, respectively.
This command requires the contract ABI be available on Etherscan if it is not provided in the config file by the user.
If method polling is turned on we require an archival node at the ETH ipc endpoint in our config, whether or not we are operating in `light` or `full` mode.
Otherwise we only need to connect to a full node.
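The three polling restrictions listed above can be expressed as a simple predicate. This is a minimal sketch in Go; the `method` struct and `pollable` function are hypothetical stand-ins for illustration and do not mirror the actual contractWatcher types.

```go
package main

import "fmt"

// method is a hypothetical, simplified description of a contract method,
// holding the Solidity type names of its arguments and return values.
type method struct {
	Name    string
	Inputs  []string
	Outputs []string
}

// pollable reports whether a method satisfies the polling restrictions:
// at most two arguments, every argument of type address or bytes32,
// and exactly one return value.
func pollable(m method) bool {
	if len(m.Inputs) > 2 || len(m.Outputs) != 1 {
		return false
	}
	for _, t := range m.Inputs {
		if t != "address" && t != "bytes32" {
			return false
		}
	}
	return true
}

func main() {
	methods := []method{
		{Name: "balanceOf", Inputs: []string{"address"}, Outputs: []string{"uint256"}},
		{Name: "allowance", Inputs: []string{"address", "address"}, Outputs: []string{"uint256"}},
		{Name: "transfer", Inputs: []string{"address", "uint256"}, Outputs: []string{"bool"}},
	}
	for _, m := range methods {
		fmt.Printf("%s pollable: %v\n", m.Name, pollable(m))
	}
}
```

Under these rules a method such as `balanceOf(address)` qualifies, while a method taking a `uint256` argument does not.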
## Configuration
This command takes a config of the form:
```toml
[database]
name = "vulcanize_public"
hostname = "localhost"
port = 5432
[client]
ipcPath = "/Users/user/Library/Ethereum/geth.ipc"
[contract]
network = ""
addresses = [
"contractAddress1",
"contractAddress2"
]
[contract.contractAddress1]
abi = 'ABI for contract 1'
startingBlock = 982463
[contract.contractAddress2]
abi = 'ABI for contract 2'
events = [
"event1",
"event2"
]
eventArgs = [
"arg1",
"arg2"
]
methods = [
"method1",
"method2"
]
methodArgs = [
"arg1",
"arg2"
]
startingBlock = 4448566
piping = true
```
- The `contract` section defines which contracts we want to watch and with which conditions.
- `network` is only necessary if the ABIs are not provided and need to be fetched from Etherscan.
- Empty or nil string indicates mainnet
- "ropsten", "kovan", and "rinkeby" indicate their respective networks
- `addresses` lists the contract addresses we are watching and is used to load their individual configuration parameters
- `contract.<contractAddress>` are the sub-mappings which contain the parameters specific to each contract address
- `abi` is the ABI for the contract; if none is provided the application will attempt to fetch one from Etherscan using the provided address and network
- `events` is the list of events to watch
- If this field is omitted or no events are provided then by default ALL events extracted from the ABI will be watched
- If event names are provided then only those events will be watched
- `eventArgs` is the list of arguments to filter events with
- If this field is omitted or no eventArgs are provided then by default watched events are not filtered by their argument values
- If eventArgs are provided then only those events which emit at least one of these values as an argument are watched
- `methods` is the list of methods to poll
- If this is omitted or no methods are provided then by default NO methods are polled
- If method names are provided then those methods will be polled, provided
1) Method has two or fewer arguments
1) Arguments are all of address or hash types
1) Method returns a single value
- `methodArgs` is the list of arguments to limit polling methods to
- If this field is omitted or no methodArgs are provided then by default methods will be polled with every combination of the appropriately typed values that have been collected from watched events
- If methodArgs are provided then only those values will be used to poll methods
- `startingBlock` is the block at which we want to begin watching the contract, usually the deployment block of that contract
- `piping` is a boolean flag which indicates whether or not we want to pipe method return values forward as arguments to subsequent method calls
At the very minimum, for each contract address an ABI and a starting block number need to be provided (or just the starting block if the ABI can be reliably fetched from Etherscan).
With just this information we will be able to watch all events at the contract, but with no additional filters and no method polling.
## Output
Transformed events and polled method results are committed to Postgres in schemas and tables generated according to the contract abi.
Schemas are created for each contract using the naming convention `<sync-type>_<lowercase contract-address>`.
Under this schema, tables are generated for watched events as `<lowercase event name>_event` and for polled methods as `<lowercase method name>_method`.
The 'method' and 'event' identifiers are appended to the table names to prevent collisions between methods and events of the same lowercase name.
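The naming convention above can be reproduced with a few string operations, which is handy when building queries against the generated tables. This is a minimal sketch in Go; `schemaName` and `tableName` are hypothetical helpers written for illustration, not functions from the vulcanizedb codebase.

```go
package main

import (
	"fmt"
	"strings"
)

// schemaName builds `<sync-type>_<lowercase contract-address>`.
func schemaName(syncType, contractAddr string) string {
	return syncType + "_" + strings.ToLower(contractAddr)
}

// tableName builds `<lowercase name>_<kind>`, where kind is "event" for a
// watched event or "method" for a polled method.
func tableName(name, kind string) string {
	return strings.ToLower(name) + "_" + kind
}

func main() {
	schema := schemaName("light", "0x8dd5fbCe2F6a956C3022bA3663759011Dd51e73E")
	fmt.Println(schema + "." + tableName("Transfer", "event"))
	fmt.Println(schema + "." + tableName("balanceOf", "method"))
}
```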
## Example:
Modify `./environments/example.toml` to replace the empty `ipcPath` with a path that points to an Ethereum JSON-RPC endpoint (e.g. a local geth node ipc path or an Infura url).
This endpoint should be for an archival eth node if we want to perform method polling as this configuration is currently set up to do. To work with a non-archival full node,
remove the `balanceOf` method from the `0x8dd5fbce2f6a956c3022ba3663759011dd51e73e` (TrueUSD) contract.
If you are operating a light sync vDB, run:
`./vulcanizedb contractWatcher --config=./environments/example.toml --mode=light`
If instead you are operating a full sync vDB and provided an archival node IPC path, run in full mode:
`./vulcanizedb contractWatcher --config=./environments/example.toml --mode=full`
This will run the contractWatcher and configure it to watch the contracts specified in the config file. Note that
by default we operate in `light` mode but the flag is included here to demonstrate its use.
The example config we link to in this example watches two contracts, the ENS Registry (0x314159265dD8dbb310642f98f50C066173C1259b) and TrueUSD (0x8dd5fbCe2F6a956C3022bA3663759011Dd51e73E).
Because the ENS Registry is configured with only an ABI and a starting block, we will watch all events for this contract and poll none of its methods. Note that the ENS Registry is an example
of a contract which does not have its ABI available over Etherscan and must have it included in the config file.
The TrueUSD contract is configured with two events (`Transfer` and `Mint`) and a single method (`balanceOf`), as such it will watch these two events and use any addresses it collects emitted from them
to poll the `balanceOf` method with those addresses at every block. Note that we do not provide an ABI for TrueUSD as its ABI can be fetched from Etherscan.
For the ENS contract, it produces and populates a schema with four tables:
`light_0x314159265dd8dbb310642f98f50c066173c1259b.newowner_event`
`light_0x314159265dd8dbb310642f98f50c066173c1259b.newresolver_event`
`light_0x314159265dd8dbb310642f98f50c066173c1259b.newttl_event`
`light_0x314159265dd8dbb310642f98f50c066173c1259b.transfer_event`
For the TrueUSD contract, it produces and populates a schema with three tables:
`light_0x8dd5fbce2f6a956c3022ba3663759011dd51e73e.transfer_event`
`light_0x8dd5fbce2f6a956c3022ba3663759011dd51e73e.mint_event`
`light_0x8dd5fbce2f6a956c3022ba3663759011dd51e73e.balanceof_method`
Column ids and types for these tables are generated based on the event and method argument names and types and method return types, resulting in tables such as:
Table "light_0x8dd5fbce2f6a956c3022ba3663759011dd51e73e.transfer_event"
| Column | Type | Collation | Nullable | Default | Storage | Stats target | Description |
|:----------:|:---------------------:|:---------:|:--------:|:-------------------------------------------------------------------------------------------:|:--------:|:------------:|:-----------:|
| id | integer | | not null | nextval('light_0x8dd5fbce2f6a956c3022ba3663759011dd51e73e.transfer_event_id_seq'::regclass) | plain | | |
| header_id | integer | | not null | | plain | | |
| token_name | character varying(66) | | not null | | extended | | |
| raw_log | jsonb | | | | extended | | |
| log_idx | integer | | not null | | plain | | |
| tx_idx | integer | | not null | | plain | | |
| from_ | character varying(66) | | not null | | extended | | |
| to_ | character varying(66) | | not null | | extended | | |
| value_ | numeric | | not null | | main | | |
Table "light_0x8dd5fbce2f6a956c3022ba3663759011dd51e73e.balanceof_method"
| Column | Type | Collation | Nullable | Default | Storage | Stats target | Description |
|:----------:|:---------------------:|:---------:|:--------:|:-------------------------------------------------------------------------------------------:|:--------:|:------------:|:-----------:|
| id | integer | | not null | nextval('light_0x8dd5fbce2f6a956c3022ba3663759011dd51e73e.balanceof_method_id_seq'::regclass) | plain | | |
| token_name | character varying(66) | | not null | | extended | | |
| block | integer | | not null | | plain | | |
| who_ | character varying(66) | | not null | | extended | | |
| returned | numeric | | not null | | main | | |
The addition of '_' after column names is to prevent collisions with reserved Postgres words.
Also notice that the contract address used for the schema name has been down-cased.


@ -0,0 +1,11 @@
# Contribution guidelines
Contributions are welcome! In addition to core contributions, developers are encouraged to build their own custom transformers which
can be run together with other custom transformers using the [composeAndExecute](../../staging/documentation/composeAndExecute.md) command.
## Creating a new migration file
1. `make new_migration NAME=add_columnA_to_table1`
- This will create a new timestamped migration file in `db/migrations`
1. Write the migration code in the created file, under the respective `goose` pragma
- Goose automatically runs each migration in a transaction; don't add `BEGIN` and `COMMIT` statements.
1. Core migrations should be committed in their `goose fix`ed form.

26
documentation/sync.md Normal file

@ -0,0 +1,26 @@
# Syncing commands
These commands are used to sync raw Ethereum data into Postgres.
## lightSync
Syncs VulcanizeDB with the configured Ethereum node, populating only block headers.
This command is useful when you want a minimal baseline from which to track targeted data on the blockchain (e.g. individual smart contract storage values or event logs).
1. Start Ethereum node
1. In a separate terminal start VulcanizeDB:
- `./vulcanizedb lightSync --config <config.toml> --starting-block-number <block-number>`
## sync
Syncs VulcanizeDB with the configured Ethereum node, populating blocks, transactions, receipts, and logs.
This command is useful when you want to maintain a broad cache of what's happening on the blockchain.
1. Start Ethereum node (**if fast syncing your Ethereum node, wait for initial sync to finish**)
1. In a separate terminal start VulcanizeDB:
- `./vulcanizedb sync --config <config.toml> --starting-block-number <block-number>`
## coldImport
Sync VulcanizeDB from the LevelDB underlying a Geth node.
1. Ensure the node is not running, and that it has synced to the desired block height.
1. Start VulcanizeDB:
- `./vulcanizedb coldImport --config <config.toml>`
1. Optional flags:
- `--starting-block-number <block number>`/`-s <block number>`: block number to start syncing from
- `--ending-block-number <block number>`/`-e <block number>`: block number to sync to
- `--all`/`-a`: sync all missing blocks


@ -43,11 +43,15 @@ var FakeHeader = core.Header{
}
func GetFakeHeader(blockNumber int64) core.Header {
return GetFakeHeaderWithTimestamp(fakeTimestamp, blockNumber)
}
func GetFakeHeaderWithTimestamp(timestamp, blockNumber int64) core.Header {
return core.Header{
Hash: FakeHash.String(),
BlockNumber: blockNumber,
Raw: rawFakeHeader,
Timestamp: strconv.FormatInt(fakeTimestamp, 10),
Timestamp: strconv.FormatInt(timestamp, 10),
}
}


@ -23,7 +23,7 @@
"dependencies": {
"express-session": "1.15.6",
"graphql-subscriptions": "0.5.8",
"lodash": "4.17.10",
"lodash": ">=4.17.11",
"passport": "0.4.0",
"pg": "6.4.2",
"pg-native": "3.0.0",
@ -48,7 +48,7 @@
"typescript": "3.0.1",
"webpack": "4.17.1",
"webpack-cli": "3.1.0",
"webpack-dev-server": "3.1.6"
"webpack-dev-server": ">=3.1.11"
},
"resolutions": {
"pg": "6.4.2"

File diff suppressed because it is too large Load Diff