Server backend for indexed ETH IPLD objects

Vulcanize DB

Join the chat at https://gitter.im/vulcanizeio/VulcanizeDB


About

Vulcanize DB is a set of tools that make it easier for developers to write application-specific indexes and caches for dapps built on Ethereum.

Dependencies

Project Setup

Using Vulcanize for the first time requires completing several steps in order. The following instructions walk through each of them:

  1. Fetching the project
  2. Installing dependencies
  3. Configuring shell environment
  4. Database setup
  5. Configuring synced Ethereum node integration
  6. Data syncing

Installation

In order to fetch the project codebase for local use or modification, install it to your GOPATH via:

go get github.com/vulcanize/vulcanizedb
go get gopkg.in/DataDog/dd-trace-go.v1/ddtrace

Once fetched, dependencies can be installed via go get or (the preferred method) at specific versions via golang/dep, the prototype Go package manager. Installation instructions are in the golang/dep README.

To install packages with dep, make sure you are in the project directory within your GOPATH (default location is ~/go/src/github.com/vulcanize/vulcanizedb/) and run:

dep ensure

After dep finishes, dependencies should be installed within your GOPATH at the versions specified in Gopkg.toml.

Lastly, ensure that GOPATH is defined in your shell. If necessary, GOPATH can be set in ~/.bashrc or ~/.bash_profile, depending upon your system. It can be additionally helpful to add $GOPATH/bin to your shell's $PATH.
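
As a quick way to check the shell setup described above, the following minimal sketch prints the GOPATH that the Go toolchain resolves and reports whether its bin directory is on PATH. The helper name binOnPath is made up for this example, and the sketch assumes GOPATH holds a single directory rather than a list.

```go
package main

import (
	"fmt"
	"go/build"
	"os"
	"path/filepath"
)

// binOnPath reports whether <gopath>/bin appears in the PATH-style list pathEnv.
// Hypothetical helper; it assumes gopath is a single directory.
func binOnPath(gopath, pathEnv string) bool {
	want := filepath.Join(gopath, "bin")
	for _, dir := range filepath.SplitList(pathEnv) {
		if dir != "" && filepath.Clean(dir) == want {
			return true
		}
	}
	return false
}

func main() {
	// build.Default.GOPATH is $GOPATH when set, otherwise Go's default location.
	gopath := build.Default.GOPATH
	fmt.Println("GOPATH:", gopath)
	fmt.Println("GOPATH/bin on PATH:", binOnPath(gopath, os.Getenv("PATH")))
}
```

If the second line prints false, adding $GOPATH/bin to PATH in ~/.bashrc or ~/.bash_profile makes installed Go binaries available by name.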

Setting up the Database

  1. Install Postgres

  2. Create a superuser for yourself and make sure psql --list works without prompting for a password.

  3. createdb vulcanize_public

  4. cd $GOPATH/src/github.com/vulcanize/vulcanizedb

  5. Run the migrations: make migrate HOST_NAME=localhost NAME=vulcanize_public PORT=5432

    • To roll back a single step: make rollback NAME=vulcanize_public
    • To roll back to a certain migration: make rollback_to MIGRATION=n NAME=vulcanize_public
    • To see status of migrations: make migration_status NAME=vulcanize_public
    • See below for configuring additional environments
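
The HOST_NAME, PORT and NAME variables passed to make together identify one Postgres database. The sketch below shows one plausible way those three values combine into a connection URL. The exact string the Makefile builds is not shown in this README, so the postgresql scheme, the sslmode=disable parameter and the function name connectString are all assumptions made for illustration.

```go
package main

import (
	"fmt"
	"net"
	"net/url"
	"strconv"
)

// connectString builds a Postgres connection URL from host, port and
// database name. The sslmode=disable query parameter is an assumption
// suited to a local development database.
func connectString(host string, port int, name string) string {
	u := url.URL{
		Scheme:   "postgresql",
		Host:     net.JoinHostPort(host, strconv.Itoa(port)),
		Path:     "/" + name,
		RawQuery: "sslmode=disable",
	}
	return u.String()
}

func main() {
	// Same values as in the "make migrate" example above.
	fmt.Println(connectString("localhost", 5432, "vulcanize_public"))
}
```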

Create a migration file

  1. make new_migration NAME=add_columnA_to_table1
    • This will create a new timestamped migration file in db/migrations
  2. Write the migration code in the created file, under the respective goose pragma
    • Goose automatically runs each migration in a transaction; don't add BEGIN and COMMIT statements.
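
To illustrate the layout of a migration file, the sketch below renders a file body with the two goose pragmas, "-- +goose Up" and "-- +goose Down", each followed by its SQL. The function name migrationBody, the table table1 and the column column_a are hypothetical and only mirror the add_columnA_to_table1 example name. Note that the SQL contains no BEGIN or COMMIT.

```go
package main

import "fmt"

// migrationBody lays out a goose SQL migration: the statements under
// "+goose Up" apply the change and those under "+goose Down" revert it.
func migrationBody(up, down string) string {
	return "-- +goose Up\n" + up + "\n\n-- +goose Down\n" + down + "\n"
}

func main() {
	// Hypothetical table and column names, for illustration only.
	fmt.Print(migrationBody(
		"ALTER TABLE table1 ADD COLUMN column_a INTEGER;",
		"ALTER TABLE table1 DROP COLUMN column_a;",
	))
}
```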

Configuration

  • To use a local Ethereum node, copy environments/public.toml.example to environments/public.toml and update the ipcPath and levelDbPath.

    • ipcPath should match the local node's IPC filepath:

      • For Geth:

        • The IPC file is called geth.ipc.
        • The geth IPC file path is printed to the console when you start geth.
        • The default location is:
          • Mac: <full home path>/Library/Ethereum/geth.ipc
          • Linux: <full home path>/.ethereum/geth.ipc
      • For Parity:

        • The IPC file is called jsonrpc.ipc.
        • The default location is:
          • Mac: <full home path>/Library/Application\ Support/io.parity.ethereum/
          • Linux: <full home path>/.local/share/io.parity.ethereum/
    • levelDbPath should match Geth's chaindata directory path.

      • The geth LevelDB chaindata path is printed to the console when you start geth.
      • The default location is:
        • Mac: <full home path>/Library/Ethereum/geth/chaindata
        • Linux: <full home path>/.ethereum/geth/chaindata
      • levelDbPath is irrelevant (and coldImport is currently unavailable) if only running parity.
  • See environments/infura.toml to configure commands to run against infura, if a local node is unavailable.

  • Copy environments/local.toml.example to environments/local.toml to configure commands to run against a local node such as Ganache or ganache-cli.
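
When filling in ipcPath for a local Geth node, it can help to compute the default location and check that the file exists. The sketch below does that under the assumption that geth runs with its default data directory (~/.ethereum on Linux, ~/Library/Ethereum on macOS). The function name defaultGethIPC is made up for this example, and a node started with a custom data directory will have its IPC file elsewhere.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"runtime"
)

// defaultGethIPC returns geth's default IPC file location for the given
// operating system name (as reported by runtime.GOOS) and home directory.
func defaultGethIPC(goos, home string) string {
	if goos == "darwin" {
		return filepath.Join(home, "Library", "Ethereum", "geth.ipc")
	}
	// Treat everything else like Linux for the purposes of this sketch.
	return filepath.Join(home, ".ethereum", "geth.ipc")
}

func main() {
	home, err := os.UserHomeDir()
	if err != nil {
		fmt.Println("cannot determine home directory:", err)
		return
	}
	ipc := defaultGethIPC(runtime.GOOS, home)
	fmt.Println("expected ipcPath:", ipc)
	if _, err := os.Stat(ipc); err != nil {
		fmt.Println("IPC file not found; is geth running with the default data directory?")
	}
}
```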

Start syncing with postgres

Syncs VulcanizeDB with the configured Ethereum node, populating blocks, transactions, receipts, and logs. This command is useful when you want to maintain a broad cache of what's happening on the blockchain.

  1. Start Ethereum node (if fast syncing your Ethereum node, wait for initial sync to finish)
  2. In a separate terminal start VulcanizeDB:
    • ./vulcanizedb sync --config <config.toml> --starting-block-number <block-number>

Alternatively, sync from Geth's underlying LevelDB

Sync VulcanizeDB from the LevelDB underlying a Geth node.

  1. Ensure the node is not running and that it has synced to the desired block height.
  2. Start VulcanizeDB
    • ./vulcanizedb coldImport --config <config.toml>
  3. Optional flags:
    • --starting-block-number <block number>/-s <block number>: block number to start syncing from
    • --ending-block-number <block number>/-e <block number>: block number to sync to
    • --all/-a: sync all missing blocks

Alternatively, sync in "light" mode

Syncs VulcanizeDB with the configured Ethereum node, populating only block headers. This command is useful when you want a minimal baseline from which to track targeted data on the blockchain (e.g. individual smart contract storage values).

  1. Start Ethereum node
  2. In a separate terminal start VulcanizeDB:
    • ./vulcanizedb lightSync --config <config.toml> --starting-block-number <block-number>

Start the full environment in Docker with a single command

Geth Rinkeby

 make command        | description
---------------------+--------------------------------------------------------------------------------------------------------
 rinkeby_env_up      | start geth, postgres, and rolling migrations; once migrations finish, start the vulcanizedb container
 rinkeby_env_deploy  | build and run the vulcanizedb container in the rinkeby environment
 rinkeby_env_migrate | build and run the rinkeby env migrations
 rinkeby_env_down    | stop and remove all rinkeby env containers

A successful run of the VulcanizeDB container requires a full geth state sync. Attach to the geth console and check the sync state:

$ docker exec -it rinkeby_vulcanizedb_geth geth --rinkeby attach
...
> eth.syncing
false

If you already have the full rinkeby chaindata, you can move it to the rinkeby_vulcanizedb_geth_data docker volume to skip the long wait for the sync.
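
The same check can be scripted against the node's JSON-RPC interface: eth_syncing returns false once the node has caught up and a progress object while it is still syncing. The sketch below only interprets a reply body. How the reply is fetched (for example, an HTTP POST to an RPC endpoint) depends on how the geth container is configured, which this README does not state, so main runs on two canned replies. The function name isSynced is hypothetical.

```go
package main

import (
	"bytes"
	"encoding/json"
	"errors"
	"fmt"
)

// rpcResponse holds the parts of a JSON-RPC reply this sketch needs.
type rpcResponse struct {
	Result json.RawMessage `json:"result"`
	Error  *struct {
		Message string `json:"message"`
	} `json:"error"`
}

// isSynced interprets an eth_syncing reply: a result of false means the
// node is fully synced; any other result means a sync is in progress.
func isSynced(body []byte) (bool, error) {
	var resp rpcResponse
	if err := json.Unmarshal(body, &resp); err != nil {
		return false, err
	}
	if resp.Error != nil {
		return false, fmt.Errorf("rpc error: %s", resp.Error.Message)
	}
	if len(resp.Result) == 0 {
		return false, errors.New("reply has no result field")
	}
	return string(bytes.TrimSpace(resp.Result)) == "false", nil
}

func main() {
	// Canned replies standing in for real responses from the node.
	replies := []string{
		`{"jsonrpc":"2.0","id":1,"result":false}`,
		`{"jsonrpc":"2.0","id":1,"result":{"startingBlock":"0x0","currentBlock":"0x10","highestBlock":"0x20"}}`,
	}
	for _, r := range replies {
		synced, err := isSynced([]byte(r))
		fmt.Println("synced:", synced, "error:", err)
	}
}
```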

Running the Tests

  • createdb vulcanize_private will create the test db
  • make migrate NAME=vulcanize_private will run the db migrations
  • make test will run the unit tests and skip the integration tests
  • make integrationtest will run just the integration tests

Deploying

  1. Make sure an ssh agent is running and your ssh key has been added to it.
  2. go get -u github.com/pressly/sup/cmd/sup
  3. sup staging deploy

Contract Watchers

Contract watchers work with a light or full sync vDB to fetch raw Ethereum data and execute a set of transformations over it, persisting the output.

A watcher is composed of at least a fetcher and a transformer or set of transformers, where a fetcher is an interface for retrieving raw Ethereum data from some source (e.g. eth_jsonrpc, IPFS) and a transformer is an interface for filtering through that raw Ethereum data to extract, process, and persist data for specific contracts or accounts.

omniWatcher

The omniWatcher command is a built-in generic contract watcher. It can watch any and all events for a given contract provided the contract's ABI is available. It also provides some state variable coverage by automating polling of public methods, with some restrictions.

This command requires a pre-synced (full or light) VulcanizeDB (see above sections) and currently requires that the contract ABI be available on Etherscan or provided by the user.

To watch all events of a contract using a light synced vDB:
- Execute ./vulcanizedb omniWatcher --config <path to config.toml> --contract-address <contract address>

Or if you are using a full synced vDB, change the mode to full:
- Execute ./vulcanizedb omniWatcher --mode full --config <path to config.toml> --contract-address <contract address>

To watch contracts on a network other than mainnet, use the network flag:
- Execute ./vulcanizedb omniWatcher --config <path to config.toml> --contract-address <contract address> --network <ropsten, kovan, or rinkeby>

To watch events starting at a certain block, use the starting block flag:
- Execute ./vulcanizedb omniWatcher --config <path to config.toml> --contract-address <contract address> --starting-block-number <#>

To watch only specified events use the events flag:
- Execute ./vulcanizedb omniWatcher --config <path to config.toml> --contract-address <contract address> --events <EventName1> --events <EventName2>

To watch events and poll the specified methods with any addresses and hashes emitted by the watched events utilize the methods flag:
- Execute ./vulcanizedb omniWatcher --config <path to config.toml> --contract-address <contract address> --methods <methodName1> --methods <methodName2>

To watch specified events and poll the specified method with any addresses and hashes emitted by the watched events:
- Execute ./vulcanizedb omniWatcher --config <path to config.toml> --contract-address <contract address> --events <EventName1> --events <EventName2> --methods <methodName>

To turn on method piping so that values returned from previous method calls are cached and used as arguments in subsequent method calls:
- Execute ./vulcanizedb omniWatcher --config <path to config.toml> --piping true --contract-address <contract address> --events <EventName1> --events <EventName2> --methods <methodName>

To watch all types of events of the contract but only persist the ones that emit one of the filtered-for argument values:
- Execute ./vulcanizedb omniWatcher --config <path to config.toml> --contract-address <contract address> --event-args <arg1> --event-args <arg2>

To watch all events of the contract but only poll the specified method with specified argument values (if they are emitted from the watched events):
- Execute ./vulcanizedb omniWatcher --config <path to config.toml> --contract-address <contract address> --methods <methodName> --method-args <arg1> --method-args <arg2>
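
Because --events and --methods are repeated once per value, it can be convenient to assemble the argument list programmatically when scripting the watcher. The following is a minimal sketch that uses only flags shown above. The function name omniWatcherArgs is made up for this example, and the resulting slice could be handed to exec.Command from os/exec.

```go
package main

import (
	"fmt"
	"strings"
)

// omniWatcherArgs builds the omniWatcher argument list, repeating
// --events and --methods once for every requested name.
func omniWatcherArgs(config, address string, events, methods []string) []string {
	args := []string{"omniWatcher", "--config", config, "--contract-address", address}
	for _, e := range events {
		args = append(args, "--events", e)
	}
	for _, m := range methods {
		args = append(args, "--methods", m)
	}
	return args
}

func main() {
	args := omniWatcherArgs(
		"./environments/public.toml",
		"0x8dd5fbce2f6a956c3022ba3663759011dd51e73e",
		[]string{"Transfer", "Mint"},
		[]string{"balanceOf"},
	)
	// Print the command line instead of running it.
	fmt.Println("./vulcanizedb " + strings.Join(args, " "))
}
```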

omniWatcher output

Transformed events and polled method results are committed to Postgres in schemas and tables generated according to the contract ABI.

Schemas are created for each contract using the naming convention <sync-type>_<lowercase contract-address>.
Under this schema, tables are generated for watched events as <lowercase event name>_event and for polled methods as <lowercase method name>_method.
The 'method' and 'event' identifiers are tacked onto the end of the table names to prevent collisions between methods and events of the same lowercase name.

Example:

Running

./vulcanizedb omniWatcher --config <path to config> --starting-block-number=5197514 --contract-address=0x8dd5fbce2f6a956c3022ba3663759011dd51e73e --events=Transfer --events=Mint --methods=balanceOf

watches Transfer and Mint events of the TrueUSD contract and polls its balanceOf method using the addresses emitted from those events.

It produces and populates a schema with three tables:

light_0x8dd5fbce2f6a956c3022ba3663759011dd51e73e.transfer_event
light_0x8dd5fbce2f6a956c3022ba3663759011dd51e73e.mint_event
light_0x8dd5fbce2f6a956c3022ba3663759011dd51e73e.balanceof_method
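
The naming convention can be expressed as a few string helpers, which is handy when building queries against the generated tables. This is a minimal sketch of the convention as described above, not the code omniWatcher itself uses, and the function names are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// schemaName follows <sync-type>_<lowercase contract-address>.
func schemaName(syncType, address string) string {
	return syncType + "_" + strings.ToLower(address)
}

// eventTable follows <schema>.<lowercase event name>_event.
func eventTable(schema, event string) string {
	return schema + "." + strings.ToLower(event) + "_event"
}

// methodTable follows <schema>.<lowercase method name>_method.
func methodTable(schema, method string) string {
	return schema + "." + strings.ToLower(method) + "_method"
}

func main() {
	schema := schemaName("light", "0x8dd5fbce2f6a956c3022ba3663759011dd51e73e")
	fmt.Println(eventTable(schema, "Transfer"))
	fmt.Println(eventTable(schema, "Mint"))
	fmt.Println(methodTable(schema, "balanceOf"))
}
```

Run as is, this prints the three table names listed in the example.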

Column ids and types for these tables are generated based on the event and method argument names and types and method return types, resulting in tables such as:

Table "light_0x8dd5fbce2f6a956c3022ba3663759011dd51e73e.transfer_event"

 Column     | Type                  | Collation | Nullable | Default                                                                                     | Storage  | Stats target | Description
------------+-----------------------+-----------+----------+---------------------------------------------------------------------------------------------+----------+--------------+-------------
 id         | integer               |           | not null | nextval('light_0x8dd5fbce2f6a956c3022ba3663759011dd51e73e.transfer_event_id_seq'::regclass) | plain    |              |
 header_id  | integer               |           | not null |                                                                                             | plain    |              |
 token_name | character varying(66) |           | not null |                                                                                             | extended |              |
 raw_log    | jsonb                 |           |          |                                                                                             | extended |              |
 log_idx    | integer               |           | not null |                                                                                             | plain    |              |
 tx_idx     | integer               |           | not null |                                                                                             | plain    |              |
 from_      | character varying(66) |           | not null |                                                                                             | extended |              |
 to_        | character varying(66) |           | not null |                                                                                             | extended |              |
 value_     | numeric               |           | not null |                                                                                             | main     |              |

and

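Table "light_0x8dd5fbce2f6a956c3022ba3663759011dd51e73e.balanceof_method"

 Column     | Type                  | Collation | Nullable | Default                                                                                       | Storage  | Stats target | Description
------------+-----------------------+-----------+----------+-----------------------------------------------------------------------------------------------+----------+--------------+-------------
 id         | integer               |           | not null | nextval('light_0x8dd5fbce2f6a956c3022ba3663759011dd51e73e.balanceof_method_id_seq'::regclass) | plain    |              |
 token_name | character varying(66) |           | not null |                                                                                               | extended |              |
 block      | integer               |           | not null |                                                                                               | plain    |              |
 who_       | character varying(66) |           | not null |                                                                                               | extended |              |
 returned   | numeric               |           | not null |                                                                                               | main     |              |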

The '_' appended to column names generated from event and method arguments (e.g. from_, to_, value_) prevents collisions with reserved Postgres words.

composeAndExecute

The composeAndExecute command is used to compose and execute over an arbitrary set of custom transformers. This is accomplished by generating a Go plugin which allows our vulcanizedb binary to link to external transformers, so long as they abide by our standard interfaces.

This command requires Go 1.11+, and Go plugins only work on Unix-based systems.

Writing custom transformers

Storage Transformers

Event Transformers

  • Guide
  • Example

composeAndExecute configuration

A .toml config file is specified when executing the command:
./vulcanizedb composeAndExecute --config=./environments/config_name.toml

The config provides information for composing a set of transformers:

[database]
    name     = "vulcanize_public"
    hostname = "localhost"
    user     = "vulcanize"
    password = "vulcanize"
    port     = 5432

[client]
    ipcPath  = "http://kovan0.vulcanize.io:8545"

[exporter]
    home     = "github.com/vulcanize/vulcanizedb"
    clone    = false
    name     = "exampleTransformerExporter"
    save     = false
    transformerNames = [
        "transformer1",
        "transformer2",
        "transformer3",
        "transformer4",
    ]
    [exporter.transformer1]
        path = "path/to/transformer1"
        type = "eth_event"
        repository = "github.com/account/repo"
        migrations = "db/migrations"
    [exporter.transformer2]
        path = "path/to/transformer2"
        type = "eth_event"
        repository = "github.com/account/repo"
        migrations = "db/migrations"
    [exporter.transformer3]
        path = "path/to/transformer3"
        type = "eth_event"
        repository = "github.com/account/repo"
        migrations = "db/migrations"
    [exporter.transformer4]
        path = "path/to/transformer4"
        type = "eth_storage"
        repository = "github.com/account2/repo2"
        migrations = "to/db/migrations"
  • home is the name of the package you are building the plugin for; in most cases this is github.com/vulcanize/vulcanizedb
  • clone signifies whether or not to retrieve transformer packages by cloning them; by default we attempt to work with transformer packages located in our $GOPATH, but setting this to true overrides that. This needs to be set to true for the configs used in tests in order for them to work with Travis.
  • name is the name used for the plugin files (.so and .go)
  • save indicates whether or not the user wants to save the .go file instead of removing it after .so compilation. Sometimes useful for debugging/trouble-shooting purposes.
  • transformerNames is the list of the names of the transformers we are composing together, so we know how to access their submaps in the exporter map
  • exporter.<transformerName>s are the sub-mappings containing config info for the transformers
    • repository is the path for the repository which contains the transformer and its TransformerInitializer
    • path is the relative path from repository to the transformer's TransformerInitializer directory (initializer package)
    • type is the type of the transformer; indicating which type of watcher it works with (for now, there are only two options: eth_event and eth_storage)
      • eth_storage indicates the transformer works with the storage watcher that fetches state and storage diffs from an ETH node (instead of, for example, from IPFS)
      • eth_event indicates the transformer works with the event watcher that fetches event logs from an ETH node
    • migrations is the relative path from repository to the db migrations directory for the transformer
  • Note: If any of the imported transformers need additional config variables those need to be included as well

This information is used to write and build a Go plugin which exports the configured transformers. These transformers are loaded onto their specified watchers and executed.

Transformers of different types can be run together in the same command using a single config file, or in separate instances using different config files.
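
To make the relationship between the config fields and the generated plugin concrete, the sketch below derives the import path and the expected exported variable for one transformer entry. It is an illustration under stated assumptions rather than the generator's actual code: the transformerConfig struct and both function names are hypothetical. The join of repository and path matches the import lines in the plugin file for the example config.

```go
package main

import (
	"fmt"
	"path"
)

// transformerConfig mirrors one [exporter.<transformerName>] table.
// The struct and its field names are hypothetical, chosen for this sketch.
type transformerConfig struct {
	Path       string // relative path from the repository to the initializer package
	Type       string // "eth_event" or "eth_storage"
	Repository string // repository that contains the transformer
	Migrations string // relative path from the repository to the migrations
}

// importPath joins repository and path into the Go import path of the
// transformer's initializer package.
func importPath(t transformerConfig) string {
	return path.Join(t.Repository, t.Path)
}

// initializerName returns the variable the package is expected to export
// for its watcher type, or an error for an unknown type.
func initializerName(t transformerConfig) (string, error) {
	switch t.Type {
	case "eth_event":
		return "TransformerInitializer", nil
	case "eth_storage":
		return "StorageTransformerInitializer", nil
	default:
		return "", fmt.Errorf("unknown transformer type %q", t.Type)
	}
}

func main() {
	// Values taken from [exporter.transformer4] in the example config.
	t4 := transformerConfig{
		Path:       "path/to/transformer4",
		Type:       "eth_storage",
		Repository: "github.com/account2/repo2",
		Migrations: "to/db/migrations",
	}
	name, err := initializerName(t4)
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Printf("import %q, exported variable %s\n", importPath(t4), name)
}
```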

The general structure of a plugin .go file, and what we would see built with the above config, is shown below:

package main

import (
	interface1 "github.com/vulcanize/vulcanizedb/libraries/shared/transformer"
	transformer1 "github.com/account/repo/path/to/transformer1"
	transformer2 "github.com/account/repo/path/to/transformer2"
	transformer3 "github.com/account/repo/path/to/transformer3"
	transformer4 "github.com/account2/repo2/path/to/transformer4"
)

type exporter string

var Exporter exporter

// Export returns the event transformer initializers and the storage
// transformer initializers configured for this plugin.
func (e exporter) Export() ([]interface1.TransformerInitializer, []interface1.StorageTransformerInitializer) {
	return []interface1.TransformerInitializer{
			transformer1.TransformerInitializer,
			transformer2.TransformerInitializer,
			transformer3.TransformerInitializer,
		}, []interface1.StorageTransformerInitializer{
			transformer4.StorageTransformerInitializer,
		}
}
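
On the loading side, a compiled plugin can be opened with Go's standard plugin package and its exported Exporter variable looked up by name. The sketch below shows only that generic mechanism. How vulcanizedb itself loads the plugin and asserts the symbol to an interface is not shown in this README, and the function name lookupExporter is hypothetical.

```go
package main

import (
	"fmt"
	"os"
	"plugin"
)

// lookupExporter opens the shared object at soPath and returns the symbol
// named "Exporter". For an exported variable, the symbol is a pointer to it.
func lookupExporter(soPath string) (plugin.Symbol, error) {
	p, err := plugin.Open(soPath)
	if err != nil {
		return nil, fmt.Errorf("open plugin: %w", err)
	}
	sym, err := p.Lookup("Exporter")
	if err != nil {
		return nil, fmt.Errorf("lookup Exporter: %w", err)
	}
	return sym, nil
}

func main() {
	if len(os.Args) < 2 {
		fmt.Println("usage: loader <path to plugin .so>")
		return
	}
	sym, err := lookupExporter(os.Args[1])
	if err != nil {
		fmt.Println(err)
		return
	}
	// A real caller would type-assert sym to an interface with an Export method.
	fmt.Printf("found Exporter symbol of type %T\n", sym)
}
```

A plugin and the binary that loads it need to be built with the same Go toolchain and the same versions of any shared packages, which is one reason the plugin is generated against the home package named in the config.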

Preparing transformer(s) to work as plugins for composeAndExecute

To plug in an external transformer we need to:

  • create a package that exports a variable named TransformerInitializer or StorageTransformerInitializer, of type TransformerInitializer or StorageTransformerInitializer respectively
  • design the transformers to work in the context of their event or storage watchers
  • create db migrations to run against vulcanizeDB so that we can store the transformer output
    • specify migration locations for each transformer in the config with the exporter.transformer.migrations fields
    • do not goose fix the transformer migrations