# ipfs-blockchain-watcher architecture
1. [Processes](#processes)
1. [Command](#command)
1. [Configuration](#config)
1. [Database](#database)
1. [APIs](#apis)
1. [Resync](#resync)
1. [IPFS Considerations](#ipfs-considerations)

## Processes
ipfs-blockchain-watcher is a [service](../pkg/super_node/service.go#L61) composed of the following interfaces:

* [Payload Fetcher](../pkg/super_node/shared/interfaces.go#L29): Fetches raw chain data from a half-duplex endpoint (HTTP/IPC); used for fetching historical data. ([BTC](../../pkg/super_node/btc/payload_fetcher.go), [ETH](../../pkg/super_node/eth/payload_fetcher.go)).
* [Payload Streamer](../pkg/super_node/shared/interfaces.go#L24): Streams raw chain data from a full-duplex endpoint (WebSocket/IPC); used for syncing data at the head of the chain in real time. ([BTC](../../pkg/super_node/btc/http_streamer.go), [ETH](../../pkg/super_node/eth/streamer.go)).
* [Payload Converter](../pkg/super_node/shared/interfaces.go#L34): Converts raw chain data to an intermediary form prepared for IPFS publishing. ([BTC](../../pkg/super_node/btc/converter.go), [ETH](../../pkg/super_node/eth/converter.go)).
* [IPLD Publisher](../pkg/super_node/shared/interfaces.go#L39): Publishes the converted data to IPFS, returning the resulting CIDs and associated metadata for indexing. ([BTC](../../pkg/super_node/btc/publisher.go), [ETH](../../pkg/super_node/eth/publisher.go)).
* [CID Indexer](../pkg/super_node/shared/interfaces.go#L44): Indexes CIDs in Postgres with their associated metadata. This metadata is chain-specific and selected based on utility. ([BTC](../../pkg/super_node/btc/indexer.go), [ETH](../../pkg/super_node/eth/indexer.go)).
* [CID Retriever](../pkg/super_node/shared/interfaces.go#L54): Retrieves CIDs from Postgres by searching against their associated metadata; used to look up data to serve API requests/subscriptions. ([BTC](../../pkg/super_node/btc/retriever.go), [ETH](../../pkg/super_node/eth/retriever.go)).
* [IPLD Fetcher](../pkg/super_node/shared/interfaces.go#L62): Fetches the IPLDs needed to service API requests/subscriptions from IPFS using the retrieved CIDs; can route through an IPFS block-exchange to search for objects that are not directly available. ([BTC](../../pkg/super_node/btc/ipld_fetcher.go), [ETH](../../pkg/super_node/eth/ipld_fetcher.go)).
* [Response Filterer](../pkg/super_node/shared/interfaces.go#L49): Filters converted data payloads served to API subscriptions according to the subscriber-provided parameters. ([BTC](../../pkg/super_node/btc/filterer.go), [ETH](../../pkg/super_node/eth/filterer.go)).
* [API](https://github.com/ethereum/go-ethereum/blob/master/rpc/types.go#L31): Exposes RPC methods for clients to interface with the data. Chain-specific APIs should aim to recapitulate as much of the native API as possible. ([VDB](../../pkg/super_node/api.go), [ETH](../../pkg/super_node/eth/api.go)).

Adapting the service for a new chain is done by creating underlying types that satisfy these interfaces for the specifics of that chain.

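As a rough illustration of what satisfying the interfaces for a new chain could look like, the sketch below pushes payloads through convert, publish, and index steps for an imaginary chain. The interface and type names mirror the components listed above, but the signatures, the `toyChain` type, the `process` helper, and the `"<height>:<body>"` payload format are all simplified assumptions for this example; the actual definitions are in `pkg/super_node/shared/interfaces.go`.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// Hypothetical, simplified stand-ins for the shared interfaces and types.
type RawChainData string // raw payload as received from a node

type ConvertedData struct {
	Height uint64
	Body   string
}

type CIDsForIndexing struct {
	Height uint64
	CID    string
}

type PayloadConverter interface {
	Convert(raw RawChainData) (ConvertedData, error)
}

type IPLDPublisher interface {
	Publish(data ConvertedData) (CIDsForIndexing, error)
}

type CIDIndexer interface {
	Index(cids CIDsForIndexing) error
}

// toyChain satisfies all three interfaces for an imaginary chain whose
// raw payloads look like "<height>:<body>".
type toyChain struct {
	index map[uint64]string // stands in for the Postgres index
}

func (t *toyChain) Convert(raw RawChainData) (ConvertedData, error) {
	parts := strings.SplitN(string(raw), ":", 2)
	if len(parts) != 2 {
		return ConvertedData{}, fmt.Errorf("malformed payload %q", raw)
	}
	height, err := strconv.ParseUint(parts[0], 10, 64)
	if err != nil {
		return ConvertedData{}, err
	}
	return ConvertedData{Height: height, Body: parts[1]}, nil
}

func (t *toyChain) Publish(data ConvertedData) (CIDsForIndexing, error) {
	// A real publisher would write IPLD blocks and return their CIDs;
	// here a placeholder string is derived from the body.
	return CIDsForIndexing{Height: data.Height, CID: "fake-cid-" + data.Body}, nil
}

func (t *toyChain) Index(cids CIDsForIndexing) error {
	t.index[cids.Height] = cids.CID
	return nil
}

// process pushes one payload through convert -> publish -> index.
func process(raw RawChainData, c PayloadConverter, p IPLDPublisher, i CIDIndexer) error {
	converted, err := c.Convert(raw)
	if err != nil {
		return fmt.Errorf("convert: %w", err)
	}
	cids, err := p.Publish(converted)
	if err != nil {
		return fmt.Errorf("publish: %w", err)
	}
	return i.Index(cids)
}

func main() {
	chain := &toyChain{index: make(map[uint64]string)}
	for _, raw := range []RawChainData{"1:genesis", "2:second"} {
		if err := process(raw, chain, chain, chain); err != nil {
			fmt.Println("error:", err)
			return
		}
	}
	fmt.Println(chain.index[2]) // prints "fake-cid-second"
}
```

Because `process` depends only on the interfaces, the same loop would work unchanged for any other type that implements them, which is the property the service relies on when a chain is added.
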
The service uses these interfaces to operate in any combination of three modes: `sync`, `serve`, and `backfill`.
* Sync: Streams raw chain data at the head, converts and publishes it to IPFS, and indexes the resulting set of CIDs in Postgres with useful metadata.
* BackFill: Automatically searches for and detects gaps in the DB; fetches, converts, publishes, and indexes the data to fill these gaps.
* Serve: Opens up IPC, HTTP, and WebSocket servers on top of the ipfs-blockchain-watcher DB and any concurrent sync and/or backfill processes.

These three modes are all operated through a single ipfs-blockchain-watcher command: `watch`

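The gap detection that BackFill performs can be pictured with a small in-memory sketch. The `Gap` type and `findGaps` helper are hypothetical and operate on a slice of already-indexed block heights; the service itself looks for gaps in the DB, so this only illustrates the idea.

```go
package main

import "fmt"

// Gap is an inclusive range of block heights missing from the index.
// Hypothetical type for illustration only.
type Gap struct {
	Start, Stop uint64
}

// findGaps returns the ranges absent from heights, which is assumed to be
// sorted in ascending order without duplicates.
func findGaps(heights []uint64) []Gap {
	var gaps []Gap
	for i := 1; i < len(heights); i++ {
		prev, cur := heights[i-1], heights[i]
		if cur > prev+1 {
			gaps = append(gaps, Gap{Start: prev + 1, Stop: cur - 1})
		}
	}
	return gaps
}

func main() {
	indexed := []uint64{1, 2, 3, 7, 8, 12}
	for _, g := range findGaps(indexed) {
		// Each gap would then be fetched, converted, published, and indexed.
		fmt.Printf("backfill %d-%d\n", g.Start, g.Stop)
	}
}
```

For the sample heights, the ranges reported are 4-6 and 9-11.
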
## Command

Usage: `./ipfs-blockchain-watcher watch --config={config.toml}`

Configuration can also be done through CLI options and/or environment variables.
CLI options can be found using `./ipfs-blockchain-watcher watch --help`.

## Config

Below is the set of universal config parameters for the ipfs-blockchain-watcher command, in .toml form, with the respective environment variables commented to the side.
These parameters must be set regardless of the chain type.

```toml
[database]
name = "vulcanize_public" # $DATABASE_NAME
hostname = "localhost" # $DATABASE_HOSTNAME
port = 5432 # $DATABASE_PORT
user = "vdbm" # $DATABASE_USER
password = "" # $DATABASE_PASSWORD

[ipfs]
path = "~/.ipfs" # $IPFS_PATH
mode = "direct" # $IPFS_MODE

[superNode]
chain = "bitcoin" # $SUPERNODE_CHAIN
server = true # $SUPERNODE_SERVER
ipcPath = "~/.vulcanize/vulcanize.ipc" # $SUPERNODE_IPC_PATH
wsPath = "127.0.0.1:8082" # $SUPERNODE_WS_PATH
httpPath = "127.0.0.1:8083" # $SUPERNODE_HTTP_PATH
sync = true # $SUPERNODE_SYNC
workers = 1 # $SUPERNODE_WORKERS
backFill = true # $SUPERNODE_BACKFILL
frequency = 45 # $SUPERNODE_FREQUENCY
batchSize = 1 # $SUPERNODE_BATCH_SIZE
batchNumber = 50 # $SUPERNODE_BATCH_NUMBER
timeout = 300 # $HTTP_TIMEOUT
validationLevel = 1 # $SUPERNODE_VALIDATION_LEVEL
```

Additional parameters need to be set depending on the specific chain.

For Bitcoin:

```toml
[bitcoin]
wsPath = "127.0.0.1:8332" # $BTC_WS_PATH
httpPath = "127.0.0.1:8332" # $BTC_HTTP_PATH
pass = "password" # $BTC_NODE_PASSWORD
user = "username" # $BTC_NODE_USER
nodeID = "ocd0" # $BTC_NODE_ID
clientName = "Omnicore" # $BTC_CLIENT_NAME
genesisBlock = "000000000019d6689c085ae165831e934ff763ae46a2a6c172b3f1b60a8ce26f" # $BTC_GENESIS_BLOCK
networkID = "0xD9B4BEF9" # $BTC_NETWORK_ID
```

For Ethereum:

```toml
[ethereum]
wsPath = "127.0.0.1:8546" # $ETH_WS_PATH
httpPath = "127.0.0.1:8545" # $ETH_HTTP_PATH
nodeID = "arch1" # $ETH_NODE_ID
clientName = "Geth" # $ETH_CLIENT_NAME
genesisBlock = "0xd4e56740f876aef8c010b86a40d5f56745a118d0906a34e69aec8c0db1cb8fa3" # $ETH_GENESIS_BLOCK
networkID = "1" # $ETH_NETWORK_ID
```

## Database

Currently, ipfs-blockchain-watcher persists all data to a single Postgres database. The migrations for this DB can be found [here](../../db/migrations).
Chain-specific data is populated under a chain-specific schema (e.g. `eth` and `btc`), while shared data, such as the IPFS blocks table, is populated under the `public` schema.
Subsequent watchers which act on the raw chain data should build and populate their own schemas or separate databases entirely.

In the future, we will be moving to a foreign-table-based architecture wherein a single DB is used for shared data while each watcher uses its own database and accesses and acts on the shared data through foreign tables. Isolating watchers to their own databases will prevent complications and conflicts between watcher DB migrations.

## APIs

ipfs-blockchain-watcher provides multiple types of APIs by which to interface with its data.
More detailed information on the APIs can be found [here](apis.md).

## Resync

A separate command `resync` is available for directing the resyncing of data within specified ranges.
This is useful if we want to re-validate a range of data using a new source or clean out bad/deprecated data.
More detailed information on this command can be found [here](resync.md).

## IPFS Considerations

Currently, the IPLD Publisher and Fetcher can either use internalized IPFS processes which interface with a local IPFS repository, or interface directly with the backing Postgres database.
Both options circumvent the need to run a full IPFS daemon with a [go-ipld-eth](https://github.com/ipfs/go-ipld-eth) or [go-ipld-btc](https://github.com/ipld/go-ipld-btc) plugin.
The former approach can lead to lock contention on the IPFS repo if another IPFS process is configured and running at the same $IPFS_PATH; it also requires a locally configured IPFS repository. The latter bypasses the need for a configured IPFS repository/$IPFS_PATH and allows all Postgres write operations at a given block height to occur in a single transaction. The only disadvantage is that, by not moving through an IPFS node intermediary, we lose the direct ability to reach out to the block exchange for data we do not have locally.

Once go-ipld-eth and go-ipld-btc have been updated to work with a modern version of PG-IPFS, an additional option will be provided to direct all publishing and fetching of IPLD objects through a remote IPFS daemon.