Statediffing geth
* Write state diff to CSV (#2)
* port statediff from https://github.com/jpmorganchase/quorum/blob/9b7fd9af8082795eeeb6863d9746f12b82dd5078/statediff/statediff.go; minor fixes
* integrating state diff extracting, building, and persisting into geth processes
* work towards persisting created statediffs in ipfs; based off github.com/vulcanize/eth-block-extractor
* Add a state diff service
* Remove diff extractor from blockchain
* Update imports
* Move statediff on/off check to geth cmd config
* Update starting state diff service
* Add debugging logs for creating diff
* Add statediff extractor and builder tests and small refactoring
* Start to write statediff to a CSV
* Restructure statediff directory
* Pull CSV publishing methods into their own file
* Reformatting due to go fmt
* Add gomega to vendor dir
* Remove testing focuses
* Update statediff tests to use golang test pkg
instead of ginkgo
- builder_test
- extractor_test
- publisher_test
* Use hexutil.Encode instead of deprecated common.ToHex
* Remove OldValue from DiffBigInt and DiffUint64 fields
* Update builder test
* Remove old storage value from updated accounts
* Remove old values from created/deleted accounts
* Update publisher to account for only storing current account values
* Update service loop and fetching previous block
* Update testing
- remove statediff ginkgo test suite file
- move mocks to their own dir
* Updates per go fmt
* Updates to tests
* Pass statediff mode and path in through cli
* Return filename from publisher
* Remove some duplication in builder
* Remove code field from state diff output
this is the contract byte code, and it can still be obtained by querying
the db by the codeHash
* Consolidate acct diff structs for updated & updated/deleted accts
* Include block number in csv filename
* Clean up error logging
* Cleanup formatting, spelling, etc
* Address PR comments
* Add contract address and storage value to csv
* Refactor accumulating account row in csv publisher
* Add DiffStorage struct
* Add storage key to csv
* Address PR comments
* Fix publisher to include rows for accounts that don't have store updates
* Update builder test after merging in release/1.8
* Update test contract to include storage on contract intialization
- so that we're able to test that storage diffing works for created and
deleted accounts (not just updated accounts).
* Factor out a common trie iterator method in builder
* Apply goimports to statediff
* Apply gosimple changes to statediff
* Gracefully exit geth command(#4)
* Statediff for full node (#6)
* Open a trie from the in-memory database
* Use a node's LeafKey as an identifier instead of the address
It was proving difficult to find look the address up from a given path
with a full node (sometimes the value wouldn't exist in the disk db).
So, instead, for now we are using the node's LeafKey with is a Keccak256
hash of the address, so if we know the address we can figure out which
LeafKey it matches up to.
* Make sure that statediff has been processed before pruning
* Use blockchain stateCache.OpenTrie for storage diffs
* Clean up log lines and remove unnecessary fields from builder
* Apply go fmt changes
* Add a sleep to the blockchain test
* Address PR comments
* Address PR comments
* refactoring/reorganizing packages
* refactoring statediff builder and types and adjusted to relay proofs and paths (still need to make this optional)
* refactoring state diff service and adding api which allows for streaming state diff payloads over an rpc websocket subscription
* make proofs and paths optional + compress service loop into single for loop (may be missing something here)
* option to process intermediate nodes
* make state diff rlp serializable
* cli parameter to limit statediffing to select account addresses + test
* review fixes and fixes for issues ran into in integration
* review fixes; proper method signature for api; adjust service so that statediff processing is halted/paused until there is at least one subscriber listening for the results
* adjust buffering to improve stability; doc.go; fix notifier
err handling
* relay receipts with the rest of the data + review fixes/changes
* rpc method to get statediff at specific block; requires archival node or the block be within the pruning range
* review fixes
* fixes after rebase
* statediff verison meta
* fix linter issues
* include total difficulty to the payload
* fix state diff builder: emit actual leaf nodes instead of value nodes; diff on the leaf not on the value; emit correct path for intermediate nodes
* adjust statediff builder tests to changes and extend to test intermediate nodes; golint
* add genesis block to test; handle block 0 in StateDiffAt
* rlp files for mainnet blocks 0-3, for tests
* builder test on mainnet blocks
* common.BytesToHash(path) => crypto.Keaccak256(hash) in builder; BytesToHash produces same hash for e.g. []byte{} and []byte{\x00} - prefix \x00 steps are inconsequential to the hash result
* complete tests for early mainnet blocks
* diff type for representing deleted accounts
* fix builder so that we handle account deletions properly and properly diff storage when an account is moved to a new path; update params
* remove cli params; moving them to subscriber defined
* remove unneeded bc methods
* update service and api; statediffing params are now defined by user through api rather than by service provider by cli
* update top level tests
* add ability to watch specific storage slots (leaf keys) only
* comments; explain logic
* update mainnet blocks test
* update api_test.go
* storage leafkey filter test
* cleanup chain maker
* adjust chain maker for tests to add an empty account in block1 and switch to EIP-158 afterwards (now we just need to generate enough accounts until one causes the empty account to be touched and removed post-EIP-158 so we can simulate and test that process...); also added 2 new blocks where more contract storage is set and old slots are set to zero so they are removed so we can test that
* found an account whose creation causes the empty account to be moved to a new path; this should count as 'touching; the empty account and cause it to be removed according to eip-158... but it doesn't
* use new contract in unit tests that has self-destruct ability, so we can test eip-158 since simply moving an account to new path doesn't count as 'touchin' it
* handle storage deletions
* tests for eip-158 account removal and storage value deletions; there is one edge case left to test where we remove 1 account when only two exist such that the remaining account is moved up and replaces the root branch node
* finish testing known edge cases
* add endpoint to fetch all state and storage nodes at a given blockheight; useful for generating a recent atate cache/snapshot that we can diff forward from rather than needing to collect all diffs from genesis
* test for state trie builder
* minor changes/fixes
* update version meta
* if statediffing is on, lock tries in triedb until the statediffing service signals they are done using them
* update version meta
* fix mock blockchain; golint; bump patch
* increase maxRequestContentLength; bump patch
* log the sizes of the state objects we are sending
* CI build (#20)
* CI: run build on PR and on push to master
* CI: debug building geth
* CI: fix coping file
* CI: fix coping file v2
* CI: temporary upload file to release asset
* CI: get release upload_url by tag, upload asset to current relase
* CI: fix tag name
* fix ci build on statediff_at_anyblock-1.9.11 branch
* fix publishing assets in release
* bump version meta
* use context deadline for timeout in eth_call
* collect and emit codehash=>code mappings for state objects
* subscription endpoint for retrieving all the codehash=>code mappings that exist at provided height
* bump version meta
* Implement WriteStateDiffAt
* Writes state diffs directly to postgres
* Adds CLI flags to configure PG
* Refactors builder output with callbacks
* Copies refactored postgres handling code from ipld-eth-indexer
* rename PostgresCIDWriter.{index->upsert}*
* less ambiguous
* go.mod update
* rm unused
* cleanup
* output code & codehash iteratively
* had to rf some types for this
* prometheus metrics output
* duplicate recent eth-indexer changes
* migrations and metrics...
* [wip] prom.Init() here? another CLI flag?
* cleanup
* tidy & DRY
* statediff WriteLoop service + CLI flag
* [wip] update test mocks
* todo - do something meaningful to test write loop
* logging
* use geth log
* port tests to go testing
* drop ginkgo/gomega
* fix and cleanup tests
* fail before defer statement
* delete vendor/ dir
* unused
* bump version meta
* fixes after rebase onto 1.9.23
* bump version meta
* fix API registration
* bump version meta
* use golang 1.15.5 version (#34)
* bump version meta; add 0.0.11 branch to actions
* bump version meta; update github actions workflows
* statediff: refactor metrics
* Remove redundant statediff/indexer/prom tooling and use existing
prometheus integration.
* cleanup
* "indexer" namespace for metrics
* add reporting loop for db metrics
* doc
* metrics for statediff stats
* metrics namespace/subsystem = statediff/{indexer,service}
* statediff: use a worker pool (for direct writes)
* fix test
* fix chain event subscription
* log tweaks
* func name
* unused import
* intermediate chain event channel for metrics
* cleanup
* bump version meta
* update github actions; linting
* add poststate and status to receipt ipld indexes
* bump statediff version
* stateDiffFor endpoints for fetching or writing statediff object by blockhash; bump statediff version
* fixes after rebase on to v1.10.1
* update github actions and version meta; go fmt
* add leaf key to removed 'nodes'
* include Postgres migrations and schema
* service documentation
* touching up
2019-01-28 21:31:01 +00:00
# Statediff
This package provides an auxiliary service that asynchronously processes state diff objects from chain events,
either relaying the state objects to RPC subscribers or writing them directly to Postgres as IPLD objects.
It also exposes RPC endpoints for fetching or writing to Postgres the state diff at a specific block height
or for a specific block hash, this operates on historical block and state data and so depends on a complete state archive.
Data is emitted in this differential format in order to make it feasible to IPLD-ize and index the *entire* Ethereum state
(including intermediate state and storage trie nodes). If this state diff process is ran continuously from genesis,
the entire state at any block can be materialized from the cumulative differentials up to that point.
## Statediff object
A state diff `StateObject` is the collection of all the state and storage trie nodes that have been updated in a given block.
For convenience, we also associate these nodes with the block number and hash, and optionally the set of code hashes and code for any
contracts deployed in this block.
A complete state diff `StateObject` will include all state and storage intermediate nodes, which is necessary for generating proofs and for
traversing the tries.
```go
// StateObject is a collection of state (and linked storage nodes) as well as the associated block number, block hash,
// and a set of code hashes and their code
type StateObject struct {
BlockNumber *big.Int `json:"blockNumber" gencodec:"required"`
BlockHash common.Hash `json:"blockHash" gencodec:"required"`
Nodes []StateNode `json:"nodes" gencodec:"required"`
CodeAndCodeHashes []CodeAndCodeHash `json:"codeMapping"`
}
// StateNode holds the data for a single state diff node
type StateNode struct {
NodeType NodeType `json:"nodeType" gencodec:"required"`
Path []byte `json:"path" gencodec:"required"`
NodeValue []byte `json:"value" gencodec:"required"`
StorageNodes []StorageNode `json:"storage"`
LeafKey []byte `json:"leafKey"`
}
// StorageNode holds the data for a single storage diff node
type StorageNode struct {
NodeType NodeType `json:"nodeType" gencodec:"required"`
Path []byte `json:"path" gencodec:"required"`
NodeValue []byte `json:"value" gencodec:"required"`
LeafKey []byte `json:"leafKey"`
}
// CodeAndCodeHash struct for holding codehash => code mappings
// we can't use an actual map because they are not rlp serializable
type CodeAndCodeHash struct {
Hash common.Hash `json:"codeHash"`
Code []byte `json:"code"`
}
```
These objects are packed into a `Payload` structure which can additionally associate the `StateObject`
with the block (header, uncles, and transactions), receipts, and total difficulty.
This `Payload` encapsulates all of the differential data at a given block, and allows us to index the entire Ethereum data structure
as hash-linked IPLD objects.
```go
// Payload packages the data to send to state diff subscriptions
type Payload struct {
BlockRlp []byte `json:"blockRlp"`
TotalDifficulty *big.Int `json:"totalDifficulty"`
ReceiptsRlp []byte `json:"receiptsRlp"`
StateObjectRlp []byte `json:"stateObjectRlp" gencodec:"required"`
encoded []byte
err error
}
```
## Usage
This state diffing service runs as an auxiliary service concurrent to the regular syncing process of the geth node.
### CLI configuration
This service introduces a CLI flag namespace `statediff`
`--statediff` flag is used to turn on the service
`--statediff.writing` is used to tell the service to write state diff objects it produces from synced ChainEvents directly to a configured Postgres database
2021-04-15 18:10:49 +00:00
`--statediff.workers` is used to set the number of concurrent workers to process state diff objects and write them into the database
Statediffing geth
* Write state diff to CSV (#2)
* port statediff from https://github.com/jpmorganchase/quorum/blob/9b7fd9af8082795eeeb6863d9746f12b82dd5078/statediff/statediff.go; minor fixes
* integrating state diff extracting, building, and persisting into geth processes
* work towards persisting created statediffs in ipfs; based off github.com/vulcanize/eth-block-extractor
* Add a state diff service
* Remove diff extractor from blockchain
* Update imports
* Move statediff on/off check to geth cmd config
* Update starting state diff service
* Add debugging logs for creating diff
* Add statediff extractor and builder tests and small refactoring
* Start to write statediff to a CSV
* Restructure statediff directory
* Pull CSV publishing methods into their own file
* Reformatting due to go fmt
* Add gomega to vendor dir
* Remove testing focuses
* Update statediff tests to use golang test pkg
instead of ginkgo
- builder_test
- extractor_test
- publisher_test
* Use hexutil.Encode instead of deprecated common.ToHex
* Remove OldValue from DiffBigInt and DiffUint64 fields
* Update builder test
* Remove old storage value from updated accounts
* Remove old values from created/deleted accounts
* Update publisher to account for only storing current account values
* Update service loop and fetching previous block
* Update testing
- remove statediff ginkgo test suite file
- move mocks to their own dir
* Updates per go fmt
* Updates to tests
* Pass statediff mode and path in through cli
* Return filename from publisher
* Remove some duplication in builder
* Remove code field from state diff output
this is the contract byte code, and it can still be obtained by querying
the db by the codeHash
* Consolidate acct diff structs for updated & updated/deleted accts
* Include block number in csv filename
* Clean up error logging
* Cleanup formatting, spelling, etc
* Address PR comments
* Add contract address and storage value to csv
* Refactor accumulating account row in csv publisher
* Add DiffStorage struct
* Add storage key to csv
* Address PR comments
* Fix publisher to include rows for accounts that don't have store updates
* Update builder test after merging in release/1.8
* Update test contract to include storage on contract intialization
- so that we're able to test that storage diffing works for created and
deleted accounts (not just updated accounts).
* Factor out a common trie iterator method in builder
* Apply goimports to statediff
* Apply gosimple changes to statediff
* Gracefully exit geth command(#4)
* Statediff for full node (#6)
* Open a trie from the in-memory database
* Use a node's LeafKey as an identifier instead of the address
It was proving difficult to find look the address up from a given path
with a full node (sometimes the value wouldn't exist in the disk db).
So, instead, for now we are using the node's LeafKey with is a Keccak256
hash of the address, so if we know the address we can figure out which
LeafKey it matches up to.
* Make sure that statediff has been processed before pruning
* Use blockchain stateCache.OpenTrie for storage diffs
* Clean up log lines and remove unnecessary fields from builder
* Apply go fmt changes
* Add a sleep to the blockchain test
* Address PR comments
* Address PR comments
* refactoring/reorganizing packages
* refactoring statediff builder and types and adjusted to relay proofs and paths (still need to make this optional)
* refactoring state diff service and adding api which allows for streaming state diff payloads over an rpc websocket subscription
* make proofs and paths optional + compress service loop into single for loop (may be missing something here)
* option to process intermediate nodes
* make state diff rlp serializable
* cli parameter to limit statediffing to select account addresses + test
* review fixes and fixes for issues ran into in integration
* review fixes; proper method signature for api; adjust service so that statediff processing is halted/paused until there is at least one subscriber listening for the results
* adjust buffering to improve stability; doc.go; fix notifier
err handling
* relay receipts with the rest of the data + review fixes/changes
* rpc method to get statediff at specific block; requires archival node or the block be within the pruning range
* review fixes
* fixes after rebase
* statediff verison meta
* fix linter issues
* include total difficulty to the payload
* fix state diff builder: emit actual leaf nodes instead of value nodes; diff on the leaf not on the value; emit correct path for intermediate nodes
* adjust statediff builder tests to changes and extend to test intermediate nodes; golint
* add genesis block to test; handle block 0 in StateDiffAt
* rlp files for mainnet blocks 0-3, for tests
* builder test on mainnet blocks
* common.BytesToHash(path) => crypto.Keaccak256(hash) in builder; BytesToHash produces same hash for e.g. []byte{} and []byte{\x00} - prefix \x00 steps are inconsequential to the hash result
* complete tests for early mainnet blocks
* diff type for representing deleted accounts
* fix builder so that we handle account deletions properly and properly diff storage when an account is moved to a new path; update params
* remove cli params; moving them to subscriber defined
* remove unneeded bc methods
* update service and api; statediffing params are now defined by user through api rather than by service provider by cli
* update top level tests
* add ability to watch specific storage slots (leaf keys) only
* comments; explain logic
* update mainnet blocks test
* update api_test.go
* storage leafkey filter test
* cleanup chain maker
* adjust chain maker for tests to add an empty account in block1 and switch to EIP-158 afterwards (now we just need to generate enough accounts until one causes the empty account to be touched and removed post-EIP-158 so we can simulate and test that process...); also added 2 new blocks where more contract storage is set and old slots are set to zero so they are removed so we can test that
* found an account whose creation causes the empty account to be moved to a new path; this should count as 'touching; the empty account and cause it to be removed according to eip-158... but it doesn't
* use new contract in unit tests that has self-destruct ability, so we can test eip-158 since simply moving an account to new path doesn't count as 'touchin' it
* handle storage deletions
* tests for eip-158 account removal and storage value deletions; there is one edge case left to test where we remove 1 account when only two exist such that the remaining account is moved up and replaces the root branch node
* finish testing known edge cases
* add endpoint to fetch all state and storage nodes at a given blockheight; useful for generating a recent atate cache/snapshot that we can diff forward from rather than needing to collect all diffs from genesis
* test for state trie builder
* minor changes/fixes
* update version meta
* if statediffing is on, lock tries in triedb until the statediffing service signals they are done using them
* update version meta
* fix mock blockchain; golint; bump patch
* increase maxRequestContentLength; bump patch
* log the sizes of the state objects we are sending
* CI build (#20)
* CI: run build on PR and on push to master
* CI: debug building geth
* CI: fix coping file
* CI: fix coping file v2
* CI: temporary upload file to release asset
* CI: get release upload_url by tag, upload asset to current relase
* CI: fix tag name
* fix ci build on statediff_at_anyblock-1.9.11 branch
* fix publishing assets in release
* bump version meta
* use context deadline for timeout in eth_call
* collect and emit codehash=>code mappings for state objects
* subscription endpoint for retrieving all the codehash=>code mappings that exist at provided height
* bump version meta
* Implement WriteStateDiffAt
* Writes state diffs directly to postgres
* Adds CLI flags to configure PG
* Refactors builder output with callbacks
* Copies refactored postgres handling code from ipld-eth-indexer
* rename PostgresCIDWriter.{index->upsert}*
* less ambiguous
* go.mod update
* rm unused
* cleanup
* output code & codehash iteratively
* had to rf some types for this
* prometheus metrics output
* duplicate recent eth-indexer changes
* migrations and metrics...
* [wip] prom.Init() here? another CLI flag?
* cleanup
* tidy & DRY
* statediff WriteLoop service + CLI flag
* [wip] update test mocks
* todo - do something meaningful to test write loop
* logging
* use geth log
* port tests to go testing
* drop ginkgo/gomega
* fix and cleanup tests
* fail before defer statement
* delete vendor/ dir
* unused
* bump version meta
* fixes after rebase onto 1.9.23
* bump version meta
* fix API registration
* bump version meta
* use golang 1.15.5 version (#34)
* bump version meta; add 0.0.11 branch to actions
* bump version meta; update github actions workflows
* statediff: refactor metrics
* Remove redundant statediff/indexer/prom tooling and use existing
prometheus integration.
* cleanup
* "indexer" namespace for metrics
* add reporting loop for db metrics
* doc
* metrics for statediff stats
* metrics namespace/subsystem = statediff/{indexer,service}
* statediff: use a worker pool (for direct writes)
* fix test
* fix chain event subscription
* log tweaks
* func name
* unused import
* intermediate chain event channel for metrics
* cleanup
* bump version meta
* update github actions; linting
* add poststate and status to receipt ipld indexes
* bump statediff version
* stateDiffFor endpoints for fetching or writing statediff object by blockhash; bump statediff version
* fixes after rebase on to v1.10.1
* update github actions and version meta; go fmt
* add leaf key to removed 'nodes'
* include Postgres migrations and schema
* service documentation
* touching up
2019-01-28 21:31:01 +00:00
`--statediff.db` is the connection string for the Postgres database to write to
`--statediff.dbnodeid` is the node id to use in the Postgres database
`--statediff.dbclientname` is the client name to use in the Postgres database
The service can only operate in full sync mode (`--syncmode=full`), but only the historical RPC endpoints require an archive node (`--gcmode=archive`)
e.g.
`
./build/bin/geth --syncmode=full --gcmode=archive --statediff --statediff.writing --statediff.db=postgres://localhost:5432/vulcanize_testing?sslmode=disable --statediff.dbnodeid={nodeId} --statediff.dbclientname={dbClientName}
`
### RPC endpoints
The state diffing service exposes both a WS subscription endpoint, and a number of HTTP unary endpoints.
Each of these endpoints requires a set of parameters provided by the caller
```go
// Params is used to carry in parameters from subscribing/requesting clients configuration
type Params struct {
IntermediateStateNodes bool
IntermediateStorageNodes bool
IncludeBlock bool
IncludeReceipts bool
IncludeTD bool
IncludeCode bool
WatchedAddresses []common.Address
WatchedStorageSlots []common.Hash
}
```
Using these params we can tell the service whether to include state and/or storage intermediate nodes; whether
to include the associated block (header, uncles, and transactions); whether to include the associated receipts;
whether to include the total difficulty for this block; whether to include the set of code hashes and code for
contracts deployed in this block; whether to limit the diffing process to a list of specific addresses; and/or
whether to limit the diffing process to a list of specific storage slot keys.
#### Subscription endpoint
A websocket supporting RPC endpoint is exposed for subscribing to state diff `StateObjects` that come off the head of the chain while the geth node syncs.
```go
// Stream is a subscription endpoint that fires off state diff payloads as they are created
Stream(ctx context.Context, params Params) (*rpc.Subscription, error)
```
To expose this endpoint the node needs to have the websocket server turned on (`--ws`),
and the `statediff` namespace exposed (`--ws.api=statediff`).
Go code subscriptions to this endpoint can be created using the `rpc.Client.Subscribe()` method,
with the "statediff" namespace, a `statediff.Payload` channel, and the name of the statediff api's rpc method: "stream".
e.g.
```go
cli, err := rpc.Dial("ipcPathOrWsURL")
if err != nil {
// handle error
}
stateDiffPayloadChan := make(chan statediff.Payload, 20000)
methodName := "stream"
params := statediff.Params{
IncludeBlock: true,
IncludeTD: true,
IncludeReceipts: true,
IntermediateStorageNodes: true,
IntermediateStateNodes: true,
}
rpcSub, err := cli.Subscribe(context.Background(), statediff.APIName, stateDiffPayloadChan, methodName, params)
if err != nil {
// handle error
}
for {
select {
case stateDiffPayload := < - stateDiffPayloadChan:
// process the payload
case err := < - rpcSub . Err ( ) :
// handle rpc subscription error
}
}
```
#### Unary endpoints
The service also exposes unary RPC endpoints for retrieving the state diff `StateObject` for a specific block height/hash.
```go
// StateDiffAt returns a state diff payload at the specific blockheight
StateDiffAt(ctx context.Context, blockNumber uint64, params Params) (*Payload, error)
// StateDiffFor returns a state diff payload for the specific blockhash
StateDiffFor(ctx context.Context, blockHash common.Hash, params Params) (*Payload, error)
```
To expose this endpoint the node needs to have the HTTP server turned on (`--http`),
and the `statediff` namespace exposed (`--http.api=statediff`).
### Direct indexing into Postgres
If `--statediff.writing` is set, the service will convert the state diff `StateObject` data into IPLD objects, persist them directly to Postgres,
and generate secondary indexes around the IPLD data.
The schema and migrations for this Postgres database are provided in `statediff/db/` .
#### Postgres setup
We use [pressly/goose ](https://github.com/pressly/goose ) as our Postgres migration manager.
You can also load the Postgres schema directly into a database using
`psql database_name < schema.sql`
This will only work on a version 12.4 Postgres database.
#### Schema overview
Our Postgres schemas are built around a single IPFS backing Postgres IPLD blockstore table (`public.blocks`) that conforms with [go-ds-sql ](https://github.com/ipfs/go-ds-sql/blob/master/postgres/postgres.go ).
All IPLD objects are stored in this table, where `key` is the blockstore-prefixed multihash key for the IPLD object and `data` contains
the bytes for the IPLD block (in the case of all Ethereum IPLDs, this is the RLP byte encoding of the Ethereum object).
The IPLD objects in this table can be traversed using an IPLD DAG interface, but since this table only maps multihash to raw IPLD object
it is not particularly useful for searching through the data by looking up Ethereum objects by their constituent fields
(e.g. by block number, tx source/recipient, state/storage trie node path). To improve the accessibility of these objects
we create an Ethereum [advanced data layout ](https://github.com/ipld/specs#schemas-and-advanced-data-layouts ) (ADL) by generating secondary
indexes on top of the raw IPLDs in other Postgres tables.
These secondary index tables fall under the `eth` schema and follow an `{objectType}_cids` naming convention.
These tables provide a view into individual fields of the underlying Ethereum IPLD objects, allowing lookups on these fields, and reference the raw IPLD objects stored in `public.blocks`
by foreign keys to their multihash keys.
Additionally, these tables maintain the hash-linked nature of Ethereum objects to one another. E.g. a storage trie node entry in the `storage_cids`
table contains a `state_id` foreign key which references the `id` for the `state_cids` entry that contains the state leaf node for the contract that storage node belongs to,
and in turn that `state_cids` entry contains a `header_id` foreign key which references the `id` of the `header_cids` entry that contains the header for the block these state and storage nodes were updated (diffed).
### Optimization
On mainnet this process is extremely IO intensive and requires significant resources to allow it to keep up with the head of the chain.
The state diff processing time for a specific block is dependent on the number and complexity of the state changes that occur in a block and
the number of updated state nodes that are available in the in-memory cache vs must be retrieved from disc.
If memory permits, one means of improving the efficiency of this process is to increase the in-memory trie cache allocation.
This can be done by increasing the overall `--cache` allocation and/or by increasing the % of the cache allocated to trie
usage with `--cache.trie` .