Storage nodes are not being marked as removed when the contract is destroyed. #141

Closed
opened 2022-03-10 03:27:44 +00:00 by i-norden · 3 comments
Member

Discovered by Ashwin and his team.

This causes a bug in ipld-eth-server:

If a contract deployed at some address addr with a non-empty storage trie s is destroyed at some block n, and a contract (the same or a different one) is then redeployed at addr (using the CREATE2 opcode) at some block m, the [query](https://github.com/vulcanize/ipld-eth-server/blob/5d86b6e029ce04b0c0fc6295bf0ec553c4c39f1e/pkg/eth/ipld_retriever.go#L140) currently used in [RetrieveStorageAtByAddressAndStorageSlotAndBlockHash](https://github.com/vulcanize/ipld-eth-server/blob/5d86b6e029ce04b0c0fc6295bf0ec553c4c39f1e/pkg/eth/ipld_retriever.go#L483), when performed at some block m+x (where x >= 0), will behave as if the new contract inherited s as its initial state.

In particular, if the specific storage leaf key being queried doesn't exist in the redeployed contract's storage trie at m+x but it did exist in s, the query will incorrectly return the value held at that key in s instead of an empty value.
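To make the failure mode concrete, a simplified sketch of the kind of lookup that exhibits it is shown below. It is not the exact query in ipld_retriever.go; the storage_leaf_key and cid columns are assumed here, while the other tables and columns are the ones used in the queries later in this issue.

```sql
-- Simplified sketch of the failing lookup (not the production query).
-- $1 = state_leaf_key (keccak of addr), $2 = storage_leaf_key (keccak of the slot),
-- $3 = block number of the queried block hash (i.e. of block m+x).
SELECT storage_cids.cid, block_number
FROM eth.storage_cids
INNER JOIN eth.state_cids
  ON (storage_cids.state_path = state_cids.state_path)
  AND (storage_cids.header_id = state_cids.header_id)
INNER JOIN eth.header_cids ON (state_cids.header_id = header_cids.block_hash)
WHERE state_cids.state_leaf_key = $1
  AND storage_cids.storage_leaf_key = $2
  AND block_number <= $3
ORDER BY block_number DESC
LIMIT 1;
-- If the slot was last written before block n and never touched again, the newest
-- matching record is still the one from s, so its stale value is returned.
```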

Fixes

During statediffing

This could be fixed on the indexing end by adjusting the statediffing code to iterate s and insert a "removed" node-type entry into eth.storage_cids at block n for every node in the trie when generating the statediff between n-1 and n. This would be done [here](https://github.com/vulcanize/go-ethereum/blob/v1.10.16-statediff-v3/statediff/builder.go#L402). Having to iterate the entire storage trie is potentially very expensive, and this wouldn't fix the data that has already been processed into Postgres.
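For reference, each entry the adjusted builder would emit looks roughly like the row below; the column list is trimmed to the columns that appear elsewhere in this issue, and a real row would also carry the CID/multihash fields.

```sql
-- Illustration only (placeholders, not literal values): one such row per node in s,
-- written as part of the statediff for block n. node_type = 3 denotes "removed".
INSERT INTO eth.storage_cids (header_id, state_path, storage_path, node_type)
VALUES ('<hash of block n>', '<state path of the addr leaf>', '<storage_path of the node>', 3);
```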

Adapt the query

Alternatively, the query used here could be adjusted to account for this issue.

  1. Create a stored function (e.g. was_state_leaf_ever_removed) for checking whether a node has ever been removed for a specific state_leaf_key, returning the block height(s) it was removed at if so (see the sketch after this list).
  2. Create another stored function (e.g. retrieve_storage_by_address_and_storage_slot_and_blockHash) that first calls was_state_leaf_ever_removed to check whether the contract's state leaf was ever removed. If it wasn't, it proceeds with the current access pattern. If it was, it finds the height at which it was (most recently) removed and limits its query to eth.storage_cids records above this block height.
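A minimal sketch of the first helper, using only the tables and columns already referenced in this issue and assuming block_number is a BIGINT; the real function would be added via an ipld-eth-db migration:

```sql
-- Returns the block height(s) at which the given state_leaf_key was removed,
-- newest first; an empty result means it was never removed.
CREATE OR REPLACE FUNCTION was_state_leaf_ever_removed(key VARCHAR) RETURNS SETOF BIGINT AS $$
  SELECT header_cids.block_number
  FROM eth.state_cids
  INNER JOIN eth.header_cids ON (state_cids.header_id = header_cids.block_hash)
  WHERE state_cids.state_leaf_key = key
    AND state_cids.node_type = 3
  ORDER BY header_cids.block_number DESC;
$$ LANGUAGE SQL STABLE;
```

The second function would call this first and, when a removal height comes back, add a block_number > (that height) bound to the existing storage lookup.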

This would add additional overhead to the query, and it only fixes this symptom, not the underlying issue. The same issue will produce a similar symptom when, for example, trying to build a new storage snapshot from a previous snapshot plus all the interim diffs, and we would require an adapted query for that case as well.

Post-processing of data in Postgres

This issue could also be fixed by post-processing the data already in Postgres; the algorithm would look like:

  1. Query the database to find "Removed" node-type eth.state_cids records that correspond to a contract destruction
SELECT state_cids.*, block_number
FROM eth.state_cids
INNER JOIN eth.header_cids
ON (state_cids.header_id = header_cids.block_hash)
WHERE node_type = 3
AND state_leaf_key != '';

That gets you all removed leaf nodes and the block heights they were removed at, but that includes EOAs and contracts.

To figure out which of the removed leaves contained contracts, you need to find the latest non-removed-type entry for those state_leaf_keys and check whether it holds a non-"zero" storage root (the "zero" value of the root is keccak256(rlp([]byte{}))).
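A sketch of that check for a single removed key, assuming the account fields (including storage_root) live in a table along the lines of eth.state_accounts keyed by header_id and state_path, as in the v3 schema:

```sql
-- $1 = state_leaf_key from step 1, $2 = the block height at which it was removed.
-- Returns a row only if the latest pre-removal entry held a non-empty storage root.
SELECT * FROM (
  SELECT DISTINCT ON (state_cids.state_leaf_key)
         state_cids.state_leaf_key, state_accounts.storage_root, block_number
  FROM eth.state_cids
  INNER JOIN eth.state_accounts
    ON (state_accounts.header_id = state_cids.header_id)
    AND (state_accounts.state_path = state_cids.state_path)
  INNER JOIN eth.header_cids ON (state_cids.header_id = header_cids.block_hash)
  WHERE state_cids.state_leaf_key = $1
    AND state_cids.node_type != 3
    AND block_number < $2
  ORDER BY state_cids.state_leaf_key, block_number DESC
) latest
-- keccak256(rlp([]byte{})); adjust the encoding to match how storage_root is stored
WHERE latest.storage_root != '0x56e81f171bcc55a6ff8345e692c0f86e5b48e01b996cadc001622fb5e363b421';
```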

  2. Find the latest eth.storage_cids record at every storage_path in the trie for those contracts
SELECT DISTINCT ON (storage_path) storage_cids.*
FROM eth.storage_cids 
INNER JOIN eth.state_cids ON (storage_cids.state_path = state_cids.state_path) AND (storage_cids.header_id = state_cids.header_id)
INNER JOIN eth.header_cids ON (state_cids.header_id = header_cids.block_hash)
WHERE state_cids.state_leaf_key = $1
AND storage_cids.node_type != 3 -- can ignore nodes whose latest state is already "removed"
AND block_number < $2
ORDER BY storage_path, block_number DESC;

$1 = state_leaf_key for the removed contract, found in step 1
$2 = block_number at which the contract was destroyed, found in step 1

  3. At the block height at which the corresponding contract was destroyed, create new eth.storage_cids records with node_type = 3 for all of the records found in the above query (sketched below).
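A sketch of that insert, assuming the step-2 results have been staged in a table named latest_storage (hypothetical), and omitting the CID/multihash columns that a real row would also need (populated with whatever values the indexer already writes for removed nodes):

```sql
-- $1 = header_id (block hash) of the block at which the contract was destroyed.
-- node_type = 3 marks each surviving storage path as removed from that height onward.
INSERT INTO eth.storage_cids (header_id, state_path, storage_path, node_type)
SELECT $1, ls.state_path, ls.storage_path, 3
FROM latest_storage ls;
```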

Proposal

Make the statediffing updates described in the first approach to fix newly indexed data, and use the post-processing approach described in the third to fix the existing data in Postgres.

Contract destructions aren't very frequent, so the performance impact on the head-tracking process should be minimal in aggregate, though it could introduce significant latency when processing the specific statediffs in which large contract destructions do occur. For the same reason, the historical post-processing required would be sparse, and with the index on node_type it should be fast to find the (relatively few) contracts we need to fix the issue for.

Author
Member

Some notes from discussion today:

Ashwin and his team have identified that the problem described [here](https://github.com/vulcanize/go-ethereum/pull/210#discussion_r831633102) extends to the storage trie. To alleviate this, we need to introduce a new "moved" node type for representing, in our Postgres DB, leaf nodes that have been moved to a new path vs leaf nodes that have been removed altogether.

prathamesh0 commented 2022-05-24 11:14:00 +00:00 (Migrated from github.com)

Tasks:

  • [ ] Fixes in the statediffing code
    • [x] Implement changes
    • [x] Manual testing
    • [x] Unit tests
    • [ ] Integration tests for scenario described in [comment](https://github.com/vulcanize/go-ethereum/pull/210#discussion_r831633102)
  • [ ] Post-processing of existing data in Postgres
Author
Member

Closing this: we no longer need to post-process v4 since we are moving to v5, and the missing integration test scenario is no longer relevant in v5 because we don't index intermediate nodes there, so there is no need to distinguish between "moved" and "removed". We do still need better testing in place for when accounts or slots are removed, e.g. https://github.com/cerc-io/tx-spammer/issues/18.
