Storage nodes are not being marked as removed when the contract is destroyed. #141
Discovered by Ashwin and his team.
This causes a bug in ipld-eth-server:

If a contract deployed at some address `addr` that has a non-empty storage trie `s` is destroyed at some block `n`, and a contract (same or different) is then redeployed at `addr` (using the CREATE2 opcode) at some block `m`, then performing the query currently used in RetrieveStorageAtByAddressAndStorageSlotAndBlockHash at some block `m+x` (where `x >= 0`) will behave as if the new contract inherited `s` as its initial state.

In particular, if the specific storage leaf key being queried doesn't exist in the redeployed contract's storage trie at `m+x` but it did exist in `s`, the query will incorrectly return the value held at that key in `s` instead of an empty value.
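To make the failure mode concrete, here is a minimal sketch of the access pattern involved; the real query behind RetrieveStorageAtByAddressAndStorageSlotAndBlockHash is more involved, and the column names below ($1 = state leaf key, $2 = storage leaf key, $3 = height of the queried block hash) are assumptions based on the references in this issue, not the verified schema.

```go
package main

// Simplified form of the lookup: take the most recent storage record for
// the (account, slot) pair at or below the queried height. Column names
// here are illustrative assumptions, not the verified v4 schema.
const getStorageAt = `
SELECT cid, node_type
FROM eth.storage_cids
WHERE state_leaf_key = $1
  AND storage_leaf_key = $2
  AND block_number <= $3
ORDER BY block_number DESC
LIMIT 1`

// Because the destruction at block n wrote no removed-type (node_type = 3)
// rows for the slots in s, the most recent matching row at or below m+x can
// be a stale entry from s, which is then returned as if it were the
// redeployed contract's state.
```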
Fixes

During statediffing
This could be fixed on the indexing end, by adjusting the statediffing code to iterate `s` and insert a "removed" node-type entry into `eth.storage_cids` at block `n` for every node in the trie when generating the statediff between `n-1` and `n`. This would be done here. Having to iterate the entire storage trie is potentially very expensive, and this wouldn't fix the data that has already been processed into Postgres.
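A minimal sketch of what this indexing-side fix might look like; the entry struct and the iterator interface below are illustrative stand-ins, not the actual statediff package API.

```go
package main

// Removed-node sketch: when the statediff between n-1 and n shows that the
// account at addr was destructed, walk its storage trie s (as of n-1) and
// emit a removed-type entry for every node.

const removedNodeType = 3 // node_type value used for removed nodes in Postgres

type storageNodeEntry struct {
	Path     []byte
	LeafKey  []byte
	NodeType int
}

// trieIterator is a stand-in for the trie node iterator the real
// statediffing code uses.
type trieIterator interface {
	Next(descend bool) bool
	Path() []byte
	Leaf() bool
	LeafKey() []byte
}

// removedStorageEntries returns one removed-type entry per node in the
// destructed contract's storage trie, to be written to eth.storage_cids at
// the destruction block n.
func removedStorageEntries(it trieIterator) []storageNodeEntry {
	var entries []storageNodeEntry
	for it.Next(true) {
		entry := storageNodeEntry{
			Path:     append([]byte{}, it.Path()...),
			NodeType: removedNodeType,
		}
		if it.Leaf() {
			entry.LeafKey = append([]byte{}, it.LeafKey()...)
		}
		entries = append(entries, entry)
	}
	return entries
}
```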
Adapt the query

Alternatively, the query used here could be adjusted to account for this issue:

- Add a function (`was_state_leaf_ever_removed`) for checking whether a node has ever been removed for a specific `state_leaf_key`, returning the block height(s) it was removed at if so.
- Adapt the query (`retrieve_storage_by_address_and_storage_slot_and_blockHash`) so that it first calls `was_state_leaf_ever_removed` to check if the storage key was ever removed. If it wasn't, it proceeds with the current access pattern. If it was, it finds the height at which it was (most recently) removed and limits its query to `eth.storage_cids` records above this block height.

This would add additional overhead to the query, and it also only fixes this symptom, not the underlying issue. The issue will produce a similar symptom when, for example, trying to create a new storage snapshot from a previous snapshot plus all the interim diffs, and we would require an adapted query for that as well.
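As a rough illustration, here is a sketch of that adapted read path, assuming a `database/sql` handle and the column names used elsewhere in this issue; the function names come from the text above, but the signatures and exact SQL are assumptions.

```go
package main

import "database/sql"

// wasStateLeafEverRemoved returns the most recent block height at which the
// given state_leaf_key was removed (node_type = 3), or ok = false if never.
func wasStateLeafEverRemoved(db *sql.DB, stateLeafKey string) (height uint64, ok bool, err error) {
	err = db.QueryRow(`
		SELECT block_number FROM eth.state_cids
		WHERE state_leaf_key = $1 AND node_type = 3
		ORDER BY block_number DESC
		LIMIT 1`, stateLeafKey).Scan(&height)
	if err == sql.ErrNoRows {
		return 0, false, nil
	}
	if err != nil {
		return 0, false, err
	}
	return height, true, nil
}

// retrieveStorageAt keeps the current access pattern when the key was never
// removed, and otherwise floors the lookup at the most recent removal height
// so that stale records from the destroyed trie cannot be returned.
func retrieveStorageAt(db *sql.DB, stateLeafKey, storageLeafKey string, atHeight uint64) (cid string, err error) {
	// floor == 0 means "never removed"; genesis (block 0) records are
	// ignored for simplicity in this sketch.
	var floor uint64
	if h, removed, err := wasStateLeafEverRemoved(db, stateLeafKey); err != nil {
		return "", err
	} else if removed {
		floor = h
	}
	err = db.QueryRow(`
		SELECT cid FROM eth.storage_cids
		WHERE state_leaf_key = $1
		  AND storage_leaf_key = $2
		  AND block_number <= $3
		  AND block_number > $4
		ORDER BY block_number DESC
		LIMIT 1`, stateLeafKey, storageLeafKey, atHeight, floor).Scan(&cid)
	return cid, err
}
```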
Post-processing of data in Postgres
This issue could also be fixed in post-processing. The algorithm would look like the following (a sketch is included after this list):

1. Find all removed-type `eth.state_cids` records that correspond to a contract destruction. Selecting on the removed node type gets you all removed leaf nodes and the block heights they were removed at, but that includes both EOAs and contracts. To figure out which of the removed leaves contained contracts, you need to find the latest non-removed-type entry for those `state_leaf_key`s and check if they contain a non-"zero" storage root (the "zero" value of the root is `keccak256(rlp([]byte{}))`).
2. Find the latest `eth.storage_cids` record at every `storage_path` in the trie for those contracts, where `$1` = the `state_leaf_key` for the removed contract, found in step 1, and `$2` = the `block_number` where the contract was destroyed, also found in step 1.
3. Insert removed-type `eth.storage_cids` records with `node_type = 3` for all of the records found in the above query.
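A hedged sketch of what those post-processing queries might look like. The table and column layout here (including the `storage_root` column and the lateral join) is an assumption for illustration, not the verified v4 schema; `emptyRoot` is the constant `keccak256(rlp([]byte{}))` referenced in step 1.

```go
package main

// emptyRoot is the storage root of an account with no storage:
// keccak256(rlp([]byte{})).
const emptyRoot = "0x56e81f171bcc55a6ff8345e692c0f86e5b48e01b996cadc001622fb5e363b421"

// Step 1: removed-type state leaves whose latest prior (non-removed) entry
// carried a non-empty storage root, i.e. destroyed contracts rather than
// EOAs. Pass emptyRoot as $1.
const findDestroyedContracts = `
SELECT removed.state_leaf_key, removed.block_number
FROM eth.state_cids AS removed
JOIN LATERAL (
    SELECT prior.storage_root
    FROM eth.state_cids AS prior
    WHERE prior.state_leaf_key = removed.state_leaf_key
      AND prior.block_number < removed.block_number
      AND prior.node_type != 3
    ORDER BY prior.block_number DESC
    LIMIT 1
) AS latest ON true
WHERE removed.node_type = 3
  AND latest.storage_root != $1`

// Step 2: the latest record at each storage_path in the destroyed contract's
// trie ($1 = state_leaf_key, $2 = destruction block_number, both from step 1).
const findOrphanedStorage = `
SELECT DISTINCT ON (storage_path) storage_path, storage_leaf_key
FROM eth.storage_cids
WHERE state_leaf_key = $1
  AND block_number < $2
  AND node_type != 3
ORDER BY storage_path, block_number DESC`

// Step 3: write a removed-type record for each of those paths at the
// destruction height.
const insertRemoved = `
INSERT INTO eth.storage_cids (block_number, state_leaf_key, storage_path, storage_leaf_key, node_type)
VALUES ($1, $2, $3, $4, 3)`
```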
Proposal

Make the updates described in 1) (during statediffing) to fix newly indexed data. Use the approach described in 3) (post-processing) to fix the existing data in Postgres.
These contract destructions aren't very frequent, so the performance impact on the head-tracking process should be minimal in aggregate, but it could introduce significant latency in the processing of the specific statediffs where large contract destructions do occur. Due to this infrequency, the historical processing required would be sparse, and with the index on `node_type` it should be fast to find the (relatively few) contracts we need to fix the issue for.

Some notes from discussion today:
Ashwin and his team have identified that the problem described here extends to the storage trie. To alleviate this, we need to introduce a new "moved" node type for representing, in our Postgres DB, leaf nodes that have been moved to a new path, as distinct from leaf nodes that have been removed altogether.
Tasks:
Closing this, as we no longer need to post-process v4 now that we are moving to v5, and the missing integration test scenario is no longer relevant in v5: since we don't index intermediate nodes, there is no need to distinguish between "moved" and "removed" nodes. We do still need better testing in place for when accounts or slots are removed, e.g. https://github.com/cerc-io/tx-spammer/issues/18