Storage nodes are not being marked as removed when the contract is destroyed. #141
Discovered by Ashwin and his team.
This causes a bug in ipld-eth-server:
If a contract deployed at some address `addr` that has a non-empty storage trie `s` is destroyed at some block `n`, and then a contract (same or different) is redeployed at `addr` (using the CREATE2 opcode) at some block `m`, then if we perform the query currently used in `RetrieveStorageAtByAddressAndStorageSlotAndBlockHash` at some block `m+x` (where `x >= 0`), it will behave as if the new contract inherited `s` as its initial state. In particular, if the specific storage leaf key being queried doesn't exist in the redeployed contract's storage trie at `m+x` but it did exist in `s`, the query will incorrectly return the value held at that key in `s` instead of an empty value.
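For illustration, a minimal sketch of the kind of "latest entry at or below the target height" lookup described above; this is not the actual query used by `RetrieveStorageAtByAddressAndStorageSlotAndBlockHash`, and the column names `storage_leaf_key` and `val` are hypothetical stand-ins for the real schema:

```sql
-- Simplified stand-in for the storage lookup: newest eth.storage_cids entry
-- for a given account and slot at or below the target block height.
SELECT val
FROM eth.storage_cids
WHERE state_leaf_key   = $1   -- keccak256(addr)
  AND storage_leaf_key = $2   -- keccak256(slot); hypothetical column name
  AND block_number    <= $3   -- target height m+x
ORDER BY block_number DESC
LIMIT 1;
```

Because no "removed" rows are written for `s` at block `n`, any slot the redeployed contract never touches still resolves to its newest pre-destruction row, i.e. the stale value from `s`.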
Fixes

1) During statediffing
This could be fixed on the indexing end, by adjusting the statediffing code to iterate `s` and insert a "removed" node-type entry into `eth.storage_cids` at block `n` for every node in the trie when generating the statediff between `n-1` and `n`. This would be done here. Having to iterate the entire storage trie is potentially very expensive, and this wouldn't fix the data that has already been processed into Postgres.
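A minimal sketch of the kind of row this fix would write for each node of `s`, assuming `node_type = 3` denotes a removed node (as in the post-processing discussion below); the real `eth.storage_cids` table has additional columns (cid, etc.) that are omitted here:

```sql
-- Hypothetical insert emitted for every node encountered while iterating the
-- destroyed contract's storage trie s during the n-1 -> n statediff.
INSERT INTO eth.storage_cids (block_number, state_leaf_key, storage_path, node_type)
VALUES ($1,   -- block n, where the contract was destroyed
        $2,   -- state_leaf_key of the destroyed contract
        $3,   -- storage_path of the node being marked as removed
        3);   -- assumed "removed" node type
```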
2) Adapt the query

Alternatively, the query used here could be adjusted to account for this issue:

- Add a query (`was_state_leaf_ever_removed`) for checking if a node has ever been removed for a specific `state_leaf_key`, returning the block height(s) it was removed at if so.
- Add a query (`retrieve_storage_by_address_and_storage_slot_and_blockHash`) that first calls `was_state_leaf_ever_removed` to check if the storage key was ever removed. If it wasn't, it proceeds with the current access pattern. If it was, it finds the height at which it was (most recently) removed and limits its query to `eth.storage_cids` records above this block height, as sketched below.

This would add additional overhead to the query. It also only fixes this symptom, not the underlying issue: the issue will produce a similar symptom when, for example, trying to create a new storage snapshot from a previous snapshot plus all the interim diffs, and we would require an adapted query for that as well.
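A minimal sketch of what those two queries could look like, under the same schema assumptions as above (`node_type = 3` denoting removal, hypothetical `storage_leaf_key`/`val` columns); not the actual ipld-eth-server queries:

```sql
-- was_state_leaf_ever_removed (sketch): most recent height at or below the
-- target block at which this account's state leaf was marked removed.
SELECT MAX(block_number)
FROM eth.state_cids
WHERE state_leaf_key = $1
  AND node_type = 3
  AND block_number <= $2;   -- target height

-- Adjusted lookup (sketch): only consider storage entries written after the
-- most recent removal height ($4, or 0 if the leaf was never removed).
SELECT val
FROM eth.storage_cids
WHERE state_leaf_key   = $1
  AND storage_leaf_key = $2
  AND block_number    <= $3   -- target height
  AND block_number     > $4   -- most recent removal height
ORDER BY block_number DESC
LIMIT 1;
```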
3) Post-processing of data in Postgres
This issue could also be fixed in post-processing; the algorithm would look like:

1. Find all `eth.state_cids` records that correspond to a contract destruction. That gets you all removed leaf nodes and the block heights they were removed at, but that includes EOAs as well as contracts. To figure out whether the removed leafs contained contracts, you need to find the latest non-removed-type entry for those `state_leaf_key`s and check whether they contain a non-"zero" storage root (the "zero" value of the root is `keccak256(rlp([]byte{}))`).
2. Find the `eth.storage_cids` records at every `storage_path` in the trie for those contracts, using a query where `$1` = the `state_leaf_key` for the removed contract (found in step 1) and `$2` = the `block_number` where the contract was destroyed (found in step 1).
3. Insert `eth.storage_cids` records with `node_type = 3` for all of the records found in the above query (see the sketch below).
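A hedged sketch of these steps as Postgres queries; the actual schema has more columns than shown, `node_type = 3` is assumed to denote removal, and the EOA-vs-contract filtering from step 1 is only noted in a comment:

```sql
-- Step 1 (sketch): candidate contract destructions. EOAs still need to be
-- filtered out by checking that the latest non-removed state entry for the
-- same state_leaf_key has a storage root != keccak256(rlp([]byte{})).
SELECT state_leaf_key, block_number
FROM eth.state_cids
WHERE node_type = 3;

-- Step 2 (sketch): latest surviving storage entry at each storage_path for one
-- destroyed contract. $1 = state_leaf_key, $2 = destruction height (step 1).
SELECT DISTINCT ON (storage_path) storage_path, block_number
FROM eth.storage_cids
WHERE state_leaf_key = $1
  AND block_number < $2
  AND node_type != 3
ORDER BY storage_path, block_number DESC;

-- Step 3 (sketch): write a "removed" record at the destruction height for
-- every path found in step 2. Remaining columns of the real table omitted.
INSERT INTO eth.storage_cids (block_number, state_leaf_key, storage_path, node_type)
SELECT DISTINCT $2, $1, storage_path, 3
FROM eth.storage_cids
WHERE state_leaf_key = $1
  AND block_number < $2
  AND node_type != 3;
```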
Proposal

Make the updates described in 1) to fix newly indexed data. Use the approach described in 3) to fix the existing data in Postgres.
These contract destructions aren't very frequent, so the performance impact on the head-tracking process should be minimal in aggregate, but it could introduce significant latency in the processing of the specific statediffs where large contract destructions do occur. Due to the infrequency, the historical processing required would be sparse, and with the index on `node_type` it should be fast to find the (relatively few) contracts we need to fix the issue for.

Some notes from discussion today:
Ashwin and his team have identified that the problem described here extends to the storage trie. To alleviate this, we need to introduce a new "moved" node type for representing, in our Postgres DB, leaf nodes that have been moved to a new path vs. leaf nodes that have been removed altogether.
Tasks:
Closing this: we no longer need to post-process v4 as we are moving to v5, and the missing integration test scenario is no longer relevant in v5, since we don't index intermediate nodes there and thus have no need to distinguish between "moved" and "removed" nodes. However, we do still need better testing in place for when accounts or slots are removed, e.g. https://github.com/cerc-io/tx-spammer/issues/18.