Statediffing mode for specific contract proofs #29

Closed
opened 2020-10-14 16:48:31 +00:00 by i-norden · 4 comments
Member

A mode for the statediffing service that returns a state object containing only the nodes required for accessing and proving data in the specified contract(s).

Provided a set of contract addresses, we want to:

  • [x] Iterate down the state trie only along the paths to the leaf nodes for the provided contracts, collecting all of the nodes along these paths (which are required for generating a proof against the state root hash)

  • [x] Iterate through the entire storage tries for these contracts, producing complete statediff objects for the specified contracts only

  • [x] Unit and integration tests

  • [x] Support both ipld-eth-indexer endpoints and direct indexing from inside geth (should be able to use the same database)
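To make the first item concrete, here is a minimal sketch of collecting every node on the path from the root to one leaf. It assumes a toy trie in which every node is a 16-way branch; the `node` type and the `insert` and `proofPath` helpers are hypothetical and are not part of the statediff package. Real Ethereum tries also contain extension nodes and leaves that carry a partial path, and the state trie key is the keccak256 hash of the address rather than the address itself.

```go
package main

// node is a hypothetical, simplified trie node. Real Ethereum tries also
// have extension nodes and leaves that store a partial path.
type node struct {
	children [16]*node // one slot per nibble
	value    []byte    // non-nil only at a leaf
}

// keyToNibbles splits each byte of a key into two 4-bit nibbles.
func keyToNibbles(key []byte) []byte {
	nibbles := make([]byte, 0, len(key)*2)
	for _, b := range key {
		nibbles = append(nibbles, b>>4, b&0x0f)
	}
	return nibbles
}

// insert adds a key/value pair to the toy trie, creating nodes as needed.
func insert(root *node, key, value []byte) {
	cur := root
	for _, n := range keyToNibbles(key) {
		if cur.children[n] == nil {
			cur.children[n] = &node{}
		}
		cur = cur.children[n]
	}
	cur.value = value
}

// proofPath walks from root along the nibbles of key and returns every
// node visited, root first. ok is false if the key is not in the trie.
func proofPath(root *node, key []byte) (path []*node, ok bool) {
	cur := root
	for _, n := range keyToNibbles(key) {
		if cur == nil {
			return path, false
		}
		path = append(path, cur)
		cur = cur.children[n]
	}
	if cur == nil || cur.value == nil {
		return path, false
	}
	return append(path, cur), true
}
```

Calling `proofPath` once per watched key and taking the union of the returned nodes gives the set of nodes a verifier would need, under these simplifying assumptions, to check each leaf against the root.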
Author
Member

I will double-check that this is satisfied.
Author
Member

@AFDudley @ashwinphatak I don't think this is currently satisfied. If we watch a set of addresses, the builder only diffs leaf nodes: https://github.com/vulcanize/go-ethereum/blob/v1.10.18-statediff-4.0.2-alpha/statediff/builder.go#L173

And rather than relying on a potentially vestigial comment, here is the code:

Only in createdAndUpdatedState do we filter on the addresses (https://github.com/vulcanize/go-ethereum/blob/v1.10.18-statediff-4.0.2-alpha/statediff/builder.go#L318), whereas in createdAndUpdatedStateWithIntermediateNodes we do not (https://github.com/vulcanize/go-ethereum/blob/v1.10.18-statediff-4.0.2-alpha/statediff/builder.go#L363).

Instead, when we specify a list of watched addresses, we want to diff only the leaf nodes associated with those addresses and all the intermediate nodes along the paths to them (which are required to prove them).

Seems there are two approaches to this:

  1. Hold onto the full stack of parent nodes (at most 63; in practice probably 6-10 on average) during traversal to a leaf. Once we reach the leaf and check whether it is in the WatchedAddress list, we either keep the parent node stack or throw it away (the simple approach, which we could take with the current iterator)
  2. Rewrite/modify the iterator so that it only ever traverses down the paths of interest in the first place (the ideal approach, but more complex to implement)

Approach 2 would be a huge boon to performance, as we would avoid traversing portions of the difference trie that we aren't actually interested in (instead of waiting until we reach a leaf to find out we didn't need to walk all the way down to it).
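A rough sketch of approach 1, assuming the iterator yields nodes in pre-order together with their nibble paths. The `visit` and `frame` types and the `diffWatched` function are hypothetical stand-ins for illustration, not the builder's actual types; in particular, `Account` stands in for whatever the builder uses to match a leaf against the watched list.

```go
package main

import "bytes"

// visit is a hypothetical record for one node yielded by a pre-order
// trie traversal.
type visit struct {
	Path    []byte // nibble path from the root to this node
	Leaf    bool   // true if this node is a leaf
	Account string // hypothetical account identifier, set only on leaves
}

// frame is one entry in the stack of ancestors of the current node.
type frame struct {
	v       visit
	emitted bool // true once this ancestor has been added to the output
}

// diffWatched returns the watched leaves plus every intermediate node on
// the path to them, dropping everything else.
func diffWatched(visits []visit, watched map[string]bool) []visit {
	var stack []*frame
	var out []visit
	for _, v := range visits {
		// Pop frames that are not ancestors of the current node.
		for len(stack) > 0 && !bytes.HasPrefix(v.Path, stack[len(stack)-1].v.Path) {
			stack = stack[:len(stack)-1]
		}
		if !v.Leaf {
			// Hold onto intermediate nodes until we know whether a
			// watched leaf sits below them.
			stack = append(stack, &frame{v: v})
			continue
		}
		if !watched[v.Account] {
			continue // unwatched leaf: its ancestors stay unemitted
		}
		// Watched leaf: flush any ancestors not yet emitted, then the leaf.
		for _, f := range stack {
			if !f.emitted {
				out = append(out, f.v)
				f.emitted = true
			}
		}
		out = append(out, v)
	}
	return out
}
```

The `emitted` flag keeps ancestors shared by several watched leaves from being output more than once. Note that this sketch still visits every node in the traversal; it only changes which nodes are kept.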
prathamesh0 commented 2022-06-21 12:36:19 +00:00 (Migrated from github.com)

@i-norden The approach suggested in https://github.com/vulcanize/ipld-eth-state-snapshot/pull/46#issue-1272566458, limiting the trie traversal to paths that lead to watched addresses, requires the ability to restrict descent down the trie when required.

In state trie iteration, this is allowed by the descend flag passed to the iterator.Next() method (https://github.com/ethereum/go-ethereum/blob/master/trie/iterator.go#L258). The difference iterator used for statediffing, however, ignores the flag passed to its Next() method (https://github.com/ethereum/go-ethereum/blob/master/trie/iterator.go#L587), so we can't readily avoid iterating the whole diff trie.

However, I think that checking paths as prefixes to filter out nodes is still a simpler approach than using a stack to hold on to intermediate nodes.
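The prefix check mentioned above could look roughly like the following. This is a sketch only: `onWatchedPath` and `filterPaths` are hypothetical names, and the watched keys are assumed to be already expanded into the full nibble paths of the watched leaves.

```go
package main

import "bytes"

// onWatchedPath reports whether a node at nodePath lies on the way to at
// least one watched leaf, i.e. whether nodePath is a prefix of a watched
// key's full nibble path.
func onWatchedPath(nodePath []byte, watchedKeys [][]byte) bool {
	for _, key := range watchedKeys {
		if bytes.HasPrefix(key, nodePath) {
			return true
		}
	}
	return false
}

// filterPaths keeps only the node paths that lead to a watched key.
func filterPaths(nodePaths, watchedKeys [][]byte) [][]byte {
	var kept [][]byte
	for _, p := range nodePaths {
		if onWatchedPath(p, watchedKeys) {
			kept = append(kept, p)
		}
	}
	return kept
}
```

Because the check is stateless, it can be applied to each node as the iterator yields it, with no stack of ancestors to maintain. The iterator still walks the whole diff trie, as noted above, so this filters the output rather than reducing the traversal.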
prathamesh0 commented 2022-06-28 05:33:10 +00:00 (Migrated from github.com)

Further optimization tracked in https://github.com/vulcanize/go-ethereum/issues/252
Reference: cerc-io/go-ethereum#29