Add WriteStateSnapshot #15

Merged
roysc merged 23 commits from with-iterator-tracker into main 2023-09-28 03:35:47 +00:00
Member

Adds a method to perform full-state snapshots by diffing against an empty state trie.
This replicates the functionality of ipld-eth-state-snapshot, so that code can use this as a library; see: cerc-io/ipld-eth-state-snapshot#1

Note that due to how incremental diffs are processed (updates are processed after the
trie has been traversed) the iterator state doesn't fully capture the progress of the diff, so it's not currently feasible to track them this way. Full snapshots don't have to worry about updated accounts, so we can support them.

Adds a method to perform full-state snapshots by diffing against an empty state trie. This replicates the functionality of `ipld-eth-state-snapshot`, so that code can use this as a library; see: https://git.vdb.to/cerc-io/ipld-eth-state-snapshot/pulls/1 Note that due to how incremental diffs are processed (updates are processed after the trie has been traversed) the iterator state doesn't fully capture the progress of the diff, so it's not currently feasible to track them this way. Full snapshots don't have to worry about updated accounts, so we can support them.
roysc added 33 commits 2023-08-30 17:21:17 +00:00
tests: update ipld-eth-db image
Some checks failed
Test / Run unit tests (pull_request) Failing after 6s
Test / Run integration tests (pull_request) Failing after 34m57s
69fa11d6b1
- Remove unused bits
- simplify and normalize test data
- Renames for clarity
refactor builder metrics
Some checks failed
Test / Run unit tests (pull_request) Failing after 2m6s
Test / Run integration tests (pull_request) Failing after 2m12s
23204cb810
[wip] go mod udpate
Some checks failed
Test / Run unit tests (pull_request) Failing after 2m3s
Test / Run integration tests (pull_request) Failing after 2m11s
c2160fac83
roysc force-pushed with-iterator-tracker from c2160fac83 to 5ed22c03d2 2023-09-01 07:09:53 +00:00 Compare
roysc changed title from WIP: Add WriteStateDiffTracked to support tracking & recovery of a statediff to Add WriteStateDiffTracked to support tracking & recovery of a statediff 2023-09-01 20:50:57 +00:00
roysc requested review from i-norden 2023-09-01 20:53:13 +00:00
telackey requested review from telackey 2023-09-06 21:25:42 +00:00
roysc force-pushed with-iterator-tracker from 5ed22c03d2 to 77e00278e4 2023-09-25 19:14:12 +00:00 Compare
telackey requested changes 2023-09-26 01:49:53 +00:00
telackey left a comment
Member

For such a massive PR, I really don't have a ton of comments or questions. I was befuddled on a couple of points though which I highlighted.

For such a massive PR, I really don't have a ton of comments or questions. I was befuddled on a couple of points though which I highlighted.
@ -93,4 +93,2 @@
}
t = time.Now()
blockTx := &BatchTx{
Member

Cleaning this part of the code up is a big win.

Cleaning this part of the code up is a big win.
roysc marked this conversation as resolved
@ -48,7 +47,7 @@ func SetupTestData(t *testing.T, ind interfaces.StateDiffIndexer) {
t.Fatal(err)
}
defer func() {
Member

Do you recall why we passed err in to Submit() before? I have a vague memory buzzing in the back of my mind and I want to double check that there is no longer any reason.

Do you recall why we passed err in to Submit() before? I have a vague memory buzzing in the back of my mind and I want to double check that there is no longer any reason.
Author
Member

Basically it was due to handling rollbacks and commit in the submit function, so it needed the error to check whether to rollback. By factoring the rollback out, we avoid that, but now the caller is responsible for it.

Which reminds me that I need to defer a rollback call from the scope of service.writeStateDiff - since the tx is returned from PushBlock.

Basically it was due to handling rollbacks and commit in the submit function, so it needed the error to check whether to rollback. By factoring the rollback out, we avoid that, but now the caller is responsible for it. Which reminds me that I need to defer a rollback call from the scope of `service.writeStateDiff` - since the tx is returned from `PushBlock`.
Member

If you've added the rollback, feel free to resolve this conversation.

If you've added the rollback, feel free to resolve this conversation.
roysc marked this conversation as resolved
@ -36,0 +21,4 @@
// SymmDiffAux exposes state specific to symmetric difference iteration, which is not accessible
// from the NodeIterator interface. This includes the number of nodes seen, whether the current key
// is common to both A and B, and whether the current node is sourced from A or B.
type SymmDiffAux struct {
Member

What is the advantage/disadvantage of doing this by a sort of "sidecar" type vs extending the interface or type.

Eg, something more traditional like:

type SymDiffNodeIterator interface {
    NodeIterator
    NextComesFromA() bool 
    Position() int
    LastEqPathIndex() int
}

Just as a rule, having one type representing the internal state of another type, and passing them both around, strikes me as odd, but there may be a compelling reason to do it that way.

What is the advantage/disadvantage of doing this by a sort of "sidecar" type vs extending the interface or type. Eg, something more traditional like: ``` type SymDiffNodeIterator interface { NodeIterator NextComesFromA() bool Position() int LastEqPathIndex() int } ``` Just as a rule, having one type representing the internal state of another type, and passing them both around, strikes me as odd, but there may be a compelling reason to do it that way.
Author
Member

Definitely, it's ugly. I didn't want to do this, but the problem comes when we wrap the iterators for tracking. We need to use the wrapped iterator in order for tracking to work, but we need the state exposed by the underlying one. We could extract the base with casts, but that's not much better and leaks the abstraction.

Other workarounds would be to

  1. return the base iterators from tracker.Restore along with the wrapped ones, or
  2. create a new tracker.Iterator interface which exposes the base somehow.

2 means new types and extra state that's mostly not used. 1 is okay, though from this end the result looks about the same (need to pass an extra object).

Definitely, it's ugly. I didn't want to do this, but the problem comes when we wrap the iterators for tracking. We need to use the wrapped iterator in order for tracking to work, but we need the state exposed by the underlying one. We could extract the base with casts, but that's not much better and leaks the abstraction. Other workarounds would be to 1. return the base iterators from `tracker.Restore` along with the wrapped ones, or 2. create a new `tracker.Iterator` interface which exposes the base somehow. 2 means new types and extra state that's mostly not used. 1 is okay, though from this end the result looks about the same (need to pass an extra object).
Author
Member

I've done option 1, but kept the separate state type, so it's clear that only a subset of the symm-diff iterator state is actually used in that context - rather than just passing the same object as a different type.

I've done option 1, but kept the separate state type, so it's clear that only a subset of the symm-diff iterator state is actually used in that context - rather than just passing the same object as a different type.
telackey marked this conversation as resolved
roysc added 2 commits 2023-09-26 10:12:20 +00:00
rework iterator state stuff
Some checks failed
Test / Run unit tests (pull_request) Failing after 15m46s
Test / Run integration tests (pull_request) Successful in 30m5s
100c106ee5
roysc requested review from telackey 2023-09-26 10:17:22 +00:00
roysc force-pushed with-iterator-tracker from 100c106ee5 to 367349b85a 2023-09-26 12:12:47 +00:00 Compare
roysc added 1 commit 2023-09-27 13:29:33 +00:00
refactor: only support tracked snapshots
All checks were successful
Test / Run unit tests (pull_request) Successful in 16m19s
Test / Run integration tests (pull_request) Successful in 32m16s
c2ceca9230
tracking incremental state diffs is not feasible like this. because updates are processed after the
trie has been traversed, the iterators' state doesn't fully capture the progress of the diff. since
currently snapshots are the only use case for tracking, let's just support that.
roysc added 6 commits 2023-09-27 13:58:48 +00:00
Run compliance tests.
Some checks failed
Test / Run integration tests (pull_request) Has been cancelled
Test / Run unit tests (pull_request) Has been cancelled
Test / Run compliance tests (pull_request) Failing after 14s
c37aab8b69
Tweak path
Some checks reported warnings
Test / Run integration tests (pull_request) Has been cancelled
Test / Run unit tests (pull_request) Has been cancelled
Test / Run compliance tests (pull_request) Has been cancelled
dfa0660317
jq
All checks were successful
Test / Run unit tests (pull_request) Successful in 16m46s
Test / Run compliance tests (pull_request) Successful in 4m16s
Test / Run integration tests (pull_request) Successful in 30m51s
c8e7acaacd
Rm branch spec
Some checks failed
Test / Run integration tests (pull_request) Has been cancelled
Test / Run unit tests (pull_request) Has been cancelled
Test / Run compliance tests (pull_request) Failing after 42s
101654f30e
Use v0.1.0
All checks were successful
Test / Run unit tests (pull_request) Successful in 16m35s
Test / Run integration tests (pull_request) Successful in 29m30s
Test / Run compliance tests (pull_request) Successful in 4m7s
935c2ce4b5
Merge pull request 'Run compliance tests.' (#16) from telackey/cict into with-iterator-tracker
All checks were successful
Test / Run unit tests (pull_request) Successful in 17m50s
Test / Run compliance tests (pull_request) Successful in 3m19s
Test / Run integration tests (pull_request) Successful in 31m34s
e668a73bbe
Reviewed-on: #16
roysc changed title from Add WriteStateDiffTracked to support tracking & recovery of a statediff to Add WriteStateSnapshot 2023-09-27 14:24:13 +00:00
telackey reviewed 2023-09-27 22:56:04 +00:00
telackey approved these changes 2023-09-27 23:08:52 +00:00
telackey left a comment
Member

Looks great! So much cleaner.

Looks great! So much cleaner.
@ -163,0 +174,4 @@
return fmt.Errorf("error opening new state trie: %w", err)
}
subiters, _, err := tracker.Restore(tree.NodeIterator)
Member

I didn't see a test case where the iterators are actually restored. Is there one? Perhaps it exists in ipld-eth-state-snapshot?

I didn't see a test case where the iterators are actually restored. Is there one? Perhaps it exists in ipld-eth-state-snapshot?
Author
Member

It happens in the TestBuilderSnapshot tests, via test_helpers.RunStateSnapshot. Although there's no explicit check for the file in these tests, I do see them getting restored.

It happens in the `TestBuilderSnapshot` tests, via `test_helpers.RunStateSnapshot`. Although there's no explicit check for the file in these tests, I do see them getting restored.
Author
Member

Added a check for the file

Added a check for the file
Member

Great! I see it now. I read that test, but I didn't put 2 and 2 together regarding the interrupt.

Great! I see it now. I read that test, but I didn't put 2 and 2 together regarding the interrupt.
Member

Yeah, a check for the file is even better, but you are right, practically speaking even the previous version of the test couldn't have succeeded if the file were not created and used.

Yeah, a check for the file is even better, but you are right, practically speaking even the previous version of the test couldn't have succeeded if the file were not created and used.
telackey marked this conversation as resolved
@ -236,0 +169,4 @@
func (sdi *StateDiffIndexer) PushHeader(batch interfaces.Batch, header *types.Header, reward, td *big.Int) (string, error) {
tx, ok := batch.(*BatchTx)
if !ok {
return "", fmt.Errorf("sql: batch is expected to be of type %T, got %T", &BatchTx{}, batch)
Member

Whoa, does this happen? Why?

Whoa, does this happen? Why?
Author
Member

Haha, it shouldn't and hasn't, but it's technically possible for someone to create an indexer of a different type and pass the wrong batch in. That would just panic otherwise, so the error is more to document the fact that this batch type is covariant with the indexer.

Haha, it shouldn't and hasn't, but it's technically possible for someone to create an indexer of a different type and pass the wrong batch in. That would just panic otherwise, so the error is more to document the fact that this batch type is covariant with the indexer.
Member

Oh good. Absolutely best always to check, I was just shocked for a moment to think it might be something that really happens.

Oh good. Absolutely best always to check, I was just shocked for a moment to think it might be something that really happens.
telackey marked this conversation as resolved
roysc added 1 commit 2023-09-28 01:01:07 +00:00
check for recovery file
All checks were successful
Test / Run compliance tests (pull_request) Successful in 3m47s
Test / Run unit tests (pull_request) Successful in 15m27s
Test / Run integration tests (pull_request) Successful in 28m35s
aeffc86033
roysc merged commit b8fec4b571 into main 2023-09-28 03:35:47 +00:00
roysc deleted branch with-iterator-tracker 2023-09-28 03:35:47 +00:00
roysc referenced this issue from a commit 2023-09-28 03:35:49 +00:00
Sign in to join this conversation.
No reviewers
No Milestone
No project
No Assignees
2 Participants
Notifications
Due Date
The due date is invalid or out of range. Please use the format 'yyyy-mm-dd'.

No due date set.

Dependencies

No dependencies set.

Reference: cerc-io/plugeth-statediff#15
No description provided.