Statediff for full node #6

Merged
elizabethengelman merged 9 commits from statediff-for-full-node into statediff-for-archive-node 2019-02-21 20:36:04 +00:00
elizabethengelman commented 2019-02-11 23:11:31 +00:00 (Migrated from github.com)
  • This PR addresses several issues with the initial statediff approach:

    • When opening a trie based on it's root hash, I was previously looking at an empty in-memory database 🤦🏻‍♀️ which didn't affect the process with an archive node because it could look the information up from the disk database. Now we're looking at the blockchain's statecache, which is the correct in-mem database instance: bf0df08b34

    • After the previous point was fixed, there was still the issue where some necessary trie roots were being pruned before we had a chance to process their state diffs. This was addressed by keeping a collection of state roots that have been state diffed, and only allowing them to be pruned once their state diff and their child's state diff has been calculated: 2ac8afab11

  • It's also worth noting that for now we're not looking up an address by it's storage path, because this proved to be pretty difficult on a full node (there's a task in Jira to investigate this further). Since for Maker we already have the contract addresses, we're able keccak hash them, which gives us the account leaf key (which we're storing in the csv instead for now).

  • Here are some example diffs.

  • Aside from storing the account leaf key instead of the account address in the csv, all of these changes should the archive node statediffing.

- This PR addresses several issues with the initial statediff approach: - When opening a trie based on it's root hash, I was previously looking at an empty in-memory database 🤦🏻‍♀️ which didn't affect the process with an archive node because it could look the information up from the disk database. Now we're looking at the blockchain's statecache, which is the correct in-mem database instance: https://github.com/vulcanize/go-ethereum/pull/6/commits/bf0df08b3403c5603a56dbeca430d244fd202bcd - After the previous point was fixed, there was still the issue where some necessary trie roots were being pruned before we had a chance to process their state diffs. This was addressed by keeping a collection of state roots that have been state diffed, and only allowing them to be pruned once their state diff and their child's state diff has been calculated: https://github.com/vulcanize/go-ethereum/pull/6/commits/2ac8afab11aa260135bd9d48d69b835f7f54627d - It's also worth noting that for now we're not looking up an address by it's storage path, because this proved to be pretty difficult on a full node (there's a [task in Jira](https://makerdao.atlassian.net/browse/VDB-362) to investigate this further). Since for Maker we already have the contract addresses, we're able keccak hash them, which gives us the account leaf key (which we're storing in the csv instead for now). - Here are some [example diffs](https://github.com/8thlight/maker-vulcanizedb/pull/133/commits/d6129770b173cafc9935aaabd5fdbe260781ad0b). - Aside from storing the account leaf key instead of the account address in the csv, all of these changes should the archive node statediffing.
TakaGoto (Migrated from github.com) reviewed 2019-02-11 23:11:31 +00:00
yaoandrew (Migrated from github.com) reviewed 2019-02-11 23:11:31 +00:00
rmulhol (Migrated from github.com) approved these changes 2019-02-14 16:57:58 +00:00
rmulhol (Migrated from github.com) left a comment

LGTM!

One thing I'm wondering is if it makes sense to have a branch of this that only executes on a full, non-archive node? Probably not super high priority, but thinking that it might be an easier sell upstream if the process is only supplementing a way of running the node that doesn't already have the full state.

LGTM! One thing I'm wondering is if it makes sense to have a branch of this that _only_ executes on a full, non-archive node? Probably not super high priority, but thinking that it might be an easier sell upstream if the process is only supplementing a way of running the node that doesn't already have the full state.
@ -158,2 +159,4 @@
utils.RegisterEthService(stack, &cfg.Eth)
if ctx.GlobalBool(utils.StateDiffFlag.Name) {
cfg.Eth.StateDiff = true
rmulhol (Migrated from github.com) commented 2019-02-12 22:39:39 +00:00

would it make sense to combine this with the RegisterStateDiffService call, if both are based off of the call to ctx.GlobalBool(utils.StateDiffFlag.Name)?

would it make sense to combine this with the `RegisterStateDiffService` call, if both are based off of the call to `ctx.GlobalBool(utils.StateDiffFlag.Name`)?
@ -997,0 +1009,4 @@
if !bc.allowedRootToBeDereferenced(root.(common.Hash)) {
bc.triegc.Push(root, number)
break
} else {
rmulhol (Migrated from github.com) commented 2019-02-14 16:49:22 +00:00

Maybe worth putting 1 and 2 behind variables/constants to clarify what's going on here? My understanding is that a hash needs to be added once as the current hash, and once as the parent hash - and then we're done. Is that right? Do reorgs potentially complicate that expectation?

Maybe worth putting `1` and `2` behind variables/constants to clarify what's going on here? My understanding is that a hash needs to be added once as the current hash, and once as the parent hash - and then we're done. Is that right? Do reorgs potentially complicate that expectation?
rmulhol (Migrated from github.com) commented 2019-02-14 16:52:01 +00:00

Do we want to retain all these logging statements when we merge?

Do we want to retain all these logging statements when we merge?
i-norden approved these changes 2019-02-14 19:43:03 +00:00
i-norden left a comment
Member

Looks good to me! It's awesome seeing this come together, thank you Elizabeth!!

Looks good to me! It's awesome seeing this come together, thank you Elizabeth!!
Member

Does this error need to be handled in any additional capacity? Right now it still falls through to the ok check.

Does this error need to be handled in any additional capacity? Right now it still falls through to the `ok` check.
elizabethengelman (Migrated from github.com) reviewed 2019-02-14 21:45:08 +00:00
@ -158,2 +159,4 @@
utils.RegisterEthService(stack, &cfg.Eth)
if ctx.GlobalBool(utils.StateDiffFlag.Name) {
cfg.Eth.StateDiff = true
elizabethengelman (Migrated from github.com) commented 2019-02-14 21:45:08 +00:00

yeah, good call!

yeah, good call!
elizabethengelman (Migrated from github.com) reviewed 2019-02-14 21:51:13 +00:00
@ -997,0 +1009,4 @@
if !bc.allowedRootToBeDereferenced(root.(common.Hash)) {
bc.triegc.Push(root, number)
break
} else {
elizabethengelman (Migrated from github.com) commented 2019-02-14 21:51:13 +00:00

probably a good idea to remove some of them 👍

probably a good idea to remove some of them 👍
elizabethengelman (Migrated from github.com) reviewed 2019-02-14 22:48:01 +00:00
@ -997,0 +1009,4 @@
if !bc.allowedRootToBeDereferenced(root.(common.Hash)) {
bc.triegc.Push(root, number)
break
} else {
elizabethengelman (Migrated from github.com) commented 2019-02-14 22:48:01 +00:00

That's a really good catch. Yeah, my intention was that when we processes a state diff for a given hash, it gets added to the stateDiffsProcessed collection. It then gets added again when the state diff is processed when it is the parent, so we'll need to make sure it isn't pruned before these diffs have been processed.

Considering reorgs is a really good thought, I'm honestly not sure how that would affect this expectation. I think I'd need to spend some time digging into this - do you think it makes sense to hold off merging this in? Or, merging it, and creating a new story to take a look in the future?

That's a really good catch. Yeah, my intention was that when we processes a state diff for a given hash, it gets added to the `stateDiffsProcessed` collection. It then gets added again when the state diff is processed when it is the parent, so we'll need to make sure it isn't pruned before these diffs have been processed. Considering reorgs is a really good thought, I'm honestly not sure how that would affect this expectation. I think I'd need to spend some time digging into this - do you think it makes sense to hold off merging this in? Or, merging it, and creating a new story to take a look in the future?
elizabethengelman (Migrated from github.com) reviewed 2019-02-14 22:58:19 +00:00
elizabethengelman (Migrated from github.com) commented 2019-02-14 22:58:19 +00:00

Nice catch. This log statement was more for me as informational to see if AddToStateDiffProcessedCollection would be called more than twice (once when the hash was current, and once when it was the parent). I don't think there is an if its called more than twice, it would mean that a state diff was processed using this hash more than twice. So, I may remove this if statement unless there's some an edge case that I'm missing.

Nice catch. This log statement was more for me as informational to see if `AddToStateDiffProcessedCollection` would be called more than twice (once when the hash was current, and once when it was the parent). I don't think there is an if its called more than twice, it would mean that a state diff was processed using this hash more than twice. So, I may remove this if statement unless there's some an edge case that I'm missing.
elizabethengelman (Migrated from github.com) reviewed 2019-02-14 23:08:51 +00:00
elizabethengelman (Migrated from github.com) commented 2019-02-14 23:08:51 +00:00

I also wanted to note that I added a sleep in here, to get the service test suite passing, which isn't ideal. I have a suspicion that this is an indication that something else may be going on in the service loop, so @rmulhol or @i-norden if you have a moment to take another look at it, that would be awesome. Or, perhaps we can walk through it together - it's a lot to keep in one's head! 🤯

I also wanted to note that I added a sleep in here, to get the service test suite passing, which isn't ideal. I have a suspicion that this is an indication that something else may be going on in the [service loop](https://github.com/vulcanize/go-ethereum/blob/statediff-for-archive-node/statediff/service/service.go#L52), so @rmulhol or @i-norden if you have a moment to take another look at it, that would be awesome. Or, perhaps we can walk through it together - it's a lot to keep in one's head! 🤯
rmulhol (Migrated from github.com) reviewed 2019-02-15 00:29:52 +00:00
rmulhol (Migrated from github.com) commented 2019-02-15 00:29:51 +00:00

Would definitely take you up on walking through it together - you've definitely got a better grasp of the larger project architecture than me!

That said I have no strong objection to including the sleep in a mock. My only question would be whether the need for a sleep here implies that actual asynchronous execution might run into a similar problem as whatever was causing the test to fail.

Would definitely take you up on walking through it together - you've definitely got a better grasp of the larger project architecture than me! That said I have no strong objection to including the sleep in a mock. My only question would be whether the need for a sleep here implies that actual asynchronous execution might run into a similar problem as whatever was causing the test to fail.
rmulhol (Migrated from github.com) reviewed 2019-02-15 00:31:10 +00:00
@ -997,0 +1009,4 @@
if !bc.allowedRootToBeDereferenced(root.(common.Hash)) {
bc.triegc.Push(root, number)
break
} else {
rmulhol (Migrated from github.com) commented 2019-02-15 00:31:09 +00:00

Definitely sounds like another story to me, and a bit hard to simulate without getting to the head of the chain and watching it execute in the middle of one.

Definitely sounds like another story to me, and a bit hard to simulate without getting to the head of the chain and watching it execute in the middle of one.
elizabethengelman (Migrated from github.com) reviewed 2019-02-15 15:12:06 +00:00
@ -997,0 +1009,4 @@
if !bc.allowedRootToBeDereferenced(root.(common.Hash)) {
bc.triegc.Push(root, number)
break
} else {
elizabethengelman (Migrated from github.com) commented 2019-02-15 15:12:05 +00:00

Yeah, it definitely will be tricky. I've added another story so we can make sure to circle back: https://makerdao.atlassian.net/browse/VDB-393

Yeah, it definitely will be tricky. I've added another story so we can make sure to circle back: https://makerdao.atlassian.net/browse/VDB-393
m0ar (Migrated from github.com) approved these changes 2019-02-19 13:22:59 +00:00
m0ar (Migrated from github.com) left a comment

Awesome job @elizabethengelman !! 🚀

Missing a bit of context of the code base obviously, but I noticed a few things that I've commented.

Awesome job @elizabethengelman !! :rocket: Missing a bit of context of the code base obviously, but I noticed a few things that I've commented.
m0ar (Migrated from github.com) commented 2019-02-19 12:37:11 +00:00

What exactly does this flag entail? Hint hint, there are comments on the rest of them ;)

What _exactly_ does this flag entail? Hint hint, there are comments on the rest of them ;)
m0ar (Migrated from github.com) commented 2019-02-19 12:50:03 +00:00

This is equivalent:

count, _ := bc.stateDiffsProcessed[hash]
bc.stateDiffsProcessed[hash] = count + 1

If !ok, count will have the zero value of int: 0 anyway. I assume this is run very frequently, would save some computation.
https://play.golang.org/p/EAba1v8NRcw

This is equivalent: ```go count, _ := bc.stateDiffsProcessed[hash] bc.stateDiffsProcessed[hash] = count + 1 ``` If `!ok`, `count` will have the zero value of int: `0` anyway. I assume this is run very frequently, would save some computation. https://play.golang.org/p/EAba1v8NRcw
@ -1050,1 +1070,4 @@
// since we need the state tries of the current block and its parent in-memory
// in order to process statediffs, we should avoid dereferencing roots until
// its statediff and its child have been processed
m0ar (Migrated from github.com) commented 2019-02-19 12:42:42 +00:00

Why? 🙃 Is it to prevent branches from GC/pruning? Worth the explicit information, that was hard to figure out and harder to guess :)

Why? :upside_down_face: Is it to prevent branches from GC/pruning? Worth the explicit information, that was hard to figure out and harder to guess :)
m0ar (Migrated from github.com) commented 2019-02-19 13:02:31 +00:00

Does this mean we clean up the map when we're finished with a certain diff? Was just about to ask about memory leaks here, but probably okay if that's the case!

Does this mean we clean up the map when we're finished with a certain diff? Was just about to ask about memory leaks here, but probably okay if that's the case!
m0ar (Migrated from github.com) commented 2019-02-19 12:53:19 +00:00

Feels like this constant could be ejected and set somewhere else so we don't have to assign it to a local closure all the time.

Feels like this constant could be ejected and set somewhere else so we don't have to assign it to a local closure all the time.
@ -1486,0 +1503,4 @@
blocks := makeBlockChain(genesis, numberOfBlocks+1, engine, db, canonicalSeed)
_, err := blockchain.InsertChain(blocks)
if err != nil {
t.Fatalf("failed to create pristine chain: %v", err)
m0ar (Migrated from github.com) commented 2019-02-19 12:53:45 +00:00

pristine

:sparkles: _pristine_ :sparkles:
@ -1486,0 +1563,4 @@
}
}
return false
}
m0ar (Migrated from github.com) commented 2019-02-19 13:04:59 +00:00

This diff shows how we should write our tests instead of that damn Ginkgo! Looks so clean, and probably runs way faster. 🐌 Why are we even using gomega assertions when we have good 'ol bool

Could probably be split up in smaller tests, but I figure the setup is a bit annoying.

This diff shows how we should write our tests instead of that damn Ginkgo! Looks so clean, and probably runs way faster. :snail: Why are we even using `gomega` assertions when we have good 'ol `bool` Could probably be split up in smaller tests, but I figure the setup is a bit annoying.
m0ar (Migrated from github.com) commented 2019-02-19 13:06:54 +00:00

😂 🍴

:joy: :fork_and_knife:
m0ar (Migrated from github.com) commented 2019-02-19 13:14:34 +00:00

Minor but this is gonna log weird no?
Error creating tried for newStateRooterrorwhatever error got returned

Minor but this is gonna log weird no? `Error creating tried for newStateRooterrorwhatever error got returned`
m0ar (Migrated from github.com) commented 2019-02-19 13:17:16 +00:00

Dang nvm after checking the docs, that key-value logger is sweet.

Dang nvm after checking the docs, that key-value logger is sweet.
@ -179,2 +175,3 @@
deletedAcc := deletions[common.HexToAddress(val)]
createdAcc := creations[common.HexToHash(val)]
deletedAcc := deletions[common.HexToHash(val)]
oldSR := deletedAcc.Root
m0ar (Migrated from github.com) commented 2019-02-19 13:19:30 +00:00

Why spend efforts duplicating the data when it's just (at a glance?) used to do a map indexing at 134?

Why spend efforts duplicating the data when it's just (at a glance?) used to do a map indexing at 134?
elizabethengelman (Migrated from github.com) reviewed 2019-02-20 15:48:04 +00:00
@ -1050,1 +1070,4 @@
// since we need the state tries of the current block and its parent in-memory
// in order to process statediffs, we should avoid dereferencing roots until
// its statediff and its child have been processed
elizabethengelman (Migrated from github.com) commented 2019-02-20 15:48:04 +00:00

Totally makes sense! Is this more clear?

// since we need the state tries of the current block and its parent in-memory
// in order to process statediffs, we should avoid dereferencing roots until
// its statediff and its child have been processed

Totally makes sense! Is this more clear? > // since we need the state tries of the current block and its parent in-memory // in order to process statediffs, we should avoid dereferencing roots until // its statediff and its child have been processed
elizabethengelman (Migrated from github.com) reviewed 2019-02-20 15:55:33 +00:00
elizabethengelman (Migrated from github.com) commented 2019-02-20 15:55:33 +00:00

Nice catch!

Nice catch!
elizabethengelman (Migrated from github.com) reviewed 2019-02-20 15:57:36 +00:00
elizabethengelman (Migrated from github.com) commented 2019-02-20 15:57:36 +00:00

Yep, if a root is okay to be dereferenced, we removed it from the stateDiffsProcessed collection just to make sure that that collection doesn't get huge. And then it will fall through to triedb.Dereference(root.(common.Hash)) which removes it from the in-memory trie database.

Yep, if a root is okay to be dereferenced, we removed it from the stateDiffsProcessed collection just to make sure that that collection doesn't get huge. And then it will fall through to `triedb.Dereference(root.(common.Hash))` which removes it from the in-memory trie database.
elizabethengelman (Migrated from github.com) reviewed 2019-02-20 16:03:50 +00:00
@ -179,2 +175,3 @@
deletedAcc := deletions[common.HexToAddress(val)]
createdAcc := creations[common.HexToHash(val)]
deletedAcc := deletions[common.HexToHash(val)]
oldSR := deletedAcc.Root
elizabethengelman (Migrated from github.com) commented 2019-02-20 16:03:50 +00:00

The LeafKey comment says that, "callers must not retain references to the value after calling Next" - so making a copy of it may be overkill, since I don't think we need to reference the leaf key after we convert it to a hash (on line 123).

The `LeafKey` comment says that, "callers must not retain references to the value after calling Next" - so making a copy of it may be overkill, since I don't think we need to reference the leaf key after we convert it to a hash (on line 123).
m0ar (Migrated from github.com) reviewed 2019-02-21 10:53:04 +00:00
@ -1050,1 +1070,4 @@
// since we need the state tries of the current block and its parent in-memory
// in order to process statediffs, we should avoid dereferencing roots until
// its statediff and its child have been processed
m0ar (Migrated from github.com) commented 2019-02-21 10:53:04 +00:00

@elizabethengelman Stellarly so! 🌟

@elizabethengelman Stellarly so! :star2:
i-norden reviewed 2019-02-21 19:06:44 +00:00
Member

I would also greatly appreciate a walk through but I was so slow to get back to this that you may have already done one! I don't think this would be what is causing the problem in the service loop, but while looking I noticed the break here and I've got myself confused about named breaks again. I'm under the impression this breaks out of the for loop entirely, not just the current select-case, so it not only skips the current block in the channel but stops listening altogether?

I would also greatly appreciate a walk through but I was so slow to get back to this that you may have already done one! I don't think this would be what is causing the problem in the service loop, but while looking I noticed the break [here](https://github.com/vulcanize/go-ethereum/blob/statediff-for-archive-node/statediff/service/service.go#L89) and I've got myself confused about named breaks again. I'm under the impression this breaks out of the `for` loop entirely, not just the current select-case, so it not only skips the current block in the channel but stops listening altogether?
elizabethengelman (Migrated from github.com) reviewed 2019-02-21 20:18:28 +00:00
elizabethengelman (Migrated from github.com) commented 2019-02-21 20:18:28 +00:00

Just realizing that I forgot to mention that named breaks during the brief walk through. My impression is the same as yours @i-norden, where it would break out of the for loop entirely. My thought for including it in the break that you mentioned was that if a block didn't have a parent, then something wrong, and we should quit the process.

Just realizing that I forgot to mention that named breaks during the brief walk through. My impression is the same as yours @i-norden, where it would break out of the for loop entirely. My thought for including it in the [break that you mentioned](https://github.com/vulcanize/go-ethereum/blob/statediff-for-archive-node/statediff/service/service.go#L89) was that if a block didn't have a parent, then something wrong, and we should quit the process.
i-norden reviewed 2019-02-21 22:24:05 +00:00
Member

That makes sense and I agree we should quit the process. This is really nitpicky but the error kind of makes it sound like we are only skipping that block but otherwise continuing.

That makes sense and I agree we should quit the process. This is really nitpicky but the [error](https://github.com/vulcanize/go-ethereum/blob/272d2f7ef82b8b3729f05e0179e27b01bde30fd3/statediff/service/service.go#L87) kind of makes it sound like we are only skipping that block but otherwise continuing.
elizabethengelman (Migrated from github.com) reviewed 2019-02-22 15:46:52 +00:00
elizabethengelman (Migrated from github.com) commented 2019-02-22 15:46:52 +00:00

That's a great point. I missed this before I merged it into the statediffing branch 🤦🏻‍♀️, but I'll make a note on that PR so that this comment is addressed before we open a PR upstream!

That's a great point. I missed this before I merged it into the `statediffing` branch 🤦🏻‍♀️, but I'll make a note on that PR so that this comment is addressed before we open a PR upstream!
Sign in to join this conversation.
No Milestone
No project
No Assignees
2 Participants
Notifications
Due Date
The due date is invalid or out of range. Please use the format 'yyyy-mm-dd'.

No due date set.

Dependencies

No dependencies set.

Reference: cerc-io/go-ethereum#6
No description provided.