* all: implement path-based state scheme
* all: edits from review
* core/rawdb, trie/triedb/pathdb: review changes
* core, light, trie, eth, tests: reimplement pbss history
* core, trie/triedb/pathdb: track block number in state history
* trie/triedb/pathdb: add history documentation
* core, trie/triedb/pathdb: address comments from Peter's review
Important changes to list:
- Cache trie nodes by path in clean cache
- Remove root->id mappings when history is truncated
* trie/triedb/pathdb: fallback to disk if unexpect node in clean cache
* core/rawdb: fix tests
* trie/triedb/pathdb: rename metrics, change clean cache key
* trie/triedb: manage the clean cache inside of disk layer
* trie/triedb/pathdb: move journal function
* trie/triedb/path: fix tests
* trie/triedb/pathdb: fix journal
* trie/triedb/pathdb: fix history
* trie/triedb/pathdb: try to fix tests on windows
* core, trie: address comments
* trie/triedb/pathdb: fix test issues
---------
Co-authored-by: Felix Lange <fjl@twurst.com>
Co-authored-by: Martin Holst Swende <martin@swende.se>
The state availability is checked during the creation of a state reader.
- In hash-based database, if the specified root node does not exist on disk disk, then
the state reader won't be created and an error will be returned.
- In path-based database, if the specified state layer is not available, then the
state reader won't be created and an error will be returned.
This change also contains a stricter semantics regarding the `Commit` operation: once it has been performed, the trie is no longer usable, and certain operations will return an error.
* trie: add node type common package
In trie/types package, a few node wrappers are defined, which will be used
in both trie package, trie/snap package, etc. Therefore, a standalone common
package is created to put these stuffs.
* trie: rename trie/types to trie/trienode
In this PR, all TryXXX(e.g. TryGet) APIs of trie are renamed to XXX(e.g. Get) with an error returned.
The original XXX(e.g. Get) APIs are renamed to MustXXX(e.g. MustGet) and does not return any error -- they print a log output. A future PR will change the behaviour to panic on errorrs.
The EmptyRootHash and EmptyCodeHash are defined everywhere in the codebase, this PR replaces all of them with unified one defined in core/types package, and also defines constants for TxRoot, WithdrawalsRoot and UncleRoot
This PR contains a small portion of the full pbss PR, namely
Remove the tracer from trie (and comitter), and instead using an accessList.
Related changes to the Nodeset.
---------
Co-authored-by: Gary Rong <garyrong0905@gmail.com>
This PR fixes an error in trie commit. If the trie.root is nil, it can be two possible scenarios:
- The trie was empty, and no change happens
- The trie was non-empty and all nodes are dropped
For the latter one, we should collect the deletions and apply them into database(e.g. in PBSS).
This PR introduces a node scheme abstraction. The interface is only implemented by `hashScheme` at the moment, but will be extended by `pathScheme` very soon.
Apart from that, a few changes are also included which is worth mentioning:
- port the changes in the stacktrie, tracking the path prefix of nodes during commit
- use ethdb.Database for constructing trie.Database. This is not necessary right now, but it is required for path-based used to open reverse diff freezer
This PR includes minor updates to comments in trie/committer that reference insertion to the db, and adds an err != nil check for the return value of preimages.commit.
* core: use TryGetAccount to read where TryUpdateAccount has been used to write
* Gary's review feedback
* implement Gary's suggestion
* fix bug + rename NewSecure into NewStateTrie
* trie: add backwards-compatibility aliases for SecureTrie
* Update database.go
* make the linter happy
Co-authored-by: Felix Lange <fjl@twurst.com>
Co-authored-by: rjl493456442 <garyrong0905@gmail.com>
Trie tracer is an auxiliary tool to capture all deleted nodes
which can't be captured by trie.Committer. The deleted nodes
can be removed from the disk later.
This functionality is needed in new path-based storage scheme, but
can be implemented in a seperate PR though.
When an account is deleted, then all the storage slots should be
nuked out from the disk as well. In hash-based storage scheme they
are still left in the disk but in new scheme, they will be iterated
and marked as deleted.
But why the NodeBlob API is needed in this scenario? Because when
the node is marked deleted, the previous value is also required to
be recorded to construct the reverse diff.
When deleting in fullNode, and the new child node nn is not nil, there is no need
to check the number of non-nil entries in the node. This is because the fullNode
must've contained at least two children before deletion, so there must be another
child node other than nn.
Co-authored-by: Felix Lange <fjl@twurst.com>
This commit splits the eth package, separating the handling of eth and snap protocols. It also includes the capability to run snap sync (https://github.com/ethereum/devp2p/blob/master/caps/snap.md) , but does not enable it by default.
Co-authored-by: Marius van der Wijden <m.vanderwijden@live.de>
Co-authored-by: Martin Holst Swende <martin@swende.se>
* trie: update tests to check commit integrity
* trie: polish committer
* trie: fix typo
* trie: remove hasvalue notion
According to the benchmarks, type assertion between the pointer and
interface is extremely fast.
BenchmarkIntmethod-12 1000000000 1.91 ns/op
BenchmarkInterface-12 1000000000 2.13 ns/op
BenchmarkTypeSwitch-12 1000000000 1.81 ns/op
BenchmarkTypeAssertion-12 2000000000 1.78 ns/op
So the overhead for asserting whether the shortnode has "valuenode"
child is super tiny. No necessary to have another field.
* trie: linter nitpicks
Co-authored-by: Martin Holst Swende <martin@swende.se>
The current trie memory database/cache that we do pruning on stores
trie nodes as binary rlp encoded blobs, and also stores the node
relationships/references for GC purposes. However, most of the trie
nodes (everything apart from a value node) is in essence just a
collection of references.
This PR switches out the RLP encoded trie blobs with the
collapsed-but-not-serialized trie nodes. This permits most of the
references to be recovered from within the node data structure,
avoiding the need to track them a second time (expensive memory wise).
* trie: make fullnode children hash calculation concurrently
* trie: thread out only on topmost fullnode
* trie: clean up full node children hash calculation
* trie: minor code fixups
* ethdb: add Putter interface and Has method
* ethdb: improve docs and add IdealBatchSize
* ethdb: remove memory batch lock
Batches are not safe for concurrent use.
* core: use ethdb.Putter for Write* functions
This covers the easy cases.
* core/state: simplify StateSync
* trie: optimize local node check
* ethdb: add ValueSize to Batch
* core: optimize HasHeader check
This avoids one random database read get the block number. For many uses
of HasHeader, the expectation is that it's actually there. Using Has
avoids a load + decode of the value.
* core: write fast sync block data in batches
Collect writes into batches up to the ideal size instead of issuing many
small, concurrent writes.
* eth/downloader: commit larger state batches
Collect nodes into a batch up to the ideal size instead of committing
whenever a node is received.
* core: optimize HasBlock check
This avoids a random database read to get the number.
* core: use numberCache in HasHeader
numberCache has higher capacity, increasing the odds of finding the
header without a database lookup.
* core: write imported block data using a batch
Restore batch writes of state and add blocks, tx entries, receipts to
the same batch. The change also simplifies the miner.
This commit also removes posting of logs when a forked block is imported.
* core: fix DB write error handling
* ethdb: use RLock for Has
* core: fix HasBlock comment
* ethdb: remove Set
Set deadlocks immediately and isn't part of the Database interface.
* trie: add Err to Iterator
This is useful for testing because the underlying NodeIterator doesn't
need to be kept in a separate variable just to get the error.
* trie: add LeafKey to iterator, panic when not at leaf
LeafKey is useful for callers that can't interpret Path.
* trie: retry failed seek/peek in iterator Next
Instead of failing iteration irrecoverably, make it so Next retries the
pending seek or peek every time.
Smaller changes in this commit make this easier to test:
* The iterator previously returned from Next on encountering a hash
node. This caused it to visit the same path twice.
* Path returned nibbles with terminator symbol for valueNode attached
to fullNode, but removed it for valueNode attached to shortNode. Now
the terminator is always present. This makes Path unique to each node
and simplifies Leaf.
* trie: add Path to MissingNodeError
The light client trie iterator needs to know the path of the node that's
missing so it can retrieve a proof for it. NodeIterator.Path is not
sufficient because it is updated when the node is resolved and actually
visited by the iterator.
Also remove unused fields. They were added a long time ago before we
knew which fields would be needed for the light client.
The 'step' method is split into two parts, 'peek' and 'push'. peek
returns the next state but doesn't make it current.
The end of iteration was previously tracked by setting 'trie' to nil.
End of iteration is now tracked using the 'iteratorEnd' error, which is
slightly cleaner and requires less code.