This fixes a database corruption issue that could occur during state healing.
When sync is aborted while certain modifications were already committed, and a
reorg occurs, the database would contain incorrect trie nodes stored by path.
These nodes need to detected/deleted in order to obtain a complete and fully correct state
after state healing.
---------
Co-authored-by: Felix Lange <fjl@twurst.com>
* trie: use pooling of iterator states in iterator
The node iterator burns through a lot of memory while iterating a trie, and a lot of
that can be avoided by using a fairly small pool (max 40 items).
name old time/op new time/op delta
Iterator-8 6.22ms ± 3% 5.40ms ± 6% -13.18% (p=0.008 n=5+5)
name old alloc/op new alloc/op delta
Iterator-8 2.36MB ± 0% 1.67MB ± 0% -29.23% (p=0.008 n=5+5)
name old allocs/op new allocs/op delta
Iterator-8 37.0k ± 0% 29.8k ± 0% ~ (p=0.079 n=4+5)
* ethdb/memorydb: avoid one copying of key
By making the transformation from []byte to string at an earlier point,
we save an allocation which otherwise happens later on.
name old time/op new time/op delta
BatchAllocs-8 412µs ± 6% 382µs ± 2% -7.18% (p=0.016 n=5+4)
name old alloc/op new alloc/op delta
BatchAllocs-8 480kB ± 0% 490kB ± 0% +1.93% (p=0.008 n=5+5)
name old allocs/op new allocs/op delta
BatchAllocs-8 3.03k ± 0% 2.03k ± 0% -32.98% (p=0.008 n=5+5)
cockroachdb/pebble@422dce9 added Errorf to the Logger interface, this change makes it possible to compile geth with that version of pebble by adding the corresponding method to panicLogger.
The Go authors updated golang/x/ext to change the function signature of the slices sort method.
It's an entire shitshow now because x/ext is not tagged, so everyone's codebase just
picked a new version that some other dep depends on, causing our code to fail building.
This PR updates the dep on our code too and does all the refactorings to follow upstream...
* all: implement path-based state scheme
* all: edits from review
* core/rawdb, trie/triedb/pathdb: review changes
* core, light, trie, eth, tests: reimplement pbss history
* core, trie/triedb/pathdb: track block number in state history
* trie/triedb/pathdb: add history documentation
* core, trie/triedb/pathdb: address comments from Peter's review
Important changes to list:
- Cache trie nodes by path in clean cache
- Remove root->id mappings when history is truncated
* trie/triedb/pathdb: fallback to disk if unexpect node in clean cache
* core/rawdb: fix tests
* trie/triedb/pathdb: rename metrics, change clean cache key
* trie/triedb: manage the clean cache inside of disk layer
* trie/triedb/pathdb: move journal function
* trie/triedb/path: fix tests
* trie/triedb/pathdb: fix journal
* trie/triedb/pathdb: fix history
* trie/triedb/pathdb: try to fix tests on windows
* core, trie: address comments
* trie/triedb/pathdb: fix test issues
---------
Co-authored-by: Felix Lange <fjl@twurst.com>
Co-authored-by: Martin Holst Swende <martin@swende.se>
This change adds the ability to perform reads from freezer without size limitation. This can be useful in cases where callers are certain that out-of-memory will not happen (e.g. reading only a few elements).
The previous API was designed to behave both optimally and secure while servicing a request from a peer, whereas this change should _not_ be used when an untrusted peer can influence the query size.
This removes text parsing in leveldb metrics collection code. All metrics
can now be accessed through the stats API provided by leveldb.
We also add new gauge-typed metrics that count the number of tables at each level.
---------
Co-authored-by: Exca-DK <Exca-DK@users.noreply.github.com>
Co-authored-by: Gary Rong <garyrong0905@gmail.com>
Co-authored-by: Felix Lange <fjl@twurst.com>
This is likely the culprit behind several data corruption issues, e.g. where data has been
written to the freezer, but the deletion from pebble does not go through due to process
crash.
One difference between pebble and leveldb is that the latter returns error when performing Get on a closed database, the former does a panic. This may be triggered during shutdown (see #27237)
This PR changes the pebble driver so we check that the db is not closed already, for several operations. It also adds tests to the db test-suite, so the previously implicit assumption of "not panic:ing at ops on closed database" is covered by tests.
* cmd/utils, node: switch to Pebble as the default db if none exists
* node: fall back to LevelDB on platforms not supporting Pebble
* core/rawdb, node: default to Pebble at the node level
* cmd/geth: fix some tests explicitly using leveldb
* ethdb/pebble: allow double closes, makes tests simpler
This PR is a (superior) alternative to https://github.com/ethereum/go-ethereum/pull/26708, it handles deprecation, primarily two specific cases.
`rand.Seed` is typically used in two ways
- `rand.Seed(time.Now().UnixNano())` -- we seed it, just to be sure to get some random, and not always get the same thing on every run. This is not needed, with global seeding, so those are just removed.
- `rand.Seed(1)` this is typically done to ensure we have a stable test. If we rely on this, we need to fix up the tests to use a deterministic prng-source. A few occurrences like this has been replaced with a proper custom source.
`rand.Read` has been replaced by `crypto/rand`.`Read` in this PR.
This PR makes it possible to set custom headers, in particular for two scenarios:
- geth attach
- geth commands which can use --remotedb, e..g geth db inspect
The ability to use custom headers is typically useful for connecting to cloud-apis, e.g. providing an infura- or alchemy key, or for that matter access-keys for environments behind cloudflare.
Co-authored-by: Felix Lange <fjl@twurst.com>
This changes the CI / release builds to use the latest Go version. It also
upgrades golangci-lint to a newer version compatible with Go 1.19.
In Go 1.19, godoc has gained official support for links and lists. The
syntax for code blocks in doc comments has changed and now requires a
leading tab character. gofmt adapts comments to the new syntax
automatically, so there are a lot of comment re-formatting changes in this
PR. We need to apply the new format in order to pass the CI lint stage with
Go 1.19.
With the linter upgrade, I have decided to disable 'gosec' - it produces
too many false-positive warnings. The 'deadcode' and 'varcheck' linters
have also been removed because golangci-lint warns about them being
unmaintained. 'unused' provides similar coverage and we already have it
enabled, so we don't lose much with this change.
* ethdb/remotedb, cmd: add support for remote (readonly) databases
* ethdb/remotedb: minor changes
* ethdb/remotedb: close the conn
* cmd, ethdb: add rpc accessor for ancient data
* internal/ethapi: license
* ethdb/remotedb: linter fixes
Previously freezer has only been used for storing ancient chain data, while obviously it can be used more. This PR unties the chain data and freezer, keep the minimal freezer structure and move all other logic (like incrementally freezing block data) into a separate structure called ChainFreezer.
This PR also extends the database interface by adding a new ancient store function AncientDatadir which can return the root directory of ancient store. The ancient root directory can be used when we want to open some other ancient-stores (e.g. reverse diff freezer).
* cmd,core: add simple legacy receipt converter
core/rawdb: use forEach in migrate
core/rawdb: batch reads in forEach
core/rawdb: make forEach anonymous fn
cmd/geth: check for legacy receipts on node startup
fix err msg
Co-authored-by: rjl493456442 <garyrong0905@gmail.com>
fix log
Co-authored-by: rjl493456442 <garyrong0905@gmail.com>
fix some review comments
add warning to cmd
drop isLegacy fn from migrateTable params
add test for windows rename
test replacing in windows case
* minor fix
* sanity check for tail-deletion
* add log before moving files around
* speed-up hack for mainnet
* fix mainnet check, use networkid instead
* check mainnet genesis
* review fixes
* resume previous migration attempt
* core/rawdb: lint fix
Co-authored-by: Martin Holst Swende <martin@swende.se>
* core/rawdb, cmd, ethdb, eth: implement freezer tail deletion
* core/rawdb: address comments from martin and sina
* core/rawdb: fixes cornercase in tail deletion
* core/rawdb: separate metadata into a standalone file
* core/rawdb: remove unused code
* core/rawdb: add random test
* core/rawdb: polish code
* core/rawdb: fsync meta file before manipulating the index
* core/rawdb: fix typo
* core/rawdb: address comments
This PR adds an addtional API called `NewBatchWithSize` for db
batcher. It turns out that leveldb batch memory allocation is
super inefficient. The main reason is the allocation step of
leveldb Batch is too small when the batch size is large. It can
take a few second to build a leveldb batch with 100MB size.
Luckily, leveldb also offers another API called MakeBatch which can
pre-allocate the memory area. So if the approximate size of batch is
known in advance, this API can be used in this case.
It's needed in new state scheme PR which needs to commit a batch of
trie nodes in a single batch. Implement the feature in a seperate PR.
This PR adds a new accessor method to the freezer database. This new view offers a consistent interface, guaranteeing that all individual tables (headers, bodies etc) are all on the same number, and that this number is not changes (added/truncated) while the operation is performing.
This change is a rewrite of the freezer code.
When writing ancient chain data to the freezer, the previous version first encoded each
individual item to a temporary buffer, then wrote the buffer. For small item sizes (for
example, in the block hash freezer table), this strategy causes a lot of system calls for
writing tiny chunks of data. It also allocated a lot of temporary []byte buffers.
In the new version, we instead encode multiple items into a re-useable batch buffer, which
is then written to the file all at once. This avoids performing a system call for every
inserted item.
To make the internal batching work, the ancient database API had to be changed. While
integrating this new API in BlockChain.InsertReceiptChain, additional optimizations were
also added there.
Co-authored-by: Felix Lange <fjl@twurst.com>