Logs stored on disk have minimal information. Contextual information such as block
number, index of log in block, index of transaction in block are filled in upon request.
We can fill in all these fields only having the block header and list of receipts.
But determining the transaction hash of a log requires the block body.
The goal of this PR is postponing this retrieval until we are sure we the transaction hash.
It happens often that the header bloom filter signals there might be matches in a block,
but after actually checking them reveals the logs do not match. We want to avoid fetching
the body in this case.
Note that this changes the semantics of Backend.GetLogs. Downstream callers of
GetLogs now assume log context fields have not been derived, and need to call
DeriveFields on the logs if necessary.
* common, core, eth, les, trie: make prque generic
* les/vflux/server: fixed issues in priorityPool
* common, core, eth, les, trie: make priority also generic in prque
* les/flowcontrol: add test case for priority accumulator overflow
* les/flowcontrol: avoid priority value overflow
* common/prque: use int priority in some tests
No need to convert to int64 when we can just change the type used by the
queue.
* common/prque: remove comment about int64 range
---------
Co-authored-by: Zsolt Felfoldi <zsfelfoldi@gmail.com>
Co-authored-by: Felix Lange <fjl@twurst.com>
This PR moves some trie-related db accessor methods to a different file, and also removes the schema type. Instead of the schema type, a string is used to distinguish between hashbased/pathbased db accessors.
This also moves some code from trie package to rawdb package.
This PR is intended to be a no-functionality-change prep PR for #25963 .
---------
Co-authored-by: Gary Rong <garyrong0905@gmail.com>
This change implements withdrawals as specified in EIP-4895.
Co-authored-by: lightclient@protonmail.com <lightclient@protonmail.com>
Co-authored-by: marioevz <marioevz@gmail.com>
Co-authored-by: Martin Holst Swende <martin@swende.se>
Co-authored-by: Felix Lange <fjl@twurst.com>
This PR does a few things.
It fixes a shutdown-order flaw in the chainfreezer. Previously, the chain-freezer would shutdown the freezer backend first, and then signal for the loop to exit. This can lead to a scenario where the freezer tries to fsync closed files, which is an error-conditon that could lead to exit via log.Crit.
It also makes the printout more detailed when truncating 'dangling' items, by showing the exact number instead of approximate MB.
This PR also adds calls to fsync files before closing them, and also makes the `db inspect` command slightly more robust.
This PR fixes an issue which might result in data lost in freezer.
Whenever mutation happens in freezer, all data will be written into head data file
and it will be rotated with a new one in case the size of file reaches the threshold.
Theoretically, the rotated old data file should be fsync'd to prevent data loss.
In freezer.Sync function, we only fsync: (1) index file (2) meta file and (3) head
data file. So this PR forcibly fsync the head data file if mutation happens in the
boundary of data file.
This PR implements resettable freezer by adding a ResettableFreezer wrapper.
The resettable freezer wraps the original freezer in a way that makes it possible to ensure atomic resets. Implementation wise, it relies on the os.Rename and os.RemoveAll to atomically delete the original freezer data and re-create a new one from scratch.
This PR drops the legacy receipt types, the freezer-migrate command and the startup check. The previous attempt #22852 at this failed because there were users who still had legacy receipts in their db, so it had to be reverted #23247. Since then we added a command to migrate legacy dbs #24028.
As of the last hardforks all users either must have done the migration, or used the --ignore-legacy-receipts flag which will stop working now.
While investigating #22374, I noticed that the Sync operation of the
freezer does not take the table lock. It also doesn't call sync for all files
if there is an error with one of them. I doubt this will fix anything, but
didn't want to drop the fix on the floor either.
This PR ports a few changes from PBSS:
- Fix the snapshot generator waiter in case the generation is not even initialized
- Refactor db inspector for ancient store
This PR reworks tx indexer a bit. Compared to the original version, one scenario is no longer handled - upgrading from legacy geth without indexer support.
The tx indexer was introduced in 2020 and have been present through hardforks, so it can be assumed that all Geth nodes have tx indexer already. So we can simplify the tx indexer logic a bit:
- If the tail flag is not present, it means node is just initialized may or may not with an ancient store attached. In this case all blocks are regarded as unindexed
- If the tail flag is present, it means blocks below tail are unindexed, blocks above tail are indexed
This change also address some weird cornercases that could make the indexer not work after a crash.
core/blockchain: downgrade tx indexing and unindexing logs from info to debug
If a user has a finite tx lookup limit, they will see an "unindexing" info level log each time a block is imported. This information might help a user understand that they are removing the index each block and some txs may not be retrievable by hash, but overall it is generally more of a nuisance than a benefit. This change downgrades the log to a debug log.
This changes the CI / release builds to use the latest Go version. It also
upgrades golangci-lint to a newer version compatible with Go 1.19.
In Go 1.19, godoc has gained official support for links and lists. The
syntax for code blocks in doc comments has changed and now requires a
leading tab character. gofmt adapts comments to the new syntax
automatically, so there are a lot of comment re-formatting changes in this
PR. We need to apply the new format in order to pass the CI lint stage with
Go 1.19.
With the linter upgrade, I have decided to disable 'gosec' - it produces
too many false-positive warnings. The 'deadcode' and 'varcheck' linters
have also been removed because golangci-lint warns about them being
unmaintained. 'unused' provides similar coverage and we already have it
enabled, so we don't lose much with this change.
This enables the following linters
- typecheck
- unused
- staticcheck
- bidichk
- durationcheck
- exportloopref
- gosec
WIth a few exceptions.
- We use a deprecated protobuf in trezor. I didn't want to mess with that, since I cannot meaningfully test any changes there.
- The deprecated TypeMux is used in a few places still, so the warning for it is silenced for now.
- Using string type in context.WithValue is apparently wrong, one should use a custom type, to prevent collisions between different places in the hierarchy of callers. That should be fixed at some point, but may require some attention.
- The warnings for using weak random generator are squashed, since we use a lot of random without need for cryptographic guarantees.
Previously freezer has only been used for storing ancient chain data, while obviously it can be used more. This PR unties the chain data and freezer, keep the minimal freezer structure and move all other logic (like incrementally freezing block data) into a separate structure called ChainFreezer.
This PR also extends the database interface by adding a new ancient store function AncientDatadir which can return the root directory of ancient store. The ancient root directory can be used when we want to open some other ancient-stores (e.g. reverse diff freezer).
This commit replaces ioutil.TempDir with t.TempDir in tests. The
directory created by t.TempDir is automatically removed when the test
and all its subtests complete.
Prior to this commit, temporary directory created using ioutil.TempDir
had to be removed manually by calling os.RemoveAll, which is omitted in
some tests. The error handling boilerplate e.g.
defer func() {
if err := os.RemoveAll(dir); err != nil {
t.Fatal(err)
}
}
is also tedious, but t.TempDir handles this for us nicely.
Reference: https://pkg.go.dev/testing#T.TempDir
Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>
* cmd,core: add simple legacy receipt converter
core/rawdb: use forEach in migrate
core/rawdb: batch reads in forEach
core/rawdb: make forEach anonymous fn
cmd/geth: check for legacy receipts on node startup
fix err msg
Co-authored-by: rjl493456442 <garyrong0905@gmail.com>
fix log
Co-authored-by: rjl493456442 <garyrong0905@gmail.com>
fix some review comments
add warning to cmd
drop isLegacy fn from migrateTable params
add test for windows rename
test replacing in windows case
* minor fix
* sanity check for tail-deletion
* add log before moving files around
* speed-up hack for mainnet
* fix mainnet check, use networkid instead
* check mainnet genesis
* review fixes
* resume previous migration attempt
* core/rawdb: lint fix
Co-authored-by: Martin Holst Swende <martin@swende.se>
* eth/downloader: implement beacon sync
* eth/downloader: fix a crash if the beacon chain is reduced in length
* eth/downloader: fix beacon sync start/stop thrashing data race
* eth/downloader: use a non-nil pivot even in degenerate sync requests
* eth/downloader: don't touch internal state on beacon Head retrieval
* eth/downloader: fix spelling mistakes
* eth/downloader: fix some typos
* eth: integrate legacy/beacon sync switchover and UX
* eth: handle UX wise being stuck on post-merge TTD
* core, eth: integrate the beacon client with the beacon sync
* eth/catalyst: make some warning messages nicer
* eth/downloader: remove Ethereum 1&2 notions in favor of merge
* core/beacon, eth: clean up engine API returns a bit
* eth/downloader: add skeleton extension tests
* eth/catalyst: keep non-kiln spec, handle mining on ttd
* eth/downloader: add beacon header retrieval tests
* eth: fixed spelling, commented failing tests out
* eth/downloader: review fixes
* eth/downloader: drop peers failing to deliver beacon headers
* core/rawdb: track beacon sync data in db inspect
* eth: fix review concerns
* internal/web3ext: nit
Co-authored-by: Marius van der Wijden <m.vanderwijden@live.de>
* core/rawdb, cmd, ethdb, eth: implement freezer tail deletion
* core/rawdb: address comments from martin and sina
* core/rawdb: fixes cornercase in tail deletion
* core/rawdb: separate metadata into a standalone file
* core/rawdb: remove unused code
* core/rawdb: add random test
* core/rawdb: polish code
* core/rawdb: fsync meta file before manipulating the index
* core/rawdb: fix typo
* core/rawdb: address comments
This PR adds an addtional API called `NewBatchWithSize` for db
batcher. It turns out that leveldb batch memory allocation is
super inefficient. The main reason is the allocation step of
leveldb Batch is too small when the batch size is large. It can
take a few second to build a leveldb batch with 100MB size.
Luckily, leveldb also offers another API called MakeBatch which can
pre-allocate the memory area. So if the approximate size of batch is
known in advance, this API can be used in this case.
It's needed in new state scheme PR which needs to commit a batch of
trie nodes in a single batch. Implement the feature in a seperate PR.
* freezer: add readonly flag to table
* freezer: enforce readonly in table repair
* freezer: enforce readonly in newFreezer
* minor fix
* minor
* core/rawdb: test that writing during readonly fails
* rm unused log
* check readonly on batch append
* minor
* Revert "check readonly on batch append"
This reverts commit 2ddb5ec4ba7534bf6edbdfec158ea99a2eed5036.
* review fixes
* minor test refactor
* attempt at fixing windows issue
* add comment re windows sync issue
* k->kind
* open readonly db for genesis check
Co-authored-by: Martin Holst Swende <martin@swende.se>
The loop variable changes before the defer executes, so we need to
pass it to the function to make sure the closure has the correct
version of the loop variable.
It turns out we were only notifying plugins of freezer commits and
cleaning up our record when the block being processed was greater
than MAX_UINT64.
So, uh, never.
This PR reduces the amount of work we do when answering header queries, e.g. when a peer
is syncing from us.
For some items, e.g block bodies, when we read the rlp-data from database, we plug it
directly into the response package. We didn't do that for headers, but instead read
headers-rlp, decode to types.Header, and re-encode to rlp. This PR changes that to keep it
in RLP-form as much as possible. When a node is syncing from us, it typically requests 192
contiguous headers. On master it has the following effect:
- For headers not in ancient: 2 db lookups. One for translating hash->number (even though
the request is by number), and another for reading by hash (this latter one is sometimes
cached).
- For headers in ancient: 1 file lookup/syscall for translating hash->number (even though
the request is by number), and another for reading the header itself. After this, it
also performes a hashing of the header, to ensure that the hash is what it expected. In
this PR, I instead move the logic for "give me a sequence of blocks" into the lower
layers, where the database can determine how and what to read from leveldb and/or
ancients.
There are basically four types of requests; three of them are improved this way. The
fourth, by hash going backwards, is more tricky to optimize. However, since we know that
the gap is 0, we can look up by the parentHash, and stlil shave off all the number->hash
lookups.
The gapped collection can be optimized similarly, as a follow-up, at least in three out of
four cases.
Co-authored-by: Felix Lange <fjl@twurst.com>
This PR fixes a special corner case in transaction indexing.
When the chain is rewound by SetHead to a historical point which is even lower than the transaction indexes tail, then system will report Failed to decode block body error all the time, because the relevant blocks are already deleted.
In order to avoid this "non-critical-but-annoying" issue, we can recap the indexing target to head+1(to is excluded, so it means indexing transactions from 0 to head).
* all: work for eth1/2 transtition
* consensus/beacon, eth: change beacon difficulty to 0
* eth: updates
* all: add terminalBlockDifficulty config, fix rebasing issues
* eth: implemented merge interop spec
* internal/ethapi: update to v1.0.0.alpha.2
This commit updates the code to the new spec, moving payloadId into
it's own object. It also fixes an issue with finalizing an empty blockhash.
It also properly sets the basefee
* all: sync polishes, other fixes + refactors
* core, eth: correct semantics for LeavePoW, EnterPoS
* core: fixed rebasing artifacts
* core: light: performance improvements
* core: use keyed field (f)
* core: eth: fix compilation issues + tests
* eth/catalyst: dbetter error codes
* all: move Merger to consensus/, remove reliance on it in bc
* all: renamed EnterPoS and LeavePoW to ReachTDD and FinalizePoS
* core: make mergelogs a function
* core: use InsertChain instead of InsertBlock
* les: drop merger from lightchain object
* consensus: add merger
* core: recoverAncestors in catalyst mode
* core: fix nitpick
* all: removed merger from beacon, use TTD, nitpicks
* consensus: eth: add docstring, removed unnecessary code duplication
* consensus/beacon: better comment
* all: easy to fix nitpicks by karalabe
* consensus/beacon: verify known headers to be sure
* core: comments
* core: eth: don't drop peers who advertise blocks, nitpicks
* core: never add beacon blocks to the future queue
* core: fixed nitpicks
* consensus/beacon: simplify IsTTDReached check
* consensus/beacon: correct IsTTDReached check
Co-authored-by: rjl493456442 <garyrong0905@gmail.com>
Co-authored-by: Péter Szilágyi <peterke@gmail.com>
This PR offers two more database sub commands for exporting and importing data.
Two exporters are implemented: preimage and snapshot data respectively.
The import command is generic, it can take any data export and import into leveldb.
The data format has a 'magic' for disambiguation, and a version field for future compatibility.
We had been assuming that the `item` returned from batch.commit()
was the item committed, but it's actually the next item to be added
to the freezer, and multiple items can be committed in a single batch.
This commit finds the smallest item in the freezer and iterates from
that to the number returned by commit(), passing any tracked blocks
in that range to plugins.
This PR adds a new accessor method to the freezer database. This new view offers a consistent interface, guaranteeing that all individual tables (headers, bodies etc) are all on the same number, and that this number is not changes (added/truncated) while the operation is performing.
This commit adds a ModifyAncients hook that plugins can implement
to more accurately track what Geth is doing under the hood. We
still support the old AppendAncients interface as best we can,
though internal changes may make it so that it does not behave
as it once did.
Notes: the AppendAncient plugin hook is broken by this commit.
This adds CaptureEnter() and CaptureExit() as no-ops for interface
compliance, but these capabilities should be added for plugin tracers
soon.
* core/types: rm extranous check in test
* core/rawdb: add lightweight types for block logs
* core/rawdb,eth: use lightweight accessor for log filtering
* core/rawdb: add bench for decoding into rlpLogs
This change is a rewrite of the freezer code.
When writing ancient chain data to the freezer, the previous version first encoded each
individual item to a temporary buffer, then wrote the buffer. For small item sizes (for
example, in the block hash freezer table), this strategy causes a lot of system calls for
writing tiny chunks of data. It also allocated a lot of temporary []byte buffers.
In the new version, we instead encode multiple items into a re-useable batch buffer, which
is then written to the file all at once. This avoids performing a system call for every
inserted item.
To make the internal batching work, the ancient database API had to be changed. While
integrating this new API in BlockChain.InsertReceiptChain, additional optimizations were
also added there.
Co-authored-by: Felix Lange <fjl@twurst.com>
* core/rawdb: implement sequential reads in freezer_table
* core/rawdb, ethdb: add sequential reader to db interface
* core/rawdb: lint nitpicks
* core/rawdb: fix some nitpicks
* core/rawdb: fix flaw with deferred reads not being performed
* core/rawdb: better documentation
* eth/protocols/snap: generate storage trie from full dirty snap data
* eth/protocols/snap: get rid of some more dead code
* eth/protocols/snap: less frequent logs, also log during trie generation
* eth/protocols/snap: implement dirty account range stack-hashing
* eth/protocols/snap: don't loop on account trie generation
* eth/protocols/snap: fix account format in trie
* core, eth, ethdb: glue snap packets together, but not chunks
* eth/protocols/snap: print completion log for snap phase
* eth/protocols/snap: extended tests
* eth/protocols/snap: make testcase pass
* eth/protocols/snap: fix account stacktrie commit without defer
* ethdb: fix key counts on reset
* eth/protocols: fix typos
* eth/protocols/snap: make better use of delivered data (#44)
* eth/protocols/snap: make better use of delivered data
* squashme
* eth/protocols/snap: reduce chunking
* squashme
* eth/protocols/snap: reduce chunking further
* eth/protocols/snap: break out hash range calculations
* eth/protocols/snap: use sort.Search instead of looping
* eth/protocols/snap: prevent crash on storage response with no keys
* eth/protocols/snap: nitpicks all around
* eth/protocols/snap: clear heal need on 1-chunk storage completion
* eth/protocols/snap: fix range chunker, add tests
Co-authored-by: Péter Szilágyi <peterke@gmail.com>
* trie: fix test API error
* eth/protocols/snap: fix some further liter issues
* eth/protocols/snap: fix accidental batch reuse
Co-authored-by: Martin Holst Swende <martin@swende.se>
The Append / truncate operations were racy. When a datafile reaches 2Gb, a new file is needed. For this operation, we require a writelock, which is not needed in the 99.99% of all cases where the data does fit in the current head-file.
This transition from readlock to writelock was incorrect, and as the readlock was released, a truncate operation could slip in between, and truncate the data. This would have been fine, however, the Append operation continued writing as if no truncation had occurred, e.g writing item 5 where item 0 should reside.
This PR changes the behaviour, so that if when we run into the situation that a new file is needed, it aborts, and retries, this time with a writelock.
The outcome of the situation described above, running on this PR, would instead be that the Append operation exits with a failure.