This PR reduces the amount of work we do when answering header queries, e.g. when a peer
is syncing from us.
For some items, e.g. block bodies, when we read the RLP data from the database, we plug it
directly into the response packet. We didn't do that for headers, but instead read the
header RLP, decoded it to types.Header, and re-encoded it to RLP. This PR changes that to keep it
in RLP-form as much as possible. When a node is syncing from us, it typically requests 192
contiguous headers. On master it has the following effect:
- For headers not in ancient: 2 db lookups. One for translating hash->number (even though
the request is by number), and another for reading by hash (this latter one is sometimes
cached).
- For headers in ancient: 1 file lookup/syscall for translating hash->number (even though
the request is by number), and another for reading the header itself. After this, it
also performs a hashing of the header, to ensure that the hash is what it expected. In
this PR, I instead move the logic for "give me a sequence of blocks" into the lower
layers, where the database can determine how and what to read from leveldb and/or
ancients.
There are basically four types of requests; three of them are improved this way. The
fourth, by hash going backwards, is more tricky to optimize. However, since we know that
the gap is 0, we can look up by the parentHash, and still shave off all the number->hash
lookups.
The gapped collection can be optimized similarly, as a follow-up, at least in three out of
four cases.
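
The backwards-by-hash case can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: Hash, Header, headerStore and collectBackwards are hypothetical stand-ins, and the store is a plain map rather than leveldb or the ancient store.

```go
package main

import "fmt"

// Hash and Header are simplified stand-ins for the real types; only the
// fields needed for the walk are modelled.
type Hash string

type Header struct {
	Number     uint64
	ParentHash Hash
	RLP        []byte // already-encoded form, passed through untouched
}

// headerStore is a hypothetical hash-keyed store.
type headerStore map[Hash]Header

// collectBackwards returns up to count encoded headers, starting at start
// and following ParentHash links. Because each header names its parent,
// no number->hash translation is needed along the way.
func collectBackwards(db headerStore, start Hash, count int) [][]byte {
	var out [][]byte
	hash := start
	for len(out) < count {
		h, ok := db[hash]
		if !ok {
			break // unknown header, stop the walk
		}
		out = append(out, h.RLP)
		if h.Number == 0 {
			break // reached genesis
		}
		hash = h.ParentHash
	}
	return out
}

func main() {
	db := headerStore{
		"h0": {Number: 0, RLP: []byte{0}},
		"h1": {Number: 1, ParentHash: "h0", RLP: []byte{1}},
		"h2": {Number: 2, ParentHash: "h1", RLP: []byte{2}},
	}
	fmt.Println(len(collectBackwards(db, "h2", 2)))
}
```

The encoded bytes are appended as-is, which mirrors the idea of keeping headers in RLP form instead of decoding and re-encoding them.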
Co-authored-by: Felix Lange <fjl@twurst.com>
* all: work for eth1/2 transition
* consensus/beacon, eth: change beacon difficulty to 0
* eth: updates
* all: add terminalBlockDifficulty config, fix rebasing issues
* eth: implemented merge interop spec
* internal/ethapi: update to v1.0.0.alpha.2
This commit updates the code to the new spec, moving payloadId into
its own object. It also fixes an issue with finalizing an empty blockhash.
It also properly sets the basefee.
* all: sync polishes, other fixes + refactors
* core, eth: correct semantics for LeavePoW, EnterPoS
* core: fixed rebasing artifacts
* core: light: performance improvements
* core: use keyed field (f)
* core: eth: fix compilation issues + tests
* eth/catalyst: better error codes
* all: move Merger to consensus/, remove reliance on it in bc
* all: renamed EnterPoS and LeavePoW to ReachTTD and FinalizePoS
* core: make mergelogs a function
* core: use InsertChain instead of InsertBlock
* les: drop merger from lightchain object
* consensus: add merger
* core: recoverAncestors in catalyst mode
* core: fix nitpick
* all: removed merger from beacon, use TTD, nitpicks
* consensus: eth: add docstring, removed unnecessary code duplication
* consensus/beacon: better comment
* all: easy to fix nitpicks by karalabe
* consensus/beacon: verify known headers to be sure
* core: comments
* core: eth: don't drop peers who advertise blocks, nitpicks
* core: never add beacon blocks to the future queue
* core: fixed nitpicks
* consensus/beacon: simplify IsTTDReached check
* consensus/beacon: correct IsTTDReached check
Co-authored-by: rjl493456442 <garyrong0905@gmail.com>
Co-authored-by: Péter Szilágyi <peterke@gmail.com>
This removes some code:
- The clique engine calculated the snapshot twice when verifying headers/blocks.
- The method GetBlockHashesFromHash in Header/Block/Lightchain was only used by tests. It
is now removed from the API.
- The method GetTdByHash internally looked up the number before calling GetTd(hash, num).
In many cases, callers already had the number, and used this method just because it has a
shorter name. I have removed the method to make the API surface smaller.
This change increases the cache size from 64 to 256 MB for block bodies.
Benchmarks have shown this to be one bottleneck when trying to achieve
higher download speeds.
The commit also includes a minor optimization for header inserts in package
core: previously, the presence of headers in the database was checked for
every header before writing it. With the change, if one header fails the
presence check, all subsequent headers are also assumed to be missing.
This is an improvement because in practice, the headers are almost always
missing during sync.
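
A rough sketch of the presence-check shortcut, under simplifying assumptions: the store type, hasHeader and headersToWrite are hypothetical names, and headers are reduced to their numbers.

```go
package main

import "fmt"

// store stands in for the header database; lookups counts presence checks
// so the saving is visible.
type store struct {
	known   map[uint64]bool
	lookups int
}

func (s *store) hasHeader(num uint64) bool {
	s.lookups++
	return s.known[num]
}

// headersToWrite returns the header numbers that need writing. Once one
// header is found missing, all later ones are assumed missing as well and
// are no longer checked against the database.
func headersToWrite(s *store, nums []uint64) []uint64 {
	var out []uint64
	missing := false
	for _, n := range nums {
		if !missing && s.hasHeader(n) {
			continue // already stored, skip it
		}
		missing = true
		out = append(out, n)
	}
	return out
}

func main() {
	s := &store{known: map[uint64]bool{1: true, 2: true}}
	fmt.Println(headersToWrite(s, []uint64{1, 2, 3, 4, 5}), s.lookups)
}
```

During sync the very first header is usually missing, so a batch costs one lookup instead of one per header.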
* all: add thousands separators for big numbers in log messages
* p2p/sentry: drop accidental file
* common, log: add fast number formatter
* common, eth/protocols/snap: simplify fancy num types
* log: handle nil big ints
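
For illustration, a number formatter of this kind could look like the sketch below. formatThousands is a hypothetical helper, not the function added to package log, and the comma as separator is an assumption.

```go
package main

import (
	"fmt"
	"strconv"
)

// formatThousands renders n in decimal with a comma between each group of
// three digits, counted from the right.
func formatThousands(n uint64) string {
	digits := strconv.FormatUint(n, 10)
	out := make([]byte, 0, len(digits)+len(digits)/3)
	for i := 0; i < len(digits); i++ {
		// A separator goes before every digit whose distance to the end
		// is a multiple of three, except at the very start.
		if i > 0 && (len(digits)-i)%3 == 0 {
			out = append(out, ',')
		}
		out = append(out, digits[i])
	}
	return string(out)
}

func main() {
	fmt.Println(formatThousands(1234567)) // prints 1,234,567
}
```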
This PR implements the following modifications:
- Don't short-circuit on whether the block is already present, thus avoiding a disk lookup
- Don't check hash ancestry in early-check (it's still done in parallel checker)
- Don't check time.Now for every single header
Charts and background info can be found here: https://github.com/holiman/headerimport/blob/main/README.md
With these changes, writing 1M headers goes down from 80s to 62s.
* core: add test for headerchain inserts
* core, light: write headerchains in batches
* core: change to one callback per batch of inserted headers + review concerns
* core: error-check on batch write
* core: unexport writeHeaders
* core: remove callback parameter in InsertHeaderChain
The semantics of InsertHeaderChain are now much simpler: it is now an
all-or-nothing operation. The new WriteStatus return value allows
callers to check for the canonicality of the insertion. This change
simplifies use of HeaderChain in package les, where the callback was
previously used to post chain events.
* core: skip some hashing when writing headers
* core: less hashing in header validation
* core: fix headerchain flaw regarding blacklisted hashes
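
The all-or-nothing semantics of InsertHeaderChain described above can be sketched like this. It is a simplified model: the chain and header types are hypothetical, validity is a plain flag, and canonicality is decided by block number here rather than by total difficulty. The status names follow core.WriteStatus.

```go
package main

import (
	"errors"
	"fmt"
)

// WriteStatus tells the caller how the inserted batch relates to the
// canonical chain.
type WriteStatus int

const (
	NonStatTy   WriteStatus = iota // nothing was written
	CanonStatTy                    // batch became part of the canonical chain
	SideStatTy                     // batch was written as a side chain
)

// header and chain are hypothetical, heavily reduced types.
type header struct {
	Number uint64
	Valid  bool
}

type chain struct {
	headers []header
	head    uint64
}

var errInvalidHeader = errors.New("invalid header")

// insertHeaders validates the whole batch before writing anything, so a
// single bad header leaves the chain untouched.
func (c *chain) insertHeaders(batch []header) (WriteStatus, error) {
	for _, h := range batch {
		if !h.Valid {
			return NonStatTy, errInvalidHeader
		}
	}
	if len(batch) == 0 {
		return NonStatTy, nil
	}
	c.headers = append(c.headers, batch...)
	last := batch[len(batch)-1].Number
	if last > c.head {
		c.head = last
		return CanonStatTy, nil
	}
	return SideStatTy, nil
}

func main() {
	c := &chain{}
	status, err := c.insertHeaders([]header{{Number: 1, Valid: true}})
	fmt.Println(status == CanonStatTy, err)
}
```

A caller that used to rely on the per-header callback can instead inspect the returned status once per batch and post chain events only for CanonStatTy.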
Co-authored-by: Felix Lange <fjl@twurst.com>
* core: fix the condition of reorg
* core: fix nitpick to only retrieve head once
* core: don't reorg if received chain is longer at same diff
Co-authored-by: Péter Szilágyi <peterke@gmail.com>
* core, eth: some fixes for freezer
* vendor, core/rawdb, cmd/geth: add db inspector
* core, cmd/utils: check ancient store path forcibly
* cmd/geth, common, core/rawdb: a few fixes
* cmd/geth: support windows file rename and fix rename error
* core: support ancient plugin
* core, cmd: streaming file copy
* cmd, consensus, core, tests: keep genesis in leveldb
* core: write txlookup during ancient init
* core: bump database version
* all: freezer style syncing
core, eth, les, light: clean up freezer relative APIs
core, eth, les, trie, ethdb, light: clean a bit
core, eth, les, light: add unit tests
core, light: rewrite setHead function
core, eth: fix downloader unit tests
core: add receipt chain insertion test
core: use constant instead of hardcoding table name
core: fix rollback
core: fix setHead
core/rawdb: remove canonical block first and then iterate side chain
core/rawdb, ethdb: add hasAncient interface
eth/downloader: calculate ancient limit via cht first
core, eth, ethdb: lots of fixes
* eth/downloader: print ancient disable log only for fast sync
This PR is a more advanced form of the dirty-to-clean cacher (#18995),
where we reuse previous database write batches as datasets to uncache,
saving a dirty-trie-iteration and a dirty-trie-rlp-reencoding per block.
* ethdb: add Putter interface and Has method
* ethdb: improve docs and add IdealBatchSize
* ethdb: remove memory batch lock
Batches are not safe for concurrent use.
* core: use ethdb.Putter for Write* functions
This covers the easy cases.
* core/state: simplify StateSync
* trie: optimize local node check
* ethdb: add ValueSize to Batch
* core: optimize HasHeader check
This avoids one random database read to get the block number. For many uses
of HasHeader, the expectation is that it's actually there. Using Has
avoids a load + decode of the value.
* core: write fast sync block data in batches
Collect writes into batches up to the ideal size instead of issuing many
small, concurrent writes.
* eth/downloader: commit larger state batches
Collect nodes into a batch up to the ideal size instead of committing
whenever a node is received.
* core: optimize HasBlock check
This avoids a random database read to get the number.
* core: use numberCache in HasHeader
numberCache has higher capacity, increasing the odds of finding the
header without a database lookup.
* core: write imported block data using a batch
Restore batch writes of state and add blocks, tx entries, receipts to
the same batch. The change also simplifies the miner.
This commit also removes posting of logs when a forked block is imported.
* core: fix DB write error handling
* ethdb: use RLock for Has
* core: fix HasBlock comment
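
The batching pattern used in several of the commits above (collect writes until the batch reaches an ideal size, then flush) can be sketched as follows. memBatch is a hypothetical in-memory batch whose Put, ValueSize, Write and Reset only loosely mirror the ethdb batch methods, and the threshold is passed in as a parameter instead of using IdealBatchSize.

```go
package main

import "fmt"

// kv is a single key/value write.
type kv struct{ key, val []byte }

// memBatch is a hypothetical in-memory batch. It is not safe for
// concurrent use.
type memBatch struct {
	db      map[string][]byte
	pending []kv
	size    int
	flushes int
}

func newMemBatch() *memBatch { return &memBatch{db: make(map[string][]byte)} }

func (b *memBatch) Put(key, val []byte) {
	b.pending = append(b.pending, kv{key, val})
	b.size += len(val)
}

// ValueSize reports the amount of value data queued for writing.
func (b *memBatch) ValueSize() int { return b.size }

// Write flushes all queued entries to the backing map.
func (b *memBatch) Write() {
	for _, e := range b.pending {
		b.db[string(e.key)] = e.val
	}
	b.flushes++
}

func (b *memBatch) Reset() {
	b.pending = b.pending[:0]
	b.size = 0
}

// writeInBatches queues items and flushes whenever the batch reaches
// idealSize, plus once at the end for any remainder.
func writeInBatches(b *memBatch, items []kv, idealSize int) {
	for _, it := range items {
		b.Put(it.key, it.val)
		if b.ValueSize() >= idealSize {
			b.Write()
			b.Reset()
		}
	}
	if b.ValueSize() > 0 {
		b.Write()
		b.Reset()
	}
}

func main() {
	b := newMemBatch()
	items := []kv{{[]byte("a"), []byte("1234")}, {[]byte("b"), []byte("5678")}}
	writeInBatches(b, items, 8)
	fmt.Println(len(b.db), b.flushes)
}
```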
This commit adds pluggable consensus engines to go-ethereum. In short, it
introduces a generic consensus interface, and refactors the entire codebase to
use this interface.
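
As a rough illustration of what a generic consensus interface buys, consider the sketch below. The Engine interface here has a single method and the two engines are toy implementations; the real interface in package consensus is considerably larger, and all names other than VerifyHeader are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// Header is a reduced stand-in for a block header.
type Header struct {
	Number     uint64
	Difficulty uint64
	Signer     string
}

// Engine is a minimal consensus interface: chain code depends only on
// this, not on a concrete algorithm.
type Engine interface {
	VerifyHeader(h Header) error
}

// powEngine is a toy proof-of-work check.
type powEngine struct{}

func (powEngine) VerifyHeader(h Header) error {
	if h.Difficulty == 0 {
		return errors.New("zero difficulty")
	}
	return nil
}

// authorityEngine is a toy signer-based check.
type authorityEngine struct{ signers map[string]bool }

func (e authorityEngine) VerifyHeader(h Header) error {
	if !e.signers[h.Signer] {
		return errors.New("unauthorized signer")
	}
	return nil
}

// importHeaders works with any engine and returns how many headers were
// accepted before the first failure.
func importHeaders(e Engine, headers []Header) (int, error) {
	for i, h := range headers {
		if err := e.VerifyHeader(h); err != nil {
			return i, err
		}
	}
	return len(headers), nil
}

func main() {
	hs := []Header{{Number: 1, Difficulty: 10, Signer: "alice"}}
	n, err := importHeaders(powEngine{}, hs)
	fmt.Println(n, err)
}
```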
This commit solves several issues concerning the genesis block:
* Genesis/ChainConfig loading was handled by cmd/geth code. This left
library users in the cold. They could specify a JSON-encoded
string and overwrite the config, but didn't get any of the additional
checks performed by geth.
* Decoding and writing of genesis JSON was conflated in
WriteGenesisBlock. This made it a lot harder to embed the genesis
block into the forthcoming config file loader. This commit changes
things so there is a single Genesis type that represents genesis
blocks. All uses of Write*Genesis* are changed to use the new type
instead.
* If the chain config supplied by the user was incompatible with the
current chain (i.e. the chain had already advanced beyond a scheduled
fork), it got overwritten. This is not an issue in practice because
previous forks have always had the highest total difficulty. It might
matter in the future though. The new code reverts the local chain to
the point of the fork when upgrading configuration.
The change to genesis block data removes compression library
dependencies from package core.
* common/math: optimize PaddedBigBytes, use it more
    name              old time/op   new time/op   delta
    PaddedBigBytes-8  71.1ns ± 5%   46.1ns ± 1%   -35.15%  (p=0.000 n=20+19)

    name              old alloc/op  new alloc/op  delta
    PaddedBigBytes-8  48.0B ± 0%    32.0B ± 0%    -33.33%  (p=0.000 n=20+20)
* all: unify big.Int zero checks
Various checks were in use. This commit replaces them all with Int.Sign,
which is cheaper and less code.
eg templates:

    func before(x *big.Int) bool { return x.BitLen() == 0 }
    func after(x *big.Int) bool  { return x.Sign() == 0 }

    func before(x *big.Int) bool { return x.BitLen() > 0 }
    func after(x *big.Int) bool  { return x.Sign() != 0 }

    func before(x *big.Int) int { return x.Cmp(common.Big0) }
    func after(x *big.Int) int  { return x.Sign() }
* common/math, crypto/secp256k1: make ReadBits public in package math
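
The zero-check rewrites in the eg templates rely on the old and new forms being equivalent. A small self-contained check of that, with big0 standing in for common.Big0 and zeroChecksAgree being a hypothetical helper:

```go
package main

import (
	"fmt"
	"math/big"
)

// big0 plays the role of a shared zero constant such as common.Big0.
var big0 = big.NewInt(0)

// zeroChecksAgree reports whether the old-style checks and the Sign-based
// checks give the same answers for the value v.
func zeroChecksAgree(v int64) bool {
	x := big.NewInt(v)
	isZero := (x.BitLen() == 0) == (x.Sign() == 0)
	nonZero := (x.BitLen() > 0) == (x.Sign() != 0)
	cmp := x.Cmp(big0) == x.Sign()
	return isZero && nonZero && cmp
}

func main() {
	for _, v := range []int64{-5, 0, 7} {
		fmt.Println(v, zeroChecksAgree(v))
	}
}
```

BitLen is computed on the absolute value, so the equivalence also holds for negative numbers.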
This commit implements EIP158 part 1, 2, 3 & 4
1. If an account is empty it's no longer written to the trie. An empty
account is defined as (balance=0, nonce=0, storage=0, code=0).
2. Delete an empty account if it's touched
3. An empty account is redefined as either non-existent or empty.
4. Zero value calls and zero value suicides no longer consume the 25k
creation costs.
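
The definitions in points 1 and 3 can be sketched as two predicates. This is an illustrative model that follows the wording of this commit message; the Account type, its fields and the helper names are hypothetical and are not the state object used in package core/state.

```go
package main

import (
	"fmt"
	"math/big"
)

// Account is a reduced, hypothetical account record.
type Account struct {
	Nonce        uint64
	Balance      *big.Int
	CodeSize     int
	StorageSlots int
}

func newAccount(nonce uint64, balance int64, codeSize, storageSlots int) *Account {
	return &Account{
		Nonce:        nonce,
		Balance:      big.NewInt(balance),
		CodeSize:     codeSize,
		StorageSlots: storageSlots,
	}
}

// isEmpty follows point 1: balance, nonce, storage and code are all zero.
func (a *Account) isEmpty() bool {
	return a.Nonce == 0 && a.Balance.Sign() == 0 && a.CodeSize == 0 && a.StorageSlots == 0
}

// isDead follows point 3: the account does not exist or is empty.
func isDead(a *Account) bool {
	return a == nil || a.isEmpty()
}

// shouldWrite reports whether the account belongs in the trie.
func shouldWrite(a *Account) bool {
	return !isDead(a)
}

func main() {
	fmt.Println(shouldWrite(newAccount(0, 0, 0, 0)), shouldWrite(newAccount(1, 0, 0, 0)))
}
```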
params: moved core/config to params
Signed-off-by: Jeffrey Wilcke <jeffrey@ethereum.org>
Added chain configuration options and write out during genesis database
insertion. If no "config" was found, nothing is written to the database.
Configurations are written on a per-genesis basis. This means
that any chain (which is identified by its genesis hash) can have its
own chain settings.
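
A minimal sketch of per-genesis configuration storage, assuming a plain map as the database. The key prefix, the ChainConfig field and both helper functions are hypothetical; only the behaviour (keyed by genesis hash, nothing written when no config is present) is taken from the description above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ChainConfig is a reduced, hypothetical configuration record.
type ChainConfig struct {
	HomesteadBlock uint64 `json:"homesteadBlock"`
}

// configPrefix is an illustrative key prefix, not necessarily the one used
// by the real database schema.
const configPrefix = "config-"

// writeChainConfig stores cfg under a key derived from the genesis hash.
// A nil config results in no write at all.
func writeChainConfig(db map[string][]byte, genesisHash string, cfg *ChainConfig) error {
	if cfg == nil {
		return nil
	}
	data, err := json.Marshal(cfg)
	if err != nil {
		return err
	}
	db[configPrefix+genesisHash] = data
	return nil
}

// readChainConfig returns the config stored for the given genesis hash,
// or nil if none was written.
func readChainConfig(db map[string][]byte, genesisHash string) (*ChainConfig, error) {
	data, ok := db[configPrefix+genesisHash]
	if !ok {
		return nil, nil
	}
	cfg := new(ChainConfig)
	if err := json.Unmarshal(data, cfg); err != nil {
		return nil, err
	}
	return cfg, nil
}

func main() {
	db := make(map[string][]byte)
	_ = writeChainConfig(db, "0xabc", &ChainConfig{HomesteadBlock: 100})
	cfg, err := readChainConfig(db, "0xabc")
	fmt.Println(cfg.HomesteadBlock, err)
}
```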