## Issue Addressed
#3212
## Proposed Changes
- Introduce a new `rate_limiting_backfill_queue` - any new inbound backfill work events gets immediately sent to this FIFO queue **without any processing**
- Spawn a `backfill_scheduler` routine that pops a backfill event from the FIFO queue at specified intervals (currently halfway through a slot, or at 6s after slot start for 12s slots) and sends the event to `BeaconProcessor` via a `scheduled_backfill_work_tx` channel
- This channel gets polled last in the `InboundEvents`, and work event received is wrapped in a `InboundEvent::ScheduledBackfillWork` enum variant, which gets processed immediately or queued by the `BeaconProcessor` (existing logic applies from here)
Diagram comparing backfill processing with / without rate-limiting:
https://github.com/sigp/lighthouse/issues/3212#issuecomment-1386249922
See this comment for @paulhauner's explanation and solution: https://github.com/sigp/lighthouse/issues/3212#issuecomment-1384674956
## Additional Info
I've compared this branch (with backfill processing rate limited to to 1 and 3 batches per slot) against the latest stable version. The CPU usage during backfill sync is reduced by ~5% - 20%, more details on this page:
https://hackmd.io/@jimmygchen/SJuVpJL3j
The above testing is done on Goerli (as I don't currently have hardware for Mainnet), I'm guessing the differences are likely to be bigger on mainnet due to block size.
### TODOs
- [x] Experiment with processing multiple batches per slot. (need to think about how to do this for different slot durations)
- [x] Add option to disable rate-limiting, enabed by default.
- [x] (No longer required now we're reusing the reprocessing queue) Complete the `backfill_scheduler` task when backfill sync is completed or not required
## Issue Addressed
NA
## Proposed Changes
- Implements https://github.com/ethereum/consensus-specs/pull/3290/
- Bumps `ef-tests` to [v1.3.0-rc.4](https://github.com/ethereum/consensus-spec-tests/releases/tag/v1.3.0-rc.4).
The `CountRealizedFull` concept has been removed and the `--count-unrealized-full` and `--count-unrealized` BN flags now do nothing but log a `WARN` when used.
## Database Migration Debt
This PR removes the `best_justified_checkpoint` from fork choice. This field is persisted on-disk and the correct way to go about this would be to make a DB migration to remove the field. However, in this PR I've simply stubbed out the value with a junk value. I've taken this approach because if we're going to do a DB migration I'd love to remove the `Option`s around the justified and finalized checkpoints on `ProtoNode` whilst we're at it. Those options were added in #2822 which was included in Lighthouse v2.1.0. The options were only put there to handle the migration and they've been set to `Some` ever since v2.1.0. There's no reason to keep them as options anymore.
I started adding the DB migration to this branch but I started to feel like I was bloating this rather critical PR with nice-to-haves. I've kept the partially-complete migration [over in my repo](https://github.com/paulhauner/lighthouse/tree/fc-pr-18-migration) so we can pick it up after this PR is merged.
## Issue Addressed
Cleans up all the remnants of 4844 in capella. This makes sure when 4844 is reviewed there is nothing we are missing because it got included here
## Proposed Changes
drop a bomb on every 4844 thing
## Additional Info
Merge process I did (locally) is as follows:
- squash merge to produce one commit
- in new branch off unstable with the squashed commit create a `git revert HEAD` commit
- merge that new branch onto 4844 with `--strategy ours`
- compare local 4844 to remote 4844 and make sure the diff is empty
- enjoy
Co-authored-by: Paul Hauner <paul@paulhauner.com>
## Issue Addressed
NA
## Proposed Changes
Updates our `ef_tests` to use: https://github.com/ethereum/consensus-specs/releases/tag/v1.3.0-rc.3
This required:
- Skipping a `merkle_proof_validity` test (see #4022)
- Account for the `eip4844` tests changing name to `deneb`
- My IDE did some Python linting during this change. It seemed simple and nice so I left it there.
## Additional Info
NA
* add historical summaries
* fix tree hash caching, disable the sanity slots test with fake crypto
* add ssz static HistoricalSummary
* only store historical summaries after capella
* Teach `UpdatePattern` about Capella
* Tidy EF tests
* Clippy
Co-authored-by: Michael Sproul <michael@sigmaprime.io>
## Issue Addressed
Recent discussions with other client devs about optimistic sync have revealed a conceptual issue with the optimisation implemented in #3738. In designing that feature I failed to consider that the execution node checks the `blockHash` of the execution payload before responding with `SYNCING`, and that omitting this check entirely results in a degradation of the full node's validation. A node omitting the `blockHash` checks could be tricked by a supermajority of validators into following an invalid chain, something which is ordinarily impossible.
## Proposed Changes
I've added verification of the `payload.block_hash` in Lighthouse. In case of failure we log a warning and fall back to verifying the payload with the execution client.
I've used our existing dependency on `ethers_core` for RLP support, and a new dependency on Parity's `triehash` crate for the Merkle patricia trie. Although the `triehash` crate is currently unmaintained it seems like our best option at the moment (it is also used by Reth, and requires vastly less boilerplate than Parity's generic `trie-root` library).
Block hash verification is pretty quick, about 500us per block on my machine (mainnet).
The optimistic finalized sync feature can be disabled using `--disable-optimistic-finalized-sync` which forces full verification with the EL.
## Additional Info
This PR also introduces a new dependency on our [`metastruct`](https://github.com/sigp/metastruct) library, which was perfectly suited to the RLP serialization method. There will likely be changes as `metastruct` grows, but I think this is a good way to start dogfooding it.
I took inspiration from some Parity and Reth code while writing this, and have preserved the relevant license headers on the files containing code that was copied and modified.
## Issue Addressed
Closes https://github.com/sigp/lighthouse/issues/2327
## Proposed Changes
This is an extension of some ideas I implemented while working on `tree-states`:
- Cache the indexed attestations from blocks in the `ConsensusContext`. Previously we were re-computing them 3-4 times over.
- Clean up `import_block` by splitting each part into `import_block_XXX`.
- Move some stuff off hot paths, specifically:
- Relocate non-essential tasks that were running between receiving the payload verification status and priming the early attester cache. These tasks are moved after the cache priming:
- Attestation observation
- Validator monitor updates
- Slasher updates
- Updating the shuffling cache
- Fork choice attestation observation now happens at the end of block verification in parallel with payload verification (this seems to save 5-10ms).
- Payload verification now happens _before_ advancing the pre-state and writing it to disk! States were previously being written eagerly and adding ~20-30ms in front of verifying the execution payload. State catchup also sometimes takes ~500ms if we get a cache miss and need to rebuild the tree hash cache.
The remaining task that's taking substantial time (~20ms) is importing the block to fork choice. I _think_ this is because of pull-tips, and we should be able to optimise it out with a clever total active balance cache in the state (which would be computed in parallel with payload verification). I've decided to leave that for future work though. For now it can be observed via the new `beacon_block_processing_post_exec_pre_attestable_seconds` metric.
Co-authored-by: Michael Sproul <micsproul@gmail.com>
## Issue Addressed
#3704
## Proposed Changes
Adds is_syncing_finalized: bool parameter for block verification functions. Sets the payload_verification_status to Optimistic if is_syncing_finalized is true. Uses SyncState in NetworkGlobals in BeaconProcessor to retrieve the syncing status.
## Additional Info
I could implement FinalizedSignatureVerifiedBlock if you think it would be nicer.
## Issue Addressed
Part of https://github.com/sigp/lighthouse/issues/3651.
## Proposed Changes
Add a flag for enabling the light client server, which should be checked before gossip/RPC traffic is processed (e.g. https://github.com/sigp/lighthouse/pull/3693, https://github.com/sigp/lighthouse/pull/3711). The flag is available at runtime from `beacon_chain.config.enable_light_client_server`.
Additionally, a new method `BeaconChain::with_mutable_state_for_block` is added which I envisage being used for computing light client updates. Unfortunately its performance will be quite poor on average because it will only run quickly with access to the tree hash cache. Each slot the tree hash cache is only available for a brief window of time between the head block being processed and the state advance at 9s in the slot. When the state advance happens the cache is moved and mutated to get ready for the next slot, which makes it no longer useful for merkle proofs related to the head block. Rather than spend more time trying to optimise this I think we should continue prototyping with this code, and I'll make sure `tree-states` is ready to ship before we enable the light client server in prod (cf. https://github.com/sigp/lighthouse/pull/3206).
## Additional Info
I also fixed a bug in the implementation of `BeaconState::compute_merkle_proof` whereby the tree hash cache was moved with `.take()` but never put back with `.restore()`.
## Issue Addressed
This PR addresses partially #3651
## Proposed Changes
This PR adds the following methods:
* a new method to trait `TreeHash`, `hash_tree_leaves` which returns all the Merkle leaves of the ssz object.
* a new method to `BeaconState`: `compute_merkle_proof` which generates a specific merkle proof for given depth and index by using the `hash_tree_leaves` as leaves function.
## Additional Info
Now here is some rationale on why I decided to go down this route: adding a new function to commonly used trait is a pain but was necessary to make sure we have all merkle leaves for every object, that is why I just added `hash_tree_leaves` in the trait and not `compute_merkle_proof` as well. although it would make sense it gives us code duplication/harder review time and we just need it from one specific object in one specific usecase so not worth the effort YET. In my humble opinion.
Co-authored-by: Michael Sproul <micsproul@gmail.com>
* add capella gossip boiler plate
* get everything compiling
Co-authored-by: realbigsean <sean@sigmaprime.io
Co-authored-by: Mark Mackey <mark@sigmaprime.io>
* small cleanup
* small cleanup
* cargo fix + some test cleanup
* improve block production
* add fixme for potential panic
Co-authored-by: Mark Mackey <mark@sigmaprime.io>
## Issue Addressed
Closes https://github.com/sigp/lighthouse/issues/2371
## Proposed Changes
Backport some changes from `tree-states` that remove duplicated calculations of the `proposer_index`.
With this change the proposer index should be calculated only once for each block, and then plumbed through to every place it is required.
## Additional Info
In future I hope to add more data to the consensus context that is cached on a per-epoch basis, like the effective balances of validators and the base rewards.
There are some other changes to remove indexing in tests that were also useful for `tree-states` (the `tree-states` types don't implement `Index`).
## Issue Addressed
Implements new optimistic sync test format from https://github.com/ethereum/consensus-specs/pull/2982.
## Proposed Changes
- Add parsing and runner support for the new test format.
- Extend the mock EL with a set of canned responses keyed by block hash. Although this doubles up on some of the existing functionality I think it's really nice to use compared to the `preloaded_responses` or static responses. I think we could write novel new opt sync tests using these primtives much more easily than the previous ones. Forks are natively supported, and different responses to `forkchoiceUpdated` and `newPayload` are also straight-forward.
## Additional Info
Blocked on merge of the spec PR and release of new test vectors.
## Issue Addressed
Which issue # does this PR address?
#2629
## Proposed Changes
Please list or describe the changes introduced by this PR.
1. ci would dowload the bls test cases from https://github.com/ethereum/bls12-381-tests/
2. all the bls test cases(except eth ones) would use cases in the archive from step one
3. The bls test cases from https://github.com/ethereum/consensus-spec-tests would stay there and no use . For the future , these bls test cases would be remove suggested from https://github.com/ethereum/consensus-spec-tests/issues/25 . So it would do no harm and compatible for future cases.
## Additional Info
Please provide any additional information. For example, future considerations
or information useful for reviewers.
Question:
I am not sure if I should implement tests about `deserialization_G1`, `deserialization_G2` and `hash_to_G2` for the issue.
## Issue Addressed
NA
## Proposed Changes
This PR removes duplicated block root computation.
Computing the `SignedBeaconBlock::canonical_root` has become more expensive since the merge as we need to compute the merke root of each transaction inside an `ExecutionPayload`.
Computing the root for [a mainnet block](https://beaconcha.in/slot/4704236) is taking ~10ms on my i7-8700K CPU @ 3.70GHz (no sha extensions). Given that our median seen-to-imported time for blocks is presently 300-400ms, removing a few duplicated block roots (~30ms) could represent an easy 10% improvement. When we consider that the seen-to-imported times include operations *after* the block has been placed in the early attester cache, we could expect the 30ms to be more significant WRT our seen-to-attestable times.
## Additional Info
NA
## Issue Addressed
NA
## Proposed Changes
I've noticed that our block hashing times increase significantly after the merge. I did some flamegraph-ing and noticed that we're allocating a `Vec` for each byte of each execution payload transaction. This seems like unnecessary work and a bit of a fragmentation risk.
This PR switches to `SmallVec<[u8; 32]>` for the packed encoding of `TreeHash`. I believe this is a nice simple optimisation with no downside.
### Benchmarking
These numbers were computed using #3580 on my desktop (i7 hex-core). You can see a bit of noise in the numbers, that's probably just my computer doing other things. Generally I found this change takes the time from 10-11ms to 8-9ms. I can also see all the allocations disappear from flamegraph.
This is the block being benchmarked: https://beaconcha.in/slot/4704236
#### Before
```
[2022-09-15T21:44:19Z INFO lcli::block_root] Run 980: 10.553003ms
[2022-09-15T21:44:19Z INFO lcli::block_root] Run 981: 10.563737ms
[2022-09-15T21:44:19Z INFO lcli::block_root] Run 982: 10.646352ms
[2022-09-15T21:44:19Z INFO lcli::block_root] Run 983: 10.628532ms
[2022-09-15T21:44:19Z INFO lcli::block_root] Run 984: 10.552112ms
[2022-09-15T21:44:19Z INFO lcli::block_root] Run 985: 10.587778ms
[2022-09-15T21:44:19Z INFO lcli::block_root] Run 986: 10.640526ms
[2022-09-15T21:44:19Z INFO lcli::block_root] Run 987: 10.587243ms
[2022-09-15T21:44:19Z INFO lcli::block_root] Run 988: 10.554748ms
[2022-09-15T21:44:19Z INFO lcli::block_root] Run 989: 10.551111ms
[2022-09-15T21:44:19Z INFO lcli::block_root] Run 990: 11.559031ms
[2022-09-15T21:44:19Z INFO lcli::block_root] Run 991: 11.944827ms
[2022-09-15T21:44:19Z INFO lcli::block_root] Run 992: 10.554308ms
[2022-09-15T21:44:19Z INFO lcli::block_root] Run 993: 11.043397ms
[2022-09-15T21:44:19Z INFO lcli::block_root] Run 994: 11.043315ms
[2022-09-15T21:44:19Z INFO lcli::block_root] Run 995: 11.207711ms
[2022-09-15T21:44:19Z INFO lcli::block_root] Run 996: 11.056246ms
[2022-09-15T21:44:19Z INFO lcli::block_root] Run 997: 11.049706ms
[2022-09-15T21:44:19Z INFO lcli::block_root] Run 998: 11.432449ms
[2022-09-15T21:44:19Z INFO lcli::block_root] Run 999: 11.149617ms
```
#### After
```
[2022-09-15T21:41:49Z INFO lcli::block_root] Run 980: 14.011653ms
[2022-09-15T21:41:49Z INFO lcli::block_root] Run 981: 8.925314ms
[2022-09-15T21:41:49Z INFO lcli::block_root] Run 982: 8.849563ms
[2022-09-15T21:41:49Z INFO lcli::block_root] Run 983: 8.893689ms
[2022-09-15T21:41:49Z INFO lcli::block_root] Run 984: 8.902964ms
[2022-09-15T21:41:49Z INFO lcli::block_root] Run 985: 8.942067ms
[2022-09-15T21:41:49Z INFO lcli::block_root] Run 986: 8.907088ms
[2022-09-15T21:41:49Z INFO lcli::block_root] Run 987: 9.346101ms
[2022-09-15T21:41:49Z INFO lcli::block_root] Run 988: 8.96142ms
[2022-09-15T21:41:49Z INFO lcli::block_root] Run 989: 9.366437ms
[2022-09-15T21:41:49Z INFO lcli::block_root] Run 990: 9.809334ms
[2022-09-15T21:41:49Z INFO lcli::block_root] Run 991: 9.541561ms
[2022-09-15T21:41:49Z INFO lcli::block_root] Run 992: 11.143518ms
[2022-09-15T21:41:49Z INFO lcli::block_root] Run 993: 10.821181ms
[2022-09-15T21:41:49Z INFO lcli::block_root] Run 994: 9.855973ms
[2022-09-15T21:41:49Z INFO lcli::block_root] Run 995: 10.941006ms
[2022-09-15T21:41:49Z INFO lcli::block_root] Run 996: 9.596155ms
[2022-09-15T21:41:49Z INFO lcli::block_root] Run 997: 9.121739ms
[2022-09-15T21:41:49Z INFO lcli::block_root] Run 998: 9.090019ms
[2022-09-15T21:41:49Z INFO lcli::block_root] Run 999: 9.071885ms
```
## Additional Info
Please provide any additional information. For example, future considerations
or information useful for reviewers.
## Issue Addressed
Resolves#3448
## Proposed Changes
Removes a known failure that wasn't actually a known failure. The tests declare this block invalid and we refuse to import it due to `ExecutionPayloadError(UnverifiedNonOptimisticCandidate)`.
This is correct since there is only one "eth1" block included in this test and two are required to trigger the merge (pre- and post-TTD blocks). It is slot 1 (tick = 12s) when this block is imported so the import must be prevented by `SAFE_SLOTS_TO_IMPORT_OPTIMISTICALLY`.
I'm not sure where I got the idea in #3448 that this test needed retrospective checking, that seems like a false assumption in hindsight.
## Additional Info
- Blocked on #3464
## Proposed Changes
Update the invalid head tests so that they work with the current default fork choice configuration.
Thanks @realbigsean for fixing the persistence test and the EF tests.
Co-authored-by: realbigsean <sean@sigmaprime.io>
## Issue Addressed
https://github.com/sigp/lighthouse/issues/3091
Extends https://github.com/sigp/lighthouse/pull/3062, adding pre-bellatrix block support on blinded endpoints and allowing the normal proposal flow (local payload construction) on blinded endpoints. This resulted in better fallback logic because the VC will not have to switch endpoints on failure in the BN <> Builder API, the BN can just fallback immediately and without repeating block processing that it shouldn't need to. We can also keep VC fallback from the VC<>BN API's blinded endpoint to full endpoint.
## Proposed Changes
- Pre-bellatrix blocks on blinded endpoints
- Add a new `PayloadCache` to the execution layer
- Better fallback-from-builder logic
## Todos
- [x] Remove VC transition logic
- [x] Add logic to only enable builder flow after Merge transition finalization
- [x] Tests
- [x] Fix metrics
- [x] Rustdocs
Co-authored-by: Mac L <mjladson@pm.me>
Co-authored-by: realbigsean <sean@sigmaprime.io>
## Issue Addressed
NA
## Proposed Changes
There are scenarios where the only viable head will have an invalid execution payload, in this scenario the `get_head` function on `proto_array` will return an error. We must recover from this scenario by importing blocks from the network.
This PR stops `BeaconChain::recompute_head` from returning an error so that we can't accidentally start down-scoring peers or aborting block import just because the current head has an invalid payload.
## Reviewer Notes
The following changes are included:
1. Allow `fork_choice.get_head` to fail gracefully in `BeaconChain::process_block` when trying to update the `early_attester_cache`; simply don't add the block to the cache rather than aborting the entire process.
1. Don't return an error from `BeaconChain::recompute_head_at_current_slot` and `BeaconChain::recompute_head` to defensively prevent calling functions from aborting any process just because the fork choice function failed to run.
- This should have practically no effect, since most callers were still continuing if recomputing the head failed.
- The outlier is that the API will return 200 rather than a 500 when fork choice fails.
1. Add the `ProtoArrayForkChoice::set_all_blocks_to_optimistic` function to recover from the scenario where we've rebooted and the persisted fork choice has an invalid head.
## Issue Addressed
Closes https://github.com/sigp/lighthouse/issues/3241
Closes https://github.com/sigp/lighthouse/issues/3242
## Proposed Changes
* [x] Implement logic to remove equivocating validators from fork choice per https://github.com/ethereum/consensus-specs/pull/2845
* [x] Update tests to v1.2.0-rc.1. The new test which exercises `equivocating_indices` is passing.
* [x] Pull in some SSZ abstractions from the `tree-states` branch that make implementing Vec-compatible encoding for types like `BTreeSet` and `BTreeMap`.
* [x] Implement schema upgrades and downgrades for the database (new schema version is V11).
* [x] Apply attester slashings from blocks to fork choice
## Additional Info
* This PR doesn't need the `BTreeMap` impl, but `tree-states` does, and I don't think there's any harm in keeping it. But I could also be convinced to drop it.
Blocked on #3322.
## Issue Addressed
Add a flag that optionally enables unrealized vote tracking. Would like to test out on testnets and benchmark differences in methods of vote tracking. This PR includes a DB schema upgrade to enable to new vote tracking style.
Co-authored-by: realbigsean <sean@sigmaprime.io>
Co-authored-by: Paul Hauner <paul@paulhauner.com>
Co-authored-by: sean <seananderson33@gmail.com>
Co-authored-by: Mac L <mjladson@pm.me>
## Issue Addressed
N/A
## Proposed Changes
Since Rust 1.62, we can use `#[derive(Default)]` on enums. ✨https://blog.rust-lang.org/2022/06/30/Rust-1.62.0.html#default-enum-variants
There are no changes to functionality in this PR, just replaced the `Default` trait implementation with `#[derive(Default)]`.