lighthouse/beacon_node/beacon_chain/src
Michael Sproul 229f883968 Avoid parallel fork choice runs during sync (#3217)
## Issue Addressed

Fixes an issue that @paulhauner found with the v2.3.0 release candidate whereby the fork choice runs introduced by #3168 tripped over each other during sync:

```
May 24 23:06:40.542 WARN Error signalling fork choice waiter     slot: 3884129, error: ForkChoiceSignalOutOfOrder { current: Slot(3884131), latest: Slot(3884129) }, service: beacon
```

This can occur because fork choice is called from both the state advance _and_ the per-slot task. When one of these runs takes a long time, it can finish after a run for a later slot, tripping the error above. The problem is resolved by not running either of these fork choice calls during sync.
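
The ordering failure can be illustrated with a minimal sketch. The `SignalTracker` type and its `notify` method are hypothetical and are not the API of `fork_choice_signal.rs`; only the error variant name and its two fields are taken from the log line above.

```rust
/// Hypothetical stand-in for a slot number; Lighthouse has its own `Slot` type.
type Slot = u64;

#[derive(Debug, PartialEq)]
enum SignalError {
    /// Mirrors the shape of the error shown in the log line.
    ForkChoiceSignalOutOfOrder { current: Slot, latest: Slot },
}

/// Hypothetical tracker recording the highest slot for which a fork choice
/// run has completed.
#[derive(Default)]
struct SignalTracker {
    current: Slot,
}

impl SignalTracker {
    /// Called when a fork choice run for `slot` finishes.
    /// A run that finishes after a run for a later slot is rejected.
    fn notify(&mut self, slot: Slot) -> Result<(), SignalError> {
        if slot < self.current {
            return Err(SignalError::ForkChoiceSignalOutOfOrder {
                current: self.current,
                latest: slot,
            });
        }
        self.current = slot;
        Ok(())
    }
}
```

With this sketch, a fast run for slot 3884131 followed by a slow run for slot 3884129 produces an error with the same `current` and `latest` values as the warning above.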

Additionally, these parallel fork choice runs were causing issues in the database:

```
May 24 07:49:05.098 WARN Found a chain that should already have been pruned, head_slot: 92925, head_block_root: 0xa76c7bf1b98e54ed4b0d8686efcfdf853484e6c2a4c67e91cbf19e5ad1f96b17, service: beacon
May 24 07:49:05.101 WARN Database migration failed               error: HotColdDBError(FreezeSlotError { current_split_slot: Slot(92608), proposed_split_slot: Slot(92576) }), service: beacon
```

In this case, two fork choice calls that triggered finalization processing completed out of order because their processing times differed, causing the background migrator to try to advance finalization _backwards_ 😳. Removing the parallel fork choice runs from sync effectively addresses the issue, because runs during sync are the ones most likely to see different finalized checkpoints (fork choice advances quickly during sync). In theory, updates can still be processed out of order if any other fork choice runs complete out of order, but this should be much less common. Fixing out-of-order fork choice runs in general is difficult, as it requires architectural changes such as serialising fork choice updates through a single thread, or locking fork choice along with the head when it is mutated (https://github.com/sigp/lighthouse/pull/3175).
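
A backwards update of this kind can be rejected with a simple monotonicity guard. The sketch below is an illustration only: `Migrator`, `StaleFinalization` and `process_finalization` are hypothetical names and do not reflect the actual code in `migrate.rs`. The slot values used in the usage note come from the log above.

```rust
/// Hypothetical stand-in for a slot number.
type Slot = u64;

/// Hypothetical error describing a finalization update that would move the
/// split slot backwards.
#[derive(Debug, PartialEq)]
struct StaleFinalization {
    current_split_slot: Slot,
    proposed_split_slot: Slot,
}

/// Hypothetical migrator that only tracks the current split slot.
struct Migrator {
    split_slot: Slot,
}

impl Migrator {
    fn new(split_slot: Slot) -> Self {
        Migrator { split_slot }
    }

    /// Accepts an update only if it does not move finalization backwards.
    /// An update equal to the current split slot is treated as a no-op.
    fn process_finalization(&mut self, proposed_split_slot: Slot) -> Result<(), StaleFinalization> {
        if proposed_split_slot < self.split_slot {
            return Err(StaleFinalization {
                current_split_slot: self.split_slot,
                proposed_split_slot,
            });
        }
        self.split_slot = proposed_split_slot;
        Ok(())
    }
}
```

If the update for slot 92608 is processed first and the update for slot 92576 arrives afterwards, the second call returns an error and the split slot stays at 92608.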

## Proposed Changes

* Don't run per-slot fork choice during sync (if head is older than 4 slots)
* Don't run state-advance fork choice during sync (if head is older than 4 slots)
* Check for monotonic finalization updates in the background migrator. This is a good defensive check to have, and I'm not sure why we didn't have it before (we may have had it and wrongly removed it).
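
The sync condition in the first two items can be sketched as a distance check between the head slot and the current wall-clock slot. The function and constant names below are hypothetical, and reading "older than 4 slots" as a lag strictly greater than 4 is an assumption; the actual comparison in `beacon_chain.rs` and `state_advance_timer.rs` may differ.

```rust
/// Hypothetical stand-in for a slot number.
type Slot = u64;

/// Threshold taken from the list above; the exact boundary is an assumption.
const MAX_HEAD_LAG_SLOTS: Slot = 4;

/// Treats the node as syncing when the head lags the current slot by more
/// than `MAX_HEAD_LAG_SLOTS`. `saturating_sub` avoids underflow if the head
/// is at or ahead of the current slot.
fn is_syncing(head_slot: Slot, current_slot: Slot) -> bool {
    current_slot.saturating_sub(head_slot) > MAX_HEAD_LAG_SLOTS
}

/// Per-slot and state-advance fork choice runs are skipped while syncing.
fn should_run_fork_choice(head_slot: Slot, current_slot: Slot) -> bool {
    !is_syncing(head_slot, current_slot)
}
```

Under these assumptions, a head at slot 100 still permits a fork choice run at slot 104, but not at slot 105.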
2022-05-25 03:27:30 +00:00
attestation_verification Batch BLS verification for attestations (#2399) 2021-09-22 08:49:41 +00:00
schema_change Separate execution payloads in the DB (#3157) 2022-05-12 00:42:17 +00:00
attestation_verification.rs Ignore attestations to finalized blocks (don't reject) (#3052) 2022-03-04 00:41:22 +00:00
attester_cache.rs Add early attester cache (#2872) 2022-01-11 01:35:55 +00:00
beacon_chain.rs Avoid parallel fork choice runs during sync (#3217) 2022-05-25 03:27:30 +00:00
beacon_fork_choice_store.rs Separate execution payloads in the DB (#3157) 2022-05-12 00:42:17 +00:00
beacon_proposer_cache.rs Prepare proposer (#3043) 2022-03-09 00:42:05 +00:00
beacon_snapshot.rs Separate execution payloads in the DB (#3157) 2022-05-12 00:42:17 +00:00
block_reward.rs Separate execution payloads in the DB (#3157) 2022-05-12 00:42:17 +00:00
block_times_cache.rs Add BlockTimesCache to allow additional block delay metrics (#2546) 2021-09-30 04:31:41 +00:00
block_verification.rs Fix Rust 1.61 clippy lints (#3192) 2022-05-20 05:02:13 +00:00
builder.rs Run fork choice before block proposal (#3168) 2022-05-20 05:02:11 +00:00
chain_config.rs Run fork choice before block proposal (#3168) 2022-05-20 05:02:11 +00:00
early_attester_cache.rs Prevent attestation to future blocks from early attester cache (#3183) 2022-05-17 01:51:25 +00:00
errors.rs Run fork choice before block proposal (#3168) 2022-05-20 05:02:11 +00:00
eth1_chain.rs Removed PowBlock struct that never got used (#2813) 2021-12-02 14:29:20 +11:00
events.rs Implement API for block rewards (#2628) 2022-01-27 01:06:02 +00:00
execution_payload.rs Separate execution payloads in the DB (#3157) 2022-05-12 00:42:17 +00:00
fork_choice_signal.rs Run fork choice before block proposal (#3168) 2022-05-20 05:02:11 +00:00
fork_revert.rs Separate execution payloads in the DB (#3157) 2022-05-12 00:42:17 +00:00
head_tracker.rs Altair consensus changes and refactors (#2279) 2021-07-09 06:15:32 +00:00
historical_blocks.rs Separate execution payloads in the DB (#3157) 2022-05-12 00:42:17 +00:00
lib.rs Run fork choice before block proposal (#3168) 2022-05-20 05:02:11 +00:00
metrics.rs Run fork choice before block proposal (#3168) 2022-05-20 05:02:11 +00:00
migrate.rs Avoid parallel fork choice runs during sync (#3217) 2022-05-25 03:27:30 +00:00
naive_aggregation_pool.rs Update to Rust 1.59 and 2021 edition (#3038) 2022-02-25 00:10:17 +00:00
observed_aggregates.rs v2.2.0 (#3139) 2022-04-05 02:53:09 +00:00
observed_attesters.rs Ensure doppelganger detects attestations in blocks (#2495) 2021-08-09 02:43:03 +00:00
observed_block_producers.rs Doppelganger detection (#2230) 2021-07-31 03:50:52 +00:00
observed_operations.rs Clippy 1.49.0 updates and dht persistence test fix (#2156) 2021-01-19 00:34:28 +00:00
persisted_beacon_chain.rs Fix head tracker concurrency bugs (#1771) 2020-10-19 05:58:39 +00:00
persisted_fork_choice.rs Optimise balances cache in case of skipped slots (#2849) 2021-12-13 23:35:57 +00:00
pre_finalization_cache.rs Separate execution payloads in the DB (#3157) 2022-05-12 00:42:17 +00:00
proposer_prep_service.rs Don't log crits for missing EE before Bellatrix (#3150) 2022-04-11 23:14:47 +00:00
schema_change.rs Remove DB migrations for legacy database schemas (#3181) 2022-05-17 04:54:39 +00:00
shuffling_cache.rs Advance state to next slot after importing block (#2174) 2021-02-15 07:17:52 +00:00
snapshot_cache.rs Separate execution payloads in the DB (#3157) 2022-05-12 00:42:17 +00:00
state_advance_timer.rs Avoid parallel fork choice runs during sync (#3217) 2022-05-25 03:27:30 +00:00
sync_committee_verification.rs Update to Rust 1.59 and 2021 edition (#3038) 2022-02-25 00:10:17 +00:00
test_utils.rs Run fork choice before block proposal (#3168) 2022-05-20 05:02:11 +00:00
timeout_rw_lock.rs Add flag to disable lock timeouts (#2714) 2021-10-19 00:30:40 +00:00
validator_monitor.rs Update to Rust 1.59 and 2021 edition (#3038) 2022-02-25 00:10:17 +00:00
validator_pubkey_cache.rs Remove DB migrations for legacy database schemas (#3181) 2022-05-17 04:54:39 +00:00