lighthouse/beacon_node/beacon_chain/src/state_advance_timer.rs
Paul Hauner be4e261e74 Use async code when interacting with EL (#3244)
## Overview

This rather extensive PR achieves two primary goals:

1. Uses the finalized/justified checkpoints of fork choice (FC), rather than that of the head state.
2. Refactors fork choice, block production and block processing to `async` functions.

Additionally, it achieves:

- Concurrent forkchoice updates to the EL and cache pruning after a new head is selected.
- Concurrent "block packing" (attestations, etc) and execution payload retrieval during block production.
- Concurrent per-block-processing and execution payload verification during block processing.
- The `Arc`-ification of `SignedBeaconBlock` during block processing (it's never mutated, so why not?):
    - I had to do this to deal with sending blocks into spawned tasks.
    - Previously we were cloning the beacon block at least 2 times during each block processing, these clones are either removed or turned into cheaper `Arc` clones.
    - We were also `Box`-ing and un-`Box`-ing beacon blocks as they moved throughout the networking crate. This is not a big deal, but it's nice to avoid shifting things between the stack and heap.
    - Avoids cloning *all the blocks* in *every chain segment* during sync.
    - It also has the potential to clean up our code where we need to pass an *owned* block around so we can send it back in the case of an error (I didn't do much of this, my PR is already big enough 😅)
- The `BeaconChain::HeadSafetyStatus` struct was removed. It was an old relic from prior merge specs.

For motivation for this change, see https://github.com/sigp/lighthouse/pull/3244#issuecomment-1160963273

## Changes to `canonical_head` and `fork_choice`

Previously, the `BeaconChain` had two separate fields:

```
canonical_head: RwLock<Snapshot>,
fork_choice: RwLock<BeaconForkChoice>
```

Now, we have grouped these values under a single struct:

```
canonical_head: CanonicalHead {
  cached_head: RwLock<Arc<Snapshot>>,
  fork_choice: RwLock<BeaconForkChoice>
} 
```

Apart from ergonomics, the only *actual* change here is wrapping the canonical head snapshot in an `Arc`. This means that we no longer need to hold the `cached_head` (`canonical_head`, in old terms) lock when we want to pull some values from it. This was done to avoid deadlock risks by preventing functions from acquiring (and holding) the `cached_head` and `fork_choice` locks simultaneously.

## Breaking Changes

### The `state` (root) field in the `finalized_checkpoint` SSE event

Consider the scenario where epoch `n` is just finalized, but `start_slot(n)` is skipped. There are two state roots we might in the `finalized_checkpoint` SSE event:

1. The state root of the finalized block, which is `get_block(finalized_checkpoint.root).state_root`.
4. The state root at slot of `start_slot(n)`, which would be the state from (1), but "skipped forward" through any skip slots.

Previously, Lighthouse would choose (2). However, we can see that when [Teku generates that event](de2b2801c8/data/beaconrestapi/src/main/java/tech/pegasys/teku/beaconrestapi/handlers/v1/events/EventSubscriptionManager.java (L171-L182)) it uses [`getStateRootFromBlockRoot`](de2b2801c8/data/provider/src/main/java/tech/pegasys/teku/api/ChainDataProvider.java (L336-L341)) which uses (1).

I have switched Lighthouse from (2) to (1). I think it's a somewhat arbitrary choice between the two, where (1) is easier to compute and is consistent with Teku.

## Notes for Reviewers

I've renamed `BeaconChain::fork_choice` to `BeaconChain::recompute_head`. Doing this helped ensure I broke all previous uses of fork choice and I also find it more descriptive. It describes an action and can't be confused with trying to get a reference to the `ForkChoice` struct.

I've changed the ordering of SSE events when a block is received. It used to be `[block, finalized, head]` and now it's `[block, head, finalized]`. It was easier this way and I don't think we were making any promises about SSE event ordering so it's not "breaking".

I've made it so fork choice will run when it's first constructed. I did this because I wanted to have a cached version of the last call to `get_head`. Ensuring `get_head` has been run *at least once* means that the cached values doesn't need to wrapped in an `Option`. This was fairly simple, it just involved passing a `slot` to the constructor so it knows *when* it's being run. When loading a fork choice from the store and a slot clock isn't handy I've just used the `slot` that was saved in the `fork_choice_store`. That seems like it would be a faithful representation of the slot when we saved it.

I added the `genesis_time: u64` to the `BeaconChain`. It's small, constant and nice to have around.

Since we're using FC for the fin/just checkpoints, we no longer get the `0x00..00` roots at genesis. You can see I had to remove a work-around in `ef-tests` here: b56be3bc2. I can't find any reason why this would be an issue, if anything I think it'll be better since the genesis-alias has caught us out a few times (0x00..00 isn't actually a real root). Edit: I did find a case where the `network` expected the 0x00..00 alias and patched it here: 3f26ac3e2.

You'll notice a lot of changes in tests. Generally, tests should be functionally equivalent. Here are the things creating the most diff-noise in tests:
- Changing tests to be `tokio::async` tests.
- Adding `.await` to fork choice, block processing and block production functions.
- Refactor of the `canonical_head` "API" provided by the `BeaconChain`. E.g., `chain.canonical_head.cached_head()` instead of `chain.canonical_head.read()`.
- Wrapping `SignedBeaconBlock` in an `Arc`.
- In the `beacon_chain/tests/block_verification`, we can't use the `lazy_static` `CHAIN_SEGMENT` variable anymore since it's generated with an async function. We just generate it in each test, not so efficient but hopefully insignificant.

I had to disable `rayon` concurrent tests in the `fork_choice` tests. This is because the use of `rayon` and `block_on` was causing a panic.

Co-authored-by: Mac L <mjladson@pm.me>
2022-07-03 05:36:50 +00:00

475 lines
17 KiB
Rust

//! Provides a timer which runs in the tail-end of each slot and maybe advances the state of the
//! head block forward a single slot.
//!
//! This provides an optimization with the following benefits:
//!
//! 1. Removes the burden of a single, mandatory `per_slot_processing` call from the leading-edge of
//! block processing. This helps import blocks faster.
//! 2. Allows the node to learn of the shuffling for the next epoch, before the first block from
//! that epoch has arrived. This helps reduce gossip block propagation times.
//!
//! The downsides to this optimization are:
//!
//! 1. We are required to store an additional `BeaconState` for the head block. This consumes
//! memory.
//! 2. There's a possibility that the head block is never built upon, causing wasted CPU cycles.
use crate::validator_monitor::HISTORIC_EPOCHS as VALIDATOR_MONITOR_HISTORIC_EPOCHS;
use crate::{
beacon_chain::{ATTESTATION_CACHE_LOCK_TIMEOUT, BLOCK_PROCESSING_CACHE_LOCK_TIMEOUT},
snapshot_cache::StateAdvance,
BeaconChain, BeaconChainError, BeaconChainTypes,
};
use slog::{debug, error, warn, Logger};
use slot_clock::SlotClock;
use state_processing::per_slot_processing;
use std::sync::{
atomic::{AtomicBool, Ordering},
Arc,
};
use task_executor::TaskExecutor;
use tokio::time::{sleep, sleep_until, Instant};
use types::{AttestationShufflingId, EthSpec, Hash256, RelativeEpoch, Slot};
/// If the head slot is more than `MAX_ADVANCE_DISTANCE` from the current slot, then don't perform
/// the state advancement.
///
/// This avoids doing unnecessary work whilst the node is syncing or has perhaps been put to sleep
/// for some period of time.
const MAX_ADVANCE_DISTANCE: u64 = 4;
/// Similarly for fork choice: avoid the fork choice lookahead during sync.
const MAX_FORK_CHOICE_DISTANCE: u64 = 4;
#[derive(Debug)]
enum Error {
BeaconChain(BeaconChainError),
HeadMissingFromSnapshotCache(Hash256),
MaxDistanceExceeded {
current_slot: Slot,
head_slot: Slot,
},
StateAlreadyAdvanced {
block_root: Hash256,
},
BadStateSlot {
_state_slot: Slot,
_block_slot: Slot,
},
}
impl From<BeaconChainError> for Error {
fn from(e: BeaconChainError) -> Self {
Self::BeaconChain(e)
}
}
/// Provides a simple thread-safe lock to be used for task co-ordination. Practically equivalent to
/// `Mutex<()>`.
#[derive(Clone)]
struct Lock(Arc<AtomicBool>);
impl Lock {
/// Instantiate an unlocked self.
pub fn new() -> Self {
Self(Arc::new(AtomicBool::new(false)))
}
/// Lock self, returning `true` if the lock was already set.
pub fn lock(&self) -> bool {
self.0.fetch_or(true, Ordering::SeqCst)
}
/// Unlock self.
pub fn unlock(&self) {
self.0.store(false, Ordering::SeqCst);
}
}
/// Spawns the timer described in the module-level documentation.
pub fn spawn_state_advance_timer<T: BeaconChainTypes>(
executor: TaskExecutor,
beacon_chain: Arc<BeaconChain<T>>,
log: Logger,
) {
executor.spawn(
state_advance_timer(executor.clone(), beacon_chain, log),
"state_advance_timer",
);
}
/// Provides the timer described in the module-level documentation.
async fn state_advance_timer<T: BeaconChainTypes>(
executor: TaskExecutor,
beacon_chain: Arc<BeaconChain<T>>,
log: Logger,
) {
let is_running = Lock::new();
let slot_clock = &beacon_chain.slot_clock;
let slot_duration = slot_clock.slot_duration();
loop {
let duration_to_next_slot = match beacon_chain.slot_clock.duration_to_next_slot() {
Some(duration) => duration,
None => {
error!(log, "Failed to read slot clock");
// If we can't read the slot clock, just wait another slot.
sleep(slot_duration).await;
continue;
}
};
// Run the state advance 3/4 of the way through the slot (9s on mainnet).
let state_advance_offset = slot_duration / 4;
let state_advance_instant = if duration_to_next_slot > state_advance_offset {
Instant::now() + duration_to_next_slot - state_advance_offset
} else {
// Skip the state advance for the current slot and wait until the next one.
Instant::now() + duration_to_next_slot + slot_duration - state_advance_offset
};
// Run fork choice 23/24s of the way through the slot (11.5s on mainnet).
// We need to run after the state advance, so use the same condition as above.
let fork_choice_offset = slot_duration / 24;
let fork_choice_instant = if duration_to_next_slot > state_advance_offset {
Instant::now() + duration_to_next_slot - fork_choice_offset
} else {
Instant::now() + duration_to_next_slot + slot_duration - fork_choice_offset
};
// Wait for the state advance.
sleep_until(state_advance_instant).await;
// Compute the current slot here at approx 3/4 through the slot. Even though this slot is
// only used by fork choice we need to calculate it here rather than after the state
// advance, in case the state advance flows over into the next slot.
let current_slot = match beacon_chain.slot() {
Ok(slot) => slot,
Err(e) => {
warn!(
log,
"Unable to determine slot in state advance timer";
"error" => ?e
);
// If we can't read the slot clock, just wait another slot.
sleep(slot_duration).await;
continue;
}
};
// Only spawn the state advance task if the lock was previously free.
if !is_running.lock() {
let log = log.clone();
let beacon_chain = beacon_chain.clone();
let is_running = is_running.clone();
executor.spawn_blocking(
move || {
match advance_head(&beacon_chain, &log) {
Ok(()) => (),
Err(Error::BeaconChain(e)) => error!(
log,
"Failed to advance head state";
"error" => ?e
),
Err(Error::StateAlreadyAdvanced { block_root }) => debug!(
log,
"State already advanced on slot";
"block_root" => ?block_root
),
Err(Error::MaxDistanceExceeded {
current_slot,
head_slot,
}) => debug!(
log,
"Refused to advance head state";
"head_slot" => head_slot,
"current_slot" => current_slot,
),
other => warn!(
log,
"Did not advance head state";
"reason" => ?other
),
};
// Permit this blocking task to spawn again, next time the timer fires.
is_running.unlock();
},
"state_advance_blocking",
);
} else {
warn!(
log,
"State advance routine overloaded";
"msg" => "system resources may be overloaded"
)
}
// Run fork choice pre-emptively for the next slot. This processes most of the attestations
// from this slot off the hot path of block verification and production.
// Wait for the fork choice instant (which may already be past).
sleep_until(fork_choice_instant).await;
let log = log.clone();
let beacon_chain = beacon_chain.clone();
let next_slot = current_slot + 1;
executor.spawn(
async move {
// Don't run fork choice during sync.
if beacon_chain.best_slot() + MAX_FORK_CHOICE_DISTANCE < current_slot {
return;
}
if let Err(e) = beacon_chain.recompute_head_at_slot(next_slot).await {
warn!(
log,
"Error updating fork choice for next slot";
"error" => ?e,
"slot" => next_slot,
);
}
// Use a blocking task to avoid blocking the core executor whilst waiting for locks
// in `ForkChoiceSignalTx`.
beacon_chain.task_executor.clone().spawn_blocking(
move || {
// Signal block proposal for the next slot (if it happens to be waiting).
if let Some(tx) = &beacon_chain.fork_choice_signal_tx {
if let Err(e) = tx.notify_fork_choice_complete(next_slot) {
warn!(
log,
"Error signalling fork choice waiter";
"error" => ?e,
"slot" => next_slot,
);
}
}
},
"fork_choice_advance_signal_tx",
);
},
"fork_choice_advance",
);
}
}
/// Reads the `snapshot_cache` from the `beacon_chain` and attempts to take a clone of the
/// `BeaconState` of the head block. If it obtains this clone, the state will be advanced a single
/// slot then placed back in the `snapshot_cache` to be used for block verification.
///
/// See the module-level documentation for rationale.
fn advance_head<T: BeaconChainTypes>(
beacon_chain: &Arc<BeaconChain<T>>,
log: &Logger,
) -> Result<(), Error> {
let current_slot = beacon_chain.slot()?;
// These brackets ensure that the `head_slot` value is dropped before we run fork choice and
// potentially invalidate it.
//
// Fork-choice is not run *before* this function to avoid unnecessary calls whilst syncing.
{
let head_slot = beacon_chain.best_slot();
// Don't run this when syncing or if lagging too far behind.
if head_slot + MAX_ADVANCE_DISTANCE < current_slot {
return Err(Error::MaxDistanceExceeded {
current_slot,
head_slot,
});
}
}
let head_root = beacon_chain.head_beacon_block_root();
let (head_slot, head_state_root, mut state) = match beacon_chain
.snapshot_cache
.try_write_for(BLOCK_PROCESSING_CACHE_LOCK_TIMEOUT)
.ok_or(BeaconChainError::SnapshotCacheLockTimeout)?
.get_for_state_advance(head_root)
{
StateAdvance::AlreadyAdvanced => {
return Err(Error::StateAlreadyAdvanced {
block_root: head_root,
})
}
StateAdvance::BlockNotFound => return Err(Error::HeadMissingFromSnapshotCache(head_root)),
StateAdvance::State {
state,
state_root,
block_slot,
} => (block_slot, state_root, *state),
};
let initial_slot = state.slot();
let initial_epoch = state.current_epoch();
let state_root = if state.slot() == head_slot {
Some(head_state_root)
} else {
// Protect against advancing a state more than a single slot.
//
// Advancing more than one slot without storing the intermediate state would corrupt the
// database. Future works might store temporary, intermediate states inside this function.
return Err(Error::BadStateSlot {
_block_slot: head_slot,
_state_slot: state.slot(),
});
};
// Advance the state a single slot.
if let Some(summary) = per_slot_processing(&mut state, state_root, &beacon_chain.spec)
.map_err(BeaconChainError::from)?
{
// Expose Prometheus metrics.
if let Err(e) = summary.observe_metrics() {
error!(
log,
"Failed to observe epoch summary metrics";
"src" => "state_advance_timer",
"error" => ?e
);
}
// Only notify the validator monitor for recent blocks.
if state.current_epoch() + VALIDATOR_MONITOR_HISTORIC_EPOCHS as u64
>= current_slot.epoch(T::EthSpec::slots_per_epoch())
{
// Potentially create logs/metrics for locally monitored validators.
if let Err(e) = beacon_chain
.validator_monitor
.read()
.process_validator_statuses(state.current_epoch(), &summary, &beacon_chain.spec)
{
error!(
log,
"Unable to process validator statuses";
"error" => ?e
);
}
}
}
debug!(
log,
"Advanced head state one slot";
"head_root" => ?head_root,
"state_slot" => state.slot(),
"current_slot" => current_slot,
);
// Build the current epoch cache, to prepare to compute proposer duties.
state
.build_committee_cache(RelativeEpoch::Current, &beacon_chain.spec)
.map_err(BeaconChainError::from)?;
// Build the next epoch cache, to prepare to compute attester duties.
state
.build_committee_cache(RelativeEpoch::Next, &beacon_chain.spec)
.map_err(BeaconChainError::from)?;
// If the `pre_state` is in a later epoch than `state`, pre-emptively add the proposer shuffling
// for the state's current epoch and the committee cache for the state's next epoch.
if initial_epoch < state.current_epoch() {
// Update the proposer cache.
//
// We supply the `head_root` as the decision block since the prior `if` statement guarantees
// the head root is the latest block from the prior epoch.
beacon_chain
.beacon_proposer_cache
.lock()
.insert(
state.current_epoch(),
head_root,
state
.get_beacon_proposer_indices(&beacon_chain.spec)
.map_err(BeaconChainError::from)?,
state.fork(),
)
.map_err(BeaconChainError::from)?;
// Update the attester cache.
let shuffling_id = AttestationShufflingId::new(head_root, &state, RelativeEpoch::Next)
.map_err(BeaconChainError::from)?;
let committee_cache = state
.committee_cache(RelativeEpoch::Next)
.map_err(BeaconChainError::from)?;
beacon_chain
.shuffling_cache
.try_write_for(ATTESTATION_CACHE_LOCK_TIMEOUT)
.ok_or(BeaconChainError::AttestationCacheLockTimeout)?
.insert(shuffling_id.clone(), committee_cache);
debug!(
log,
"Primed proposer and attester caches";
"head_root" => ?head_root,
"next_epoch_shuffling_root" => ?shuffling_id.shuffling_decision_block,
"state_epoch" => state.current_epoch(),
"current_epoch" => current_slot.epoch(T::EthSpec::slots_per_epoch()),
);
}
// Apply the state to the attester cache, if the cache deems it interesting.
beacon_chain
.attester_cache
.maybe_cache_state(&state, head_root, &beacon_chain.spec)
.map_err(BeaconChainError::from)?;
let final_slot = state.slot();
// Insert the advanced state back into the snapshot cache.
beacon_chain
.snapshot_cache
.try_write_for(BLOCK_PROCESSING_CACHE_LOCK_TIMEOUT)
.ok_or(BeaconChainError::SnapshotCacheLockTimeout)?
.update_pre_state(head_root, state)
.ok_or(Error::HeadMissingFromSnapshotCache(head_root))?;
// If we have moved into the next slot whilst processing the state then this function is going
// to become ineffective and likely become a hindrance as we're stealing the tree hash cache
// from the snapshot cache (which may force the next block to rebuild a new one).
//
// If this warning occurs very frequently on well-resourced machines then we should consider
// starting it earlier in the slot. Otherwise, it's a good indication that the machine is too
// slow/overloaded and will be useful information for the user.
let starting_slot = current_slot;
let current_slot = beacon_chain.slot()?;
if starting_slot < current_slot {
warn!(
log,
"State advance too slow";
"head_root" => %head_root,
"advanced_slot" => final_slot,
"current_slot" => current_slot,
"starting_slot" => starting_slot,
"msg" => "system resources may be overloaded",
);
}
debug!(
log,
"Completed state advance";
"head_root" => ?head_root,
"advanced_slot" => final_slot,
"initial_slot" => initial_slot,
);
Ok(())
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn lock() {
let lock = Lock::new();
assert!(!lock.lock());
assert!(lock.lock());
assert!(lock.lock());
lock.unlock();
assert!(!lock.lock());
assert!(lock.lock());
}
}