lighthouse/beacon_node/beacon_chain/src/pre_finalization_cache.rs
Michael Sproul bcdd960ab1 Separate execution payloads in the DB (#3157)
## Proposed Changes

Reduce post-merge disk usage by not storing finalized execution payloads in Lighthouse's database.

⚠️ **This is achieved in a backwards-incompatible way for networks that have already merged** ⚠️. Kiln users and shadow fork enjoyers will be unable to downgrade after running the code from this PR. The upgrade migration may take several minutes to run, and can't be aborted after it begins.

The main changes are:

- New column in the database called `ExecPayload`, keyed by beacon block root.
- The `BeaconBlock` column now stores blinded blocks only.
- Lots of places that previously used full blocks now use blinded blocks, e.g. analytics APIs, block replay in the DB, etc.
- On finalization:
    - `prune_abanonded_forks` deletes non-canonical payloads whilst deleting non-canonical blocks.
    - `migrate_db` deletes finalized canonical payloads whilst deleting finalized states.
- Conversions between blinded and full blocks are implemented in a compositional way, duplicating some work from Sean's PR #3134.
- The execution layer has a new `get_payload_by_block_hash` method that reconstructs a payload using the EE's `eth_getBlockByHash` call.
   - I've tested manually that it works on Kiln, using Geth and Nethermind.
   - This isn't necessarily the most efficient method, and new engine APIs are being discussed to improve this: https://github.com/ethereum/execution-apis/pull/146.
   - We're depending on the `ethers` master branch, due to lots of recent changes. We're also using a workaround for https://github.com/gakonst/ethers-rs/issues/1134.
- Payload reconstruction is used in the HTTP API via `BeaconChain::get_block`, which is now `async`. Due to the `async` fn, the `blocking_json` wrapper has been removed.
- Payload reconstruction is used in network RPC to serve blocks-by-{root,range} responses. Here the `async` adjustment is messier, although I think I've managed to come up with a reasonable compromise: the handlers take the `SendOnDrop` by value so that they can drop it on _task completion_ (after the `fn` returns). Still, this is introducing disk reads onto core executor threads, which may have a negative performance impact (thoughts appreciated).

## Additional Info

- [x] For performance it would be great to remove the cloning of full blocks when converting them to blinded blocks to write to disk. I'm going to experiment with a `put_block` API that takes the block by value, breaks it into a blinded block and a payload, stores the blinded block, and then re-assembles the full block for the caller.
- [x] We should measure the latency of blocks-by-root and blocks-by-range responses.
- [x] We should add integration tests that stress the payload reconstruction (basic tests done, issue for more extensive tests: https://github.com/sigp/lighthouse/issues/3159)
- [x] We should (manually) test the schema v9 migration from several prior versions, particularly as blocks have changed on disk and some migrations rely on being able to load blocks.

Co-authored-by: Paul Hauner <paul@paulhauner.com>
2022-05-12 00:42:17 +00:00

120 lines
4.7 KiB
Rust

use crate::{BeaconChain, BeaconChainError, BeaconChainTypes};
use itertools::process_results;
use lru::LruCache;
use parking_lot::Mutex;
use slog::debug;
use std::time::Duration;
use types::Hash256;
const BLOCK_ROOT_CACHE_LIMIT: usize = 512;
const LOOKUP_LIMIT: usize = 8;
const METRICS_TIMEOUT: Duration = Duration::from_millis(100);
/// Cache for rejecting attestations to blocks from before finalization.
///
/// It stores a collection of block roots that are pre-finalization and therefore not known to fork
/// choice in `verify_head_block_is_known` during attestation processing.
#[derive(Default)]
pub struct PreFinalizationBlockCache {
cache: Mutex<Cache>,
}
struct Cache {
/// Set of block roots that are known to be pre-finalization.
block_roots: LruCache<Hash256, ()>,
/// Set of block roots that are the subject of single block lookups.
in_progress_lookups: LruCache<Hash256, ()>,
}
impl Default for Cache {
fn default() -> Self {
Cache {
block_roots: LruCache::new(BLOCK_ROOT_CACHE_LIMIT),
in_progress_lookups: LruCache::new(LOOKUP_LIMIT),
}
}
}
impl<T: BeaconChainTypes> BeaconChain<T> {
/// Check whether the block with `block_root` is known to be pre-finalization.
///
/// The provided `block_root` is assumed to be unknown to fork choice. I.e., it
/// is not known to be a descendant of the finalized block.
///
/// Return `true` if the attestation to this block should be rejected outright,
/// return `false` if more information is needed from a single-block-lookup.
pub fn is_pre_finalization_block(&self, block_root: Hash256) -> Result<bool, BeaconChainError> {
let mut cache = self.pre_finalization_block_cache.cache.lock();
// Check the cache to see if we already know this pre-finalization block root.
if cache.block_roots.contains(&block_root) {
return Ok(true);
}
// Avoid repeating the disk lookup for blocks that are already subject to a network lookup.
// Sync will take care of de-duplicating the single block lookups.
if cache.in_progress_lookups.contains(&block_root) {
return Ok(false);
}
// 1. Check memory for a recent pre-finalization block.
let is_recent_finalized_block = self.with_head(|head| {
process_results(
head.beacon_state.rev_iter_block_roots(&self.spec),
|mut iter| iter.any(|(_, root)| root == block_root),
)
.map_err(BeaconChainError::BeaconStateError)
})?;
if is_recent_finalized_block {
cache.block_roots.put(block_root, ());
return Ok(true);
}
// 2. Check on disk.
if self.store.get_blinded_block(&block_root)?.is_some() {
cache.block_roots.put(block_root, ());
return Ok(true);
}
// 3. Check the network with a single block lookup.
cache.in_progress_lookups.put(block_root, ());
if cache.in_progress_lookups.len() == LOOKUP_LIMIT {
// NOTE: we expect this to occur sometimes if a lot of blocks that we look up fail to be
// imported for reasons other than being pre-finalization. The cache will eventually
// self-repair in this case by replacing old entries with new ones until all the failed
// blocks have been flushed out. Solving this issue isn't as simple as hooking the
// beacon processor's functions that handle failed blocks because we need the block root
// and it has been erased from the `BlockError` by that point.
debug!(
self.log,
"Pre-finalization lookup cache is full";
);
}
Ok(false)
}
pub fn pre_finalization_block_rejected(&self, block_root: Hash256) {
// Future requests can know that this block is invalid without having to look it up again.
let mut cache = self.pre_finalization_block_cache.cache.lock();
cache.in_progress_lookups.pop(&block_root);
cache.block_roots.put(block_root, ());
}
}
impl PreFinalizationBlockCache {
pub fn block_processed(&self, block_root: Hash256) {
// Future requests will find this block in fork choice, so no need to cache it in the
// ongoing lookup cache any longer.
self.cache.lock().in_progress_lookups.pop(&block_root);
}
pub fn contains(&self, block_root: Hash256) -> bool {
self.cache.lock().block_roots.contains(&block_root)
}
pub fn metrics(&self) -> Option<(usize, usize)> {
let cache = self.cache.try_lock_for(METRICS_TIMEOUT)?;
Some((cache.block_roots.len(), cache.in_progress_lookups.len()))
}
}