lighthouse/beacon_node/beacon_chain/src/lib.rs
Divma 8c69d57c2c Pause sync when EE is offline (#3428)
## Issue Addressed

#3032

## Proposed Changes

Pause sync when the EE is offline. The changes include three main parts:
- Online/offline notification system
- Pause sync
- Resume sync

#### Online/offline notification system
- The engine state is now guarded behind a new struct `State` that ensures every change is correctly notified. Notifications are only sent when the state actually changes. As before, the `State` sits behind a `RwLock`, which remains the synchronization mechanism.
- The actual notification channel is a [tokio::sync::watch](https://docs.rs/tokio/latest/tokio/sync/watch/index.html), which ensures only the last value remains in the receiver channel. This way we don't need to worry about message ordering etc.
- Sync waits for state changes concurrently with normal messages.
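A minimal sketch of this notification pattern, assuming a hypothetical `update` method and a simplified two-variant engine state (the real type is the re-exported `execution_layer::EngineState`; all names here are illustrative, not the actual implementation):

```rust
use parking_lot::RwLock;
use tokio::sync::watch;

/// Illustrative engine state; the real crate re-exports `execution_layer::EngineState`.
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
pub enum EngineState {
    Online,
    Offline,
}

/// Guards the engine state so that every change goes through `update`,
/// which notifies subscribers only when the value actually changes.
pub struct State {
    current: RwLock<EngineState>,
    notifier: watch::Sender<EngineState>,
}

impl State {
    pub fn new(initial: EngineState) -> (Self, watch::Receiver<EngineState>) {
        let (notifier, receiver) = watch::channel(initial);
        let state = Self {
            current: RwLock::new(initial),
            notifier,
        };
        (state, receiver)
    }

    /// Update the state, notifying subscribers only on an actual change.
    pub fn update(&self, new_state: EngineState) {
        let mut current = self.current.write();
        if *current != new_state {
            *current = new_state;
            // A `watch` channel keeps only the latest value, so receivers
            // never observe stale intermediate states or depend on ordering.
            let _ = self.notifier.send(new_state);
        }
    }
}
```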

#### Pause Sync
Sync has four components, pausing is done differently in each:
- **Block lookups**: Disabled while in this state. We drop current requests and don't search for new blocks. Block lookups are infrequent and I don't think it's worth the extra logic of keeping these and delaying processing. If we later see that this is required, we can add it.
- **Parent lookups**: Disabled while in this state. We drop current requests and don't search for new parents. Parent lookups are even less frequent and I don't think it's worth the extra logic of keeping these and delaying processing. If we later see that this is required, we can add it.
- **Range**: Chains don't send batches for processing to the beacon processor. This is easily done by guarding the channel to the beacon processor and giving chains access to it only while the EE is responsive (see the sketch after this list). I find this the simplest and most powerful approach: we don't need to deal with new sync states, and chain segments added while the EE is offline follow the same logic without needing to synchronize a shared state among them. Another advantage of passive pause vs active pause is that we can still keep track of actively advertised chain segments, so on resume we don't need to re-evaluate all our peers.
- **Backfill**: Not affected by EE states; we don't pause.
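A sketch of the passive pause for range sync described above. `ChainSegmentRequest`, `BeaconProcessorSend` and `SyncNetworkContext` are hypothetical stand-ins for the real types, not the actual Lighthouse API:

```rust
use tokio::sync::mpsc;

/// Hypothetical work item a range chain sends for processing.
pub struct ChainSegmentRequest;

/// Hypothetical wrapper around the channel into the beacon processor.
pub struct BeaconProcessorSend(mpsc::Sender<ChainSegmentRequest>);

/// Hypothetical sync-side context that owns the channel and tracks the EE state.
pub struct SyncNetworkContext {
    beacon_processor: BeaconProcessorSend,
    execution_engine_online: bool,
}

impl SyncNetworkContext {
    /// Range chains must go through this accessor; while the EE is offline it
    /// returns `None`, so batches simply stop being forwarded ("passive" pause)
    /// while chains keep tracking their advertised segments.
    pub fn beacon_processor_if_enabled(&self) -> Option<&BeaconProcessorSend> {
        if self.execution_engine_online {
            Some(&self.beacon_processor)
        } else {
            None
        }
    }
}
```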

#### Resume Sync
- **Block lookups**: Enabled again.
- **Parent lookups**: Enabled again.
- **Range**: Active resume. Since the only thing pausing really does for range is stop sending batches for processing, resuming makes every chain that is holding ready-for-processing batches send them (see the sketch after this list).
- **Backfill**: Not affected by EE states; no need to resume.
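A sketch of how sync could wait for engine state changes concurrently with its normal messages and trigger the resume actions above. `SyncMessage`, `handle_message` and `resume_range_sync` are illustrative stand-ins under the same assumed `EngineState` as the earlier sketch:

```rust
use tokio::sync::{mpsc, watch};

/// Illustrative engine state (see the notification sketch above).
#[derive(Clone, Copy)]
enum EngineState {
    Online,
    Offline,
}

/// Illustrative sync manager message type.
enum SyncMessage {
    RpcBlock,
    RangeBatch,
    // ...
}

fn handle_message(_msg: SyncMessage) { /* normal sync processing */ }
fn resume_range_sync() { /* chains holding ready-for-processing batches send them */ }

async fn sync_loop(
    mut messages: mpsc::Receiver<SyncMessage>,
    mut engine_state: watch::Receiver<EngineState>,
) {
    loop {
        tokio::select! {
            Some(msg) = messages.recv() => handle_message(msg),
            // `changed()` resolves only when the watched value is replaced,
            // so the loop is not woken by repeated identical states.
            Ok(()) = engine_state.changed() => {
                match *engine_state.borrow() {
                    // Re-enable lookups and make range chains flush held batches.
                    EngineState::Online => resume_range_sync(),
                    // Drop block and parent lookups; range pauses passively.
                    EngineState::Offline => {}
                }
            }
            else => break,
        }
    }
}
```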

## Additional Info

**QUESTION**: Originally I made this notify and change on the synced state, but @pawanjay176, in talks with @paulhauner, concluded we only need to track online/offline states. The upcheck function mentions extra checks to keep a very up-to-date sync status to aid the networking stack; however, the only need the networking stack has for it is this one. I added a TODO to review whether the extra check can be removed.

Next gen of #3094

Will work best with #3439 

Co-authored-by: Pawan Dhananjay <pawandhananjay@gmail.com>
2022-08-24 23:34:56 +00:00

#![recursion_limit = "128"] // For lazy-static
pub mod attestation_verification;
mod attester_cache;
mod beacon_chain;
mod beacon_fork_choice_store;
pub mod beacon_proposer_cache;
mod beacon_snapshot;
pub mod block_reward;
mod block_times_cache;
mod block_verification;
pub mod builder;
pub mod canonical_head;
pub mod chain_config;
mod early_attester_cache;
mod errors;
pub mod eth1_chain;
pub mod events;
pub mod execution_payload;
pub mod fork_choice_signal;
pub mod fork_revert;
mod head_tracker;
pub mod historical_blocks;
pub mod merge_readiness;
mod metrics;
pub mod migrate;
mod naive_aggregation_pool;
mod observed_aggregates;
mod observed_attesters;
mod observed_block_producers;
pub mod observed_operations;
pub mod otb_verification_service;
mod persisted_beacon_chain;
mod persisted_fork_choice;
mod pre_finalization_cache;
pub mod proposer_prep_service;
pub mod schema_change;
mod shuffling_cache;
mod snapshot_cache;
pub mod state_advance_timer;
pub mod sync_committee_verification;
pub mod test_utils;
mod timeout_rw_lock;
pub mod validator_monitor;
pub mod validator_pubkey_cache;
pub use self::beacon_chain::{
    AttestationProcessingOutcome, BeaconChain, BeaconChainTypes, BeaconStore, ChainSegmentResult,
    CountUnrealized, ForkChoiceError, ProduceBlockVerification, StateSkipConfig, WhenSlotSkipped,
    INVALID_FINALIZED_MERGE_TRANSITION_BLOCK_SHUTDOWN_REASON,
    INVALID_JUSTIFIED_PAYLOAD_SHUTDOWN_REASON, MAXIMUM_GOSSIP_CLOCK_DISPARITY,
};
pub use self::beacon_snapshot::BeaconSnapshot;
pub use self::chain_config::ChainConfig;
pub use self::errors::{BeaconChainError, BlockProductionError};
pub use self::historical_blocks::HistoricalBlockError;
pub use attestation_verification::Error as AttestationError;
pub use beacon_fork_choice_store::{BeaconForkChoiceStore, Error as ForkChoiceStoreError};
pub use block_verification::{BlockError, ExecutionPayloadError, GossipVerifiedBlock};
pub use canonical_head::{CachedHead, CanonicalHead, CanonicalHeadRwLock};
pub use eth1_chain::{Eth1Chain, Eth1ChainBackend};
pub use events::ServerSentEventHandler;
pub use execution_layer::EngineState;
pub use fork_choice::{ExecutionStatus, ForkchoiceUpdateParameters};
pub use metrics::scrape_for_metrics;
pub use parking_lot;
pub use slot_clock;
pub use state_processing::per_block_processing::errors::{
    AttestationValidationError, AttesterSlashingValidationError, DepositValidationError,
    ExitValidationError, ProposerSlashingValidationError,
};
pub use store;
pub use timeout_rw_lock::TimeoutRwLock;
pub use types;