Super Silky Smooth Syncs, like a Sir (#1628)

## Issue Addressed
In principle, this closes #1551, but more generally it contains improvements for performance, maintainability and readability. The logic for the optimistic sync is actually simple.

## Proposed Changes
There are miscellaneous things here:
- Remove the unnecessary `BatchProcessResult::Partial` variant to simplify the batch validation logic
- Make batches a state machine. This ensures batch state transitions respect our logic (previously this was enforced by moving batches between `Vec`s) and eases the cognitive load of the `SyncingChain` struct; see the sketch after this list
- Move most batch-related logic to the batch itself
- Remove `PendingBatches` in favor of a map of peers to their batches. This avoids duplicating peers inside the chain (previously in both `peer_pool` and `pending_batches`)
- Add the `#[must_use]` attribute to `ProcessingResult` so that chains that request to be removed are handled accordingly. This also means that chains are now removed in more places than before, to account for previously unhandled cases
- Store batches in a sorted map (`BTreeMap`). Access is not O(1), but since the number of _active_ batches is bounded this should be fast, and it saves performing hashing ops. Batches are indexed by the epoch at which they start and kept sorted, to easily handle chain advancements (range logic)
- Produce the chain Id from its identifying fields: the target root and target slot. This guarantees there can't be duplicated chains and allows chains to be searched consistently by either Id or checkpoint
- Fix `chain_id` not being present in all chain loggers
- Handle the mega-edge case where the processor's work queue is full and the batch can't be sent. Previously the chain would lose the blocks and remain in a "syncing" state, waiting for a result that would never arrive, effectively stalling sync
- When a batch imports blocks, or the chain starts syncing with a local finalized epoch greater than the chain's start epoch, the chain is advanced instead of reset. This avoids losing download progress and validates batches faster. It also means that the old `start_epoch` now means "current first unvalidated batch", so it represents the progress of the chain more accurately
- Status peers from the same chain in batches, to reduce `Arc` accesses
- A couple of cases where the retry counters for a batch were not updated or checked are now handled via the batch state machine. Basically, if we forget to do it, we will now know
- Do not send the blocks back from the processor to the batch. Instead, register the attempt before sending the blocks (this does not count as a failed attempt)
- When re-requesting a batch, try to avoid not only the last failed peer, but all previously failed peers
- Optimize requesting batches ahead in the buffer by shuffling idle peers just once (this addresses a couple of old TODOs in the code)
- In `chain_collection`, store chains in a map keyed by their Id
- Include a mapping from request Ids to the (chain, batch) pair that requested the batch, to avoid the double O(n) search on block responses
- Other stuff:
  - impl `slog::KV` for batches
  - impl `slog::KV` for syncing chains
  - PSA: when logging, we can use `%thing` if `thing` implements `Display`; likewise, `?thing` works if `thing` implements `Debug`
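
To make the state-machine change concrete, here is the core of the new `BatchState` type, condensed from the `batch.rs` diff below (the transition methods such as `start_downloading_from_peer` and `processing_completed` are in the full diff):

```rust
// Condensed from `batch.rs` (see diff below). Every batch owns its state, and
// each transition method `poison`s the state, matches on the old value and
// writes the new one, so an illegal transition is an `unreachable!` instead of
// a silent logic error.
pub enum BatchState<T: EthSpec> {
    /// The batch has failed either downloading or processing, but can be requested again.
    AwaitingDownload,
    /// The batch is being downloaded.
    Downloading(PeerId, Vec<SignedBeaconBlock<T>>),
    /// The batch has been completely downloaded and is ready for processing.
    AwaitingProcessing(PeerId, Vec<SignedBeaconBlock<T>>),
    /// The batch is being processed.
    Processing(Attempt),
    /// Successfully processed, but only considered valid once the next
    /// sequential batch imports at least one block.
    AwaitingValidation(Attempt),
    /// Intermediate state used while a transition is in progress.
    Poisoned,
    /// Maxed out the allowed download/processing attempts; unrecoverable.
    Failed,
}
```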

### Optimistic syncing:
First, try the batch that contains the current head. If that batch imports any block, advance the chain. If it doesn't, leave the optimistic batch in place for future use as long as it falls inside the current processing window; otherwise drop it. The download failure tolerance for this batch is the same as for any other batch, but it is only attempted for processing once.
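
A minimal sketch of that decision flow, assuming a `BTreeMap` of batches keyed by start epoch as described above. All names here are illustrative only: the real implementation lives in `chain.rs`, whose diff is suppressed below.

```rust
use std::collections::BTreeMap;

type Epoch = u64;
struct Batch; // stand-in for the real `BatchInfo<T>`

struct Chain {
    /// Batches indexed by their starting epoch, as described above.
    batches: BTreeMap<Epoch, Batch>,
    /// Epoch of the first unvalidated batch (the old `start_epoch`).
    processing_target: Epoch,
}

impl Chain {
    /// Hypothetical handler for the result of processing the optimistic
    /// (head-containing) batch.
    fn optimistic_batch_processed(&mut self, batch_epoch: Epoch, imported_any_block: bool) {
        if imported_any_block {
            // The batch was useful: everything before it can be validated and
            // the chain advances past it.
            self.advance_to(batch_epoch);
        } else if batch_epoch >= self.processing_target {
            // Inside the current processing window: keep the downloaded batch
            // around so regular (sequential) processing can reuse it.
        } else {
            // Behind the window: the batch is no longer useful, drop it.
            self.batches.remove(&batch_epoch);
        }
    }

    fn advance_to(&mut self, epoch: Epoch) {
        // `split_off` keeps batches from `epoch` onwards; earlier ones are
        // considered validated and discarded.
        self.batches = self.batches.split_off(&epoch);
        self.processing_target = self.processing_target.max(epoch);
    }
}
```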



Co-authored-by: Age Manning <Age@AgeManning.com>
Commit: b8013b7b2c (parent: 80e52a0263)
Author: divma, 2020-09-23 06:29:55 +00:00
13 changed files with 1480 additions and 1199 deletions


```diff
@@ -35,7 +35,6 @@ tokio-util = { version = "0.3.1", features = ["codec", "compat"] }
 discv5 = { version = "0.1.0-alpha.10", features = ["libp2p"] }
 tiny-keccak = "2.0.2"
 environment = { path = "../../lighthouse/environment" }
-# TODO: Remove rand crate for mainnet
 rand = "0.7.3"
 regex = "1.3.9"
```


```diff
@@ -32,6 +32,7 @@ tokio = { version = "0.2.21", features = ["full"] }
 parking_lot = "0.11.0"
 smallvec = "1.4.1"
 # TODO: Remove rand crate for mainnet
+# NOTE: why?
 rand = "0.7.3"
 fnv = "1.0.6"
 rlp = "0.4.5"
```


```diff
@@ -28,39 +28,26 @@ pub fn handle_chain_segment<T: BeaconChainTypes>(
     match process_id {
         // this a request from the range sync
         ProcessId::RangeBatchId(chain_id, epoch) => {
-            let len = downloaded_blocks.len();
-            let start_slot = if len > 0 {
-                downloaded_blocks[0].message.slot.as_u64()
-            } else {
-                0
-            };
-            let end_slot = if len > 0 {
-                downloaded_blocks[len - 1].message.slot.as_u64()
-            } else {
-                0
-            };
-            debug!(log, "Processing batch"; "batch_epoch" => epoch, "blocks" => downloaded_blocks.len(), "first_block_slot" => start_slot, "last_block_slot" => end_slot, "service" => "sync");
+            let start_slot = downloaded_blocks.first().map(|b| b.message.slot.as_u64());
+            let end_slot = downloaded_blocks.last().map(|b| b.message.slot.as_u64());
+            let sent_blocks = downloaded_blocks.len();

             let result = match process_blocks(chain, downloaded_blocks.iter(), &log) {
                 (_, Ok(_)) => {
-                    debug!(log, "Batch processed"; "batch_epoch" => epoch , "first_block_slot" => start_slot, "last_block_slot" => end_slot, "service"=> "sync");
-                    BatchProcessResult::Success
+                    debug!(log, "Batch processed"; "batch_epoch" => epoch, "first_block_slot" => start_slot,
+                        "last_block_slot" => end_slot, "processed_blocks" => sent_blocks, "service"=> "sync");
+                    BatchProcessResult::Success(sent_blocks > 0)
                 }
-                (imported_blocks, Err(e)) if imported_blocks > 0 => {
-                    debug!(log, "Batch processing failed but imported some blocks";
-                        "batch_epoch" => epoch, "error" => e, "imported_blocks"=> imported_blocks, "service" => "sync");
-                    BatchProcessResult::Partial
-                }
-                (_, Err(e)) => {
-                    debug!(log, "Batch processing failed"; "batch_epoch" => epoch, "error" => e, "service" => "sync");
-                    BatchProcessResult::Failed
+                (imported_blocks, Err(e)) => {
+                    debug!(log, "Batch processing failed"; "batch_epoch" => epoch, "first_block_slot" => start_slot,
+                        "last_block_slot" => end_slot, "error" => e, "imported_blocks" => imported_blocks, "service" => "sync");
+                    BatchProcessResult::Failed(imported_blocks > 0)
                 }
             };

             let msg = SyncMessage::BatchProcessed {
                 chain_id,
                 epoch,
-                downloaded_blocks,
                 result,
             };
             sync_send.send(msg).unwrap_or_else(|_| {
@@ -70,7 +57,7 @@ pub fn handle_chain_segment<T: BeaconChainTypes>(
                 );
             });
         }
-        // this a parent lookup request from the sync manager
+        // this is a parent lookup request from the sync manager
         ProcessId::ParentLookup(peer_id, chain_head) => {
             debug!(
                 log, "Processing parent lookup";
@@ -81,7 +68,7 @@ pub fn handle_chain_segment<T: BeaconChainTypes>(
             // reverse
             match process_blocks(chain, downloaded_blocks.iter().rev(), &log) {
                 (_, Err(e)) => {
-                    debug!(log, "Parent lookup failed"; "last_peer_id" => format!("{}", peer_id), "error" => e);
+                    debug!(log, "Parent lookup failed"; "last_peer_id" => %peer_id, "error" => e);
                     sync_send
                         .send(SyncMessage::ParentLookupFailed{peer_id, chain_head})
                         .unwrap_or_else(|_| {
@@ -114,13 +101,7 @@ fn process_blocks<
     match chain.process_chain_segment(blocks) {
         ChainSegmentResult::Successful { imported_blocks } => {
             metrics::inc_counter(&metrics::BEACON_PROCESSOR_CHAIN_SEGMENT_SUCCESS_TOTAL);
-            if imported_blocks == 0 {
-                debug!(log, "All blocks already known");
-            } else {
-                debug!(
-                    log, "Imported blocks from network";
-                    "count" => imported_blocks,
-                );
+            if imported_blocks > 0 {
                 // Batch completed successfully with at least one block, run fork choice.
                 run_fork_choice(chain, log);
             }
@@ -153,7 +134,7 @@ fn run_fork_choice<T: BeaconChainTypes>(chain: Arc<BeaconChain<T>>, log: &slog::
         Err(e) => error!(
             log,
             "Fork choice failed";
-            "error" => format!("{:?}", e),
+            "error" => ?e,
             "location" => "batch import error"
         ),
     }
@@ -219,7 +200,7 @@ fn handle_failed_chain_segment<T: EthSpec>(
             warn!(
                 log, "BlockProcessingFailure";
                 "msg" => "unexpected condition in processing block.",
-                "outcome" => format!("{:?}", e)
+                "outcome" => ?e,
             );
             Err(format!("Internal error whilst processing block: {:?}", e))
@@ -228,7 +209,7 @@ fn handle_failed_chain_segment<T: EthSpec>(
             debug!(
                 log, "Invalid block received";
                 "msg" => "peer sent invalid block",
-                "outcome" => format!("{:?}", other),
+                "outcome" => %other,
             );
             Err(format!("Peer sent invalid block. Reason: {:?}", other))
```


```diff
@@ -535,9 +535,10 @@ impl<T: BeaconChainTypes> Worker<T> {
     ///
     /// Creates a log if there is an interal error.
     fn send_sync_message(&self, message: SyncMessage<T::EthSpec>) {
-        self.sync_tx
-            .send(message)
-            .unwrap_or_else(|_| error!(self.log, "Could not send message to the sync service"));
+        self.sync_tx.send(message).unwrap_or_else(|e| {
+            error!(self.log, "Could not send message to the sync service";
+                "error" => %e)
+        });
     }

     /// Handle an error whilst verifying an `Attestation` or `SignedAggregateAndProof` from the
```


```diff
@@ -82,10 +82,11 @@ impl<T: BeaconChainTypes> Processor<T> {
     }

     fn send_to_sync(&mut self, message: SyncMessage<T::EthSpec>) {
-        self.sync_send.send(message).unwrap_or_else(|_| {
+        self.sync_send.send(message).unwrap_or_else(|e| {
             warn!(
                 self.log,
                 "Could not send message to the sync service";
+                "error" => %e,
             )
         });
     }
@@ -691,9 +692,10 @@ impl<T: EthSpec> HandlerNetworkContext<T> {
     /// Sends a message to the network task.
     fn inform_network(&mut self, msg: NetworkMessage<T>) {
+        let msg_r = &format!("{:?}", msg);
         self.network_send
             .send(msg)
-            .unwrap_or_else(|_| warn!(self.log, "Could not send message to the network service"))
+            .unwrap_or_else(|e| warn!(self.log, "Could not send message to the network service"; "error" => %e, "message" => msg_r))
     }

     /// Disconnects and ban's a peer, sending a Goodbye request with the associated reason.
```


```diff
@@ -29,9 +29,9 @@
 //!
 //! Block Lookup
 //!
-//! To keep the logic maintained to the syncing thread (and manage the request_ids), when a block needs to be searched for (i.e
-//! if an attestation references an unknown block) this manager can search for the block and
-//! subsequently search for parents if needed.
+//! To keep the logic maintained to the syncing thread (and manage the request_ids), when a block
+//! needs to be searched for (i.e if an attestation references an unknown block) this manager can
+//! search for the block and subsequently search for parents if needed.

 use super::network_context::SyncNetworkContext;
 use super::peer_sync_info::{PeerSyncInfo, PeerSyncType};
@@ -106,7 +106,6 @@ pub enum SyncMessage<T: EthSpec> {
     BatchProcessed {
         chain_id: ChainId,
         epoch: Epoch,
-        downloaded_blocks: Vec<SignedBeaconBlock<T>>,
         result: BatchProcessResult,
     },
@@ -123,12 +122,10 @@ pub enum SyncMessage<T: EthSpec> {
 // TODO: When correct batch error handling occurs, we will include an error type.
 #[derive(Debug)]
 pub enum BatchProcessResult {
-    /// The batch was completed successfully.
-    Success,
-    /// The batch processing failed.
-    Failed,
-    /// The batch processing failed but managed to import at least one block.
-    Partial,
+    /// The batch was completed successfully. It carries whether the sent batch contained blocks.
+    Success(bool),
+    /// The batch processing failed. It carries whether the processing imported any block.
+    Failed(bool),
 }

 /// Maintains a sequential list of parents to lookup and the lookup's current state.
@@ -275,9 +272,9 @@ impl<T: BeaconChainTypes> SyncManager<T> {
         match local_peer_info.peer_sync_type(&remote) {
             PeerSyncType::FullySynced => {
                 trace!(self.log, "Peer synced to our head found";
-                    "peer" => format!("{:?}", peer_id),
+                    "peer" => %peer_id,
                     "peer_head_slot" => remote.head_slot,
                     "local_head_slot" => local_peer_info.head_slot,
                 );
                 self.synced_peer(&peer_id, remote);
                 // notify the range sync that a peer has been added
@@ -285,11 +282,11 @@ impl<T: BeaconChainTypes> SyncManager<T> {
             }
             PeerSyncType::Advanced => {
                 trace!(self.log, "Useful peer for sync found";
-                    "peer" => format!("{:?}", peer_id),
+                    "peer" => %peer_id,
                     "peer_head_slot" => remote.head_slot,
                     "local_head_slot" => local_peer_info.head_slot,
                     "peer_finalized_epoch" => remote.finalized_epoch,
                     "local_finalized_epoch" => local_peer_info.finalized_epoch,
                 );
                 // There are few cases to handle here:
@@ -908,14 +905,12 @@ impl<T: BeaconChainTypes> SyncManager<T> {
             SyncMessage::BatchProcessed {
                 chain_id,
                 epoch,
-                downloaded_blocks,
                 result,
             } => {
                 self.range_sync.handle_block_process_result(
                     &mut self.network,
                     chain_id,
                     epoch,
-                    downloaded_blocks,
                     result,
                 );
             }
```


```diff
@@ -1,11 +1,14 @@
 //! Provides network functionality for the Syncing thread. This fundamentally wraps a network
 //! channel and stores a global RPC ID to perform requests.

+use super::range_sync::{BatchId, ChainId};
+use super::RequestId as SyncRequestId;
 use crate::router::processor::status_message;
 use crate::service::NetworkMessage;
 use beacon_chain::{BeaconChain, BeaconChainTypes};
 use eth2_libp2p::rpc::{BlocksByRangeRequest, BlocksByRootRequest, GoodbyeReason, RequestId};
 use eth2_libp2p::{Client, NetworkGlobals, PeerAction, PeerId, Request};
+use fnv::FnvHashMap;
 use slog::{debug, trace, warn};
 use std::sync::Arc;
 use tokio::sync::mpsc;
@@ -21,7 +24,11 @@ pub struct SyncNetworkContext<T: EthSpec> {
     network_globals: Arc<NetworkGlobals<T>>,

     /// A sequential ID for all RPC requests.
-    request_id: usize,
+    request_id: SyncRequestId,
+
+    /// BlocksByRange requests made by range syncing chains.
+    range_requests: FnvHashMap<SyncRequestId, (ChainId, BatchId)>,

     /// Logger for the `SyncNetworkContext`.
     log: slog::Logger,
 }
@@ -36,6 +43,7 @@ impl<T: EthSpec> SyncNetworkContext<T> {
             network_send,
             network_globals,
             request_id: 1,
+            range_requests: FnvHashMap::default(),
             log,
         }
     }
@@ -50,24 +58,26 @@ impl<T: EthSpec> SyncNetworkContext<T> {
             .unwrap_or_default()
     }

-    pub fn status_peer<U: BeaconChainTypes>(
+    pub fn status_peers<U: BeaconChainTypes>(
         &mut self,
         chain: Arc<BeaconChain<U>>,
-        peer_id: PeerId,
+        peers: impl Iterator<Item = PeerId>,
     ) {
         if let Some(status_message) = status_message(&chain) {
-            debug!(
-                self.log,
-                "Sending Status Request";
-                "peer" => format!("{:?}", peer_id),
-                "fork_digest" => format!("{:?}", status_message.fork_digest),
-                "finalized_root" => format!("{:?}", status_message.finalized_root),
-                "finalized_epoch" => format!("{:?}", status_message.finalized_epoch),
-                "head_root" => format!("{}", status_message.head_root),
-                "head_slot" => format!("{}", status_message.head_slot),
-            );
-
-            let _ = self.send_rpc_request(peer_id, Request::Status(status_message));
+            for peer_id in peers {
+                debug!(
+                    self.log,
+                    "Sending Status Request";
+                    "peer" => %peer_id,
+                    "fork_digest" => ?status_message.fork_digest,
+                    "finalized_root" => ?status_message.finalized_root,
+                    "finalized_epoch" => ?status_message.finalized_epoch,
+                    "head_root" => %status_message.head_root,
+                    "head_slot" => %status_message.head_slot,
+                );
+
+                let _ = self.send_rpc_request(peer_id, Request::Status(status_message.clone()));
+            }
         }
     }
@@ -75,15 +85,34 @@ impl<T: EthSpec> SyncNetworkContext<T> {
         &mut self,
         peer_id: PeerId,
         request: BlocksByRangeRequest,
-    ) -> Result<usize, &'static str> {
+        chain_id: ChainId,
+        batch_id: BatchId,
+    ) -> Result<(), &'static str> {
         trace!(
             self.log,
             "Sending BlocksByRange Request";
             "method" => "BlocksByRange",
             "count" => request.count,
-            "peer" => format!("{:?}", peer_id)
+            "peer" => %peer_id,
         );
-        self.send_rpc_request(peer_id, Request::BlocksByRange(request))
+        let req_id = self.send_rpc_request(peer_id, Request::BlocksByRange(request))?;
+        self.range_requests.insert(req_id, (chain_id, batch_id));
+        Ok(())
+    }
+
+    pub fn blocks_by_range_response(
+        &mut self,
+        request_id: usize,
+        remove: bool,
+    ) -> Option<(ChainId, BatchId)> {
+        // NOTE: we can't guarantee that the request must be registered as it could receive more
+        // than an error, and be removed after receiving the first one.
+        // FIXME: https://github.com/sigp/lighthouse/issues/1634
+        if remove {
+            self.range_requests.remove(&request_id)
+        } else {
+            self.range_requests.get(&request_id).cloned()
+        }
     }

     pub fn blocks_by_root_request(
```
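A hedged sketch of the consuming side of this mapping (the actual caller sits in the range sync manager, not shown on this page; the wrapper below is illustrative and assumes the imports of `network_context.rs` above):

```rust
// Illustrative only: on each BlocksByRange response chunk, the sync side can
// resolve the (chain, batch) that issued the request in O(1). The entry is
// removed only when the stream terminates (`maybe_block == None`), since more
// chunks may still arrive for the same request id.
fn resolve_range_request<T: EthSpec>(
    network: &mut SyncNetworkContext<T>,
    request_id: usize,
    maybe_block: &Option<SignedBeaconBlock<T>>,
) -> Option<(ChainId, BatchId)> {
    network.blocks_by_range_response(request_id, maybe_block.is_none())
}
```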


```diff
@@ -1,35 +1,274 @@
-use super::chain::EPOCHS_PER_BATCH;
-use eth2_libp2p::rpc::methods::*;
+use eth2_libp2p::rpc::methods::BlocksByRangeRequest;
 use eth2_libp2p::PeerId;
-use fnv::FnvHashMap;
 use ssz::Encode;
-use std::cmp::min;
-use std::cmp::Ordering;
-use std::collections::hash_map::Entry;
-use std::collections::{HashMap, HashSet};
+use std::collections::HashSet;
 use std::hash::{Hash, Hasher};
 use std::ops::Sub;
 use types::{Epoch, EthSpec, SignedBeaconBlock, Slot};

-/// A collection of sequential blocks that are requested from peers in a single RPC request.
-#[derive(PartialEq, Debug)]
-pub struct Batch<T: EthSpec> {
-    /// The requested start epoch of the batch.
-    pub start_epoch: Epoch,
-    /// The requested end slot of batch, exclusive.
-    pub end_slot: Slot,
-    /// The `Attempts` that have been made to send us this batch.
-    pub attempts: Vec<Attempt>,
-    /// The peer that is currently assigned to the batch.
-    pub current_peer: PeerId,
-    /// The number of retries this batch has undergone due to a failed request.
-    /// This occurs when peers do not respond or we get an RPC error.
-    pub retries: u8,
-    /// The number of times this batch has attempted to be re-downloaded and re-processed. This
-    /// occurs when a batch has been received but cannot be processed.
-    pub reprocess_retries: u8,
-    /// The blocks that have been downloaded.
-    pub downloaded_blocks: Vec<SignedBeaconBlock<T>>,
-}
+/// The number of times to retry a batch before it is considered failed.
+const MAX_BATCH_DOWNLOAD_ATTEMPTS: u8 = 5;
+
+/// Invalid batches are attempted to be re-downloaded from other peers. If a batch cannot be processed
+/// after `MAX_BATCH_PROCESSING_ATTEMPTS` times, it is considered faulty.
+const MAX_BATCH_PROCESSING_ATTEMPTS: u8 = 3;
+
+/// A segment of a chain.
+pub struct BatchInfo<T: EthSpec> {
+    /// Start slot of the batch.
+    start_slot: Slot,
+    /// End slot of the batch.
+    end_slot: Slot,
+    /// The `Attempts` that have been made and failed to send us this batch.
+    failed_processing_attempts: Vec<Attempt>,
+    /// The number of download retries this batch has undergone due to a failed request.
+    failed_download_attempts: Vec<PeerId>,
+    /// State of the batch.
+    state: BatchState<T>,
+}
+
+/// Current state of a batch
+pub enum BatchState<T: EthSpec> {
+    /// The batch has failed either downloading or processing, but can be requested again.
+    AwaitingDownload,
+    /// The batch is being downloaded.
+    Downloading(PeerId, Vec<SignedBeaconBlock<T>>),
+    /// The batch has been completely downloaded and is ready for processing.
+    AwaitingProcessing(PeerId, Vec<SignedBeaconBlock<T>>),
+    /// The batch is being processed.
+    Processing(Attempt),
+    /// The batch was successfully processed and is waiting to be validated.
+    ///
+    /// It is not sufficient to process a batch successfully to consider it correct. This is
+    /// because batches could be erroneously empty, or incomplete. Therefore, a batch is considered
+    /// valid, only if the next sequential batch imports at least a block.
+    AwaitingValidation(Attempt),
+    /// Intermediate state for inner state handling.
+    Poisoned,
+    /// The batch has maxed out the allowed attempts for either downloading or processing. It
+    /// cannot be recovered.
+    Failed,
+}
+
+impl<T: EthSpec> BatchState<T> {
+    /// Helper function for poisoning a state.
+    pub fn poison(&mut self) -> BatchState<T> {
+        std::mem::replace(self, BatchState::Poisoned)
+    }
+}
+
+impl<T: EthSpec> BatchInfo<T> {
+    /// Batches are downloaded excluding the first block of the epoch assuming it has already been
+    /// downloaded.
+    ///
+    /// For example:
+    ///
+    /// Epoch boundary |                                   |
+    ///  ... | 30 | 31 | 32 | 33 | 34 | ... | 61 | 62 | 63 | 64 | 65 |
+    ///       Batch 1       |              Batch 2              |  Batch 3
+    pub fn new(start_epoch: &Epoch, num_of_epochs: u64) -> Self {
+        let start_slot = start_epoch.start_slot(T::slots_per_epoch()) + 1;
+        let end_slot = start_slot + num_of_epochs * T::slots_per_epoch();
+        BatchInfo {
+            start_slot,
+            end_slot,
+            failed_processing_attempts: Vec::new(),
+            failed_download_attempts: Vec::new(),
+            state: BatchState::AwaitingDownload,
+        }
+    }
+
+    /// Gives a list of peers from which this batch has had a failed download or processing
+    /// attempt.
+    pub fn failed_peers(&self) -> HashSet<PeerId> {
+        let mut peers = HashSet::with_capacity(
+            self.failed_processing_attempts.len() + self.failed_download_attempts.len(),
+        );
+
+        for attempt in &self.failed_processing_attempts {
+            peers.insert(attempt.peer_id.clone());
+        }
+
+        for download in &self.failed_download_attempts {
+            peers.insert(download.clone());
+        }
+
+        peers
+    }
+
+    pub fn current_peer(&self) -> Option<&PeerId> {
+        match &self.state {
+            BatchState::AwaitingDownload | BatchState::Failed => None,
+            BatchState::Downloading(peer_id, _)
+            | BatchState::AwaitingProcessing(peer_id, _)
+            | BatchState::Processing(Attempt { peer_id, .. })
+            | BatchState::AwaitingValidation(Attempt { peer_id, .. }) => Some(&peer_id),
+            BatchState::Poisoned => unreachable!("Poisoned batch"),
+        }
+    }
+
+    pub fn to_blocks_by_range_request(&self) -> BlocksByRangeRequest {
+        BlocksByRangeRequest {
+            start_slot: self.start_slot.into(),
+            count: self.end_slot.sub(self.start_slot).into(),
+            step: 1,
+        }
+    }
+
+    pub fn state(&self) -> &BatchState<T> {
+        &self.state
+    }
+
+    pub fn attempts(&self) -> &[Attempt] {
+        &self.failed_processing_attempts
+    }
+
+    /// Adds a block to a downloading batch.
+    pub fn add_block(&mut self, block: SignedBeaconBlock<T>) {
+        match self.state.poison() {
+            BatchState::Downloading(peer, mut blocks) => {
+                blocks.push(block);
+                self.state = BatchState::Downloading(peer, blocks)
+            }
+            other => unreachable!("Add block for batch in wrong state: {:?}", other),
+        }
+    }
+
+    /// Marks the batch as ready to be processed if the blocks are in the range. The number of
+    /// received blocks is returned, or the wrong batch end on failure
+    #[must_use = "Batch may have failed"]
+    pub fn download_completed(
+        &mut self,
+    ) -> Result<
+        usize, /* Received blocks */
+        (
+            Slot, /* expected slot */
+            Slot, /* received slot */
+            &BatchState<T>,
+        ),
+    > {
+        match self.state.poison() {
+            BatchState::Downloading(peer, blocks) => {
+                // verify that blocks are in range
+                if let Some(last_slot) = blocks.last().map(|b| b.slot()) {
+                    // the batch is non-empty
+                    let first_slot = blocks[0].slot();
+
+                    let failed_range = if first_slot < self.start_slot {
+                        Some((self.start_slot, first_slot))
+                    } else if self.end_slot < last_slot {
+                        Some((self.end_slot, last_slot))
+                    } else {
+                        None
+                    };
+
+                    if let Some(range) = failed_range {
+                        // this is a failed download, register the attempt and check if the batch
+                        // can be tried again
+                        self.failed_download_attempts.push(peer);
+                        self.state = if self.failed_download_attempts.len()
+                            >= MAX_BATCH_DOWNLOAD_ATTEMPTS as usize
+                        {
+                            BatchState::Failed
+                        } else {
+                            // drop the blocks
+                            BatchState::AwaitingDownload
+                        };
+                        return Err((range.0, range.1, &self.state));
+                    }
+                }
+
+                let received = blocks.len();
+                self.state = BatchState::AwaitingProcessing(peer, blocks);
+                Ok(received)
+            }
+            other => unreachable!("Download completed for batch in wrong state: {:?}", other),
+        }
+    }
+
+    #[must_use = "Batch may have failed"]
+    pub fn download_failed(&mut self) -> &BatchState<T> {
+        match self.state.poison() {
+            BatchState::Downloading(peer, _) => {
+                // register the attempt and check if the batch can be tried again
+                self.failed_download_attempts.push(peer);
+                self.state = if self.failed_download_attempts.len()
+                    >= MAX_BATCH_DOWNLOAD_ATTEMPTS as usize
+                {
+                    BatchState::Failed
+                } else {
+                    // drop the blocks
+                    BatchState::AwaitingDownload
+                };
+                &self.state
+            }
+            other => unreachable!("Download failed for batch in wrong state: {:?}", other),
+        }
+    }
+
+    pub fn start_downloading_from_peer(&mut self, peer: PeerId) {
+        match self.state.poison() {
+            BatchState::AwaitingDownload => {
+                self.state = BatchState::Downloading(peer, Vec::new());
+            }
+            other => unreachable!("Starting download for batch in wrong state: {:?}", other),
+        }
+    }
+
+    pub fn start_processing(&mut self) -> Vec<SignedBeaconBlock<T>> {
+        match self.state.poison() {
+            BatchState::AwaitingProcessing(peer, blocks) => {
+                self.state = BatchState::Processing(Attempt::new(peer, &blocks));
+                blocks
+            }
+            other => unreachable!("Start processing for batch in wrong state: {:?}", other),
+        }
+    }
+
+    #[must_use = "Batch may have failed"]
+    pub fn processing_completed(&mut self, was_sucessful: bool) -> &BatchState<T> {
+        match self.state.poison() {
+            BatchState::Processing(attempt) => {
+                self.state = if !was_sucessful {
+                    // register the failed attempt
+                    self.failed_processing_attempts.push(attempt);
+
+                    // check if the batch can be downloaded again
+                    if self.failed_processing_attempts.len()
+                        >= MAX_BATCH_PROCESSING_ATTEMPTS as usize
+                    {
+                        BatchState::Failed
+                    } else {
+                        BatchState::AwaitingDownload
+                    }
+                } else {
+                    BatchState::AwaitingValidation(attempt)
+                };
+                &self.state
+            }
+            other => unreachable!("Processing completed for batch in wrong state: {:?}", other),
+        }
+    }
+
+    #[must_use = "Batch may have failed"]
+    pub fn validation_failed(&mut self) -> &BatchState<T> {
+        match self.state.poison() {
+            BatchState::AwaitingValidation(attempt) => {
+                self.failed_processing_attempts.push(attempt);
+
+                // check if the batch can be downloaded again
+                self.state = if self.failed_processing_attempts.len()
+                    >= MAX_BATCH_PROCESSING_ATTEMPTS as usize
+                {
+                    BatchState::Failed
+                } else {
+                    BatchState::AwaitingDownload
+                };
+                &self.state
+            }
+            other => unreachable!("Validation failed for batch in wrong state: {:?}", other),
+        }
+    }
+}

 /// Represents a peer's attempt and providing the result for this batch.
@@ -43,131 +282,61 @@ pub struct Attempt {
     pub hash: u64,
 }

-impl<T: EthSpec> Eq for Batch<T> {}
-
-impl<T: EthSpec> Batch<T> {
-    pub fn new(start_epoch: Epoch, end_slot: Slot, peer_id: PeerId) -> Self {
-        Batch {
-            start_epoch,
-            end_slot,
-            attempts: Vec::new(),
-            current_peer: peer_id,
-            retries: 0,
-            reprocess_retries: 0,
-            downloaded_blocks: Vec::new(),
-        }
-    }
-
-    pub fn start_slot(&self) -> Slot {
-        // batches are shifted by 1
-        self.start_epoch.start_slot(T::slots_per_epoch()) + 1
-    }
-
-    pub fn end_slot(&self) -> Slot {
-        self.end_slot
-    }
-
-    pub fn to_blocks_by_range_request(&self) -> BlocksByRangeRequest {
-        let start_slot = self.start_slot();
-        BlocksByRangeRequest {
-            start_slot: start_slot.into(),
-            count: min(
-                T::slots_per_epoch() * EPOCHS_PER_BATCH,
-                self.end_slot.sub(start_slot).into(),
-            ),
-            step: 1,
-        }
-    }
-
-    /// This gets a hash that represents the blocks currently downloaded. This allows comparing a
-    /// previously downloaded batch of blocks with a new downloaded batch of blocks.
-    pub fn hash(&self) -> u64 {
-        // the hash used is the ssz-encoded list of blocks
+impl Attempt {
+    #[allow(clippy::ptr_arg)]
+    fn new<T: EthSpec>(peer_id: PeerId, blocks: &Vec<SignedBeaconBlock<T>>) -> Self {
         let mut hasher = std::collections::hash_map::DefaultHasher::new();
-        self.downloaded_blocks.as_ssz_bytes().hash(&mut hasher);
-        hasher.finish()
+        blocks.as_ssz_bytes().hash(&mut hasher);
+        let hash = hasher.finish();
+        Attempt { peer_id, hash }
     }
 }

-impl<T: EthSpec> Ord for Batch<T> {
-    fn cmp(&self, other: &Self) -> Ordering {
-        self.start_epoch.cmp(&other.start_epoch)
+impl<T: EthSpec> slog::KV for &mut BatchInfo<T> {
+    fn serialize(
+        &self,
+        record: &slog::Record,
+        serializer: &mut dyn slog::Serializer,
+    ) -> slog::Result {
+        slog::KV::serialize(*self, record, serializer)
     }
 }

-impl<T: EthSpec> PartialOrd for Batch<T> {
-    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
-        Some(self.cmp(other))
+impl<T: EthSpec> slog::KV for BatchInfo<T> {
+    fn serialize(
+        &self,
+        record: &slog::Record,
+        serializer: &mut dyn slog::Serializer,
+    ) -> slog::Result {
+        use slog::Value;
+        Value::serialize(&self.start_slot, record, "start_slot", serializer)?;
+        Value::serialize(
+            &(self.end_slot - 1), // NOTE: The -1 shows inclusive blocks
+            record,
+            "end_slot",
+            serializer,
+        )?;
+        serializer.emit_usize("downloaded", self.failed_download_attempts.len())?;
+        serializer.emit_usize("processed", self.failed_processing_attempts.len())?;
+        serializer.emit_str("state", &format!("{:?}", self.state))?;
+        slog::Result::Ok(())
     }
 }

-/// A structure that contains a mapping of pending batch requests, that also keeps track of which
-/// peers are currently making batch requests.
-///
-/// This is used to optimise searches for idle peers (peers that have no outbound batch requests).
-pub struct PendingBatches<T: EthSpec> {
-    /// The current pending batches.
-    batches: FnvHashMap<usize, Batch<T>>,
-    /// A mapping of peers to the number of pending requests.
-    peer_requests: HashMap<PeerId, HashSet<usize>>,
-}
-
-impl<T: EthSpec> PendingBatches<T> {
-    pub fn new() -> Self {
-        PendingBatches {
-            batches: FnvHashMap::default(),
-            peer_requests: HashMap::new(),
-        }
-    }
-
-    pub fn insert(&mut self, request_id: usize, batch: Batch<T>) -> Option<Batch<T>> {
-        let peer_request = batch.current_peer.clone();
-        self.peer_requests
-            .entry(peer_request)
-            .or_insert_with(HashSet::new)
-            .insert(request_id);
-        self.batches.insert(request_id, batch)
-    }
-
-    pub fn remove(&mut self, request_id: usize) -> Option<Batch<T>> {
-        if let Some(batch) = self.batches.remove(&request_id) {
-            if let Entry::Occupied(mut entry) = self.peer_requests.entry(batch.current_peer.clone())
-            {
-                entry.get_mut().remove(&request_id);
-                if entry.get().is_empty() {
-                    entry.remove();
-                }
-            }
-            Some(batch)
-        } else {
-            None
-        }
-    }
-
-    /// The number of current pending batch requests.
-    pub fn len(&self) -> usize {
-        self.batches.len()
-    }
-
-    /// Adds a block to the batches if the request id exists. Returns None if there is no batch
-    /// matching the request id.
-    pub fn add_block(&mut self, request_id: usize, block: SignedBeaconBlock<T>) -> Option<()> {
-        let batch = self.batches.get_mut(&request_id)?;
-        batch.downloaded_blocks.push(block);
-        Some(())
-    }
-
-    /// Returns true if there the peer does not exist in the peer_requests mapping. Indicating it
-    /// has no pending outgoing requests.
-    pub fn peer_is_idle(&self, peer_id: &PeerId) -> bool {
-        self.peer_requests.get(peer_id).is_none()
-    }
-
-    /// Removes a batch for a given peer.
-    pub fn remove_batch_by_peer(&mut self, peer_id: &PeerId) -> Option<Batch<T>> {
-        let request_ids = self.peer_requests.get(peer_id)?;
-        let request_id = *request_ids.iter().next()?;
-        self.remove(request_id)
-    }
+impl<T: EthSpec> std::fmt::Debug for BatchState<T> {
+    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
+        match self {
+            BatchState::Processing(_) => f.write_str("Processing"),
+            BatchState::AwaitingValidation(_) => f.write_str("AwaitingValidation"),
+            BatchState::AwaitingDownload => f.write_str("AwaitingDownload"),
+            BatchState::Failed => f.write_str("Failed"),
+            BatchState::AwaitingProcessing(ref peer, ref blocks) => {
+                write!(f, "AwaitingProcessing({}, {} blocks)", peer, blocks.len())
+            }
+            BatchState::Downloading(peer, blocks) => {
+                write!(f, "Downloading({}, {} blocks)", peer, blocks.len())
+            }
+            BatchState::Poisoned => f.write_str("Poisoned"),
+        }
+    }
 }
```
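
A hedged usage sketch of the `BatchInfo` API defined above. The driver code here is hypothetical (the real caller is the `SyncingChain` in the suppressed `chain.rs` diff), and it assumes the imports and types of the `batch.rs` module above are in scope:

```rust
// Illustrative driver for one download cycle of a batch. `E: EthSpec` and the
// peer/block values come from the surrounding sync logic.
fn download_cycle<E: EthSpec>(
    peer: PeerId,
    blocks: Vec<SignedBeaconBlock<E>>,
    start_epoch: Epoch,
) -> Option<Vec<SignedBeaconBlock<E>>> {
    let mut batch = BatchInfo::<E>::new(&start_epoch, 2); // e.g. 2 epochs per batch
    batch.start_downloading_from_peer(peer);
    for block in blocks {
        batch.add_block(block); // only legal while in the Downloading state
    }
    // `download_completed` is #[must_use]: ignoring it would leave a failed
    // range check (and the batch's retry accounting) unhandled.
    let in_range = match batch.download_completed() {
        Ok(_received_blocks) => true,
        // the batch is now AwaitingDownload or Failed
        Err((_expected_slot, _received_slot, _state)) => false,
    };
    // Only an in-range batch can move on to processing.
    if in_range {
        Some(batch.start_processing())
    } else {
        None
    }
}
```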

(File diff suppressed because it is too large.)


@ -1,15 +1,18 @@
//! This provides the logic for the finalized and head chains. //! This provides the logic for the finalized and head chains.
//! //!
//! Each chain type is stored in it's own vector. A variety of helper functions are given along //! Each chain type is stored in it's own map. A variety of helper functions are given along with
//! with this struct to to simplify the logic of the other layers of sync. //! this struct to simplify the logic of the other layers of sync.
use super::chain::{ChainSyncingState, SyncingChain}; use super::chain::{ChainId, ChainSyncingState, ProcessingResult, SyncingChain};
use super::sync_type::RangeSyncType;
use crate::beacon_processor::WorkEvent as BeaconWorkEvent; use crate::beacon_processor::WorkEvent as BeaconWorkEvent;
use crate::sync::network_context::SyncNetworkContext; use crate::sync::network_context::SyncNetworkContext;
use crate::sync::PeerSyncInfo; use crate::sync::PeerSyncInfo;
use beacon_chain::{BeaconChain, BeaconChainTypes}; use beacon_chain::{BeaconChain, BeaconChainTypes};
use eth2_libp2p::{types::SyncState, NetworkGlobals, PeerId}; use eth2_libp2p::{types::SyncState, NetworkGlobals, PeerId};
use slog::{debug, error, info, o}; use fnv::FnvHashMap;
use slog::{crit, debug, error, info, trace};
use std::collections::hash_map::Entry;
use std::sync::Arc; use std::sync::Arc;
use tokio::sync::mpsc; use tokio::sync::mpsc;
use types::EthSpec; use types::EthSpec;
@ -83,9 +86,9 @@ pub struct ChainCollection<T: BeaconChainTypes> {
/// A reference to the global network parameters. /// A reference to the global network parameters.
network_globals: Arc<NetworkGlobals<T::EthSpec>>, network_globals: Arc<NetworkGlobals<T::EthSpec>>,
/// The set of finalized chains being synced. /// The set of finalized chains being synced.
finalized_chains: Vec<SyncingChain<T>>, finalized_chains: FnvHashMap<ChainId, SyncingChain<T>>,
/// The set of head chains being synced. /// The set of head chains being synced.
head_chains: Vec<SyncingChain<T>>, head_chains: FnvHashMap<ChainId, SyncingChain<T>>,
/// The current sync state of the process. /// The current sync state of the process.
state: RangeSyncState, state: RangeSyncState,
/// Logger for the collection. /// Logger for the collection.
@ -101,8 +104,8 @@ impl<T: BeaconChainTypes> ChainCollection<T> {
ChainCollection { ChainCollection {
beacon_chain, beacon_chain,
network_globals, network_globals,
finalized_chains: Vec::new(), finalized_chains: FnvHashMap::default(),
head_chains: Vec::new(), head_chains: FnvHashMap::default(),
state: RangeSyncState::Idle, state: RangeSyncState::Idle,
log, log,
} }
@ -129,7 +132,7 @@ impl<T: BeaconChainTypes> ChainCollection<T> {
.unwrap_or_else(|| SyncState::Stalled); .unwrap_or_else(|| SyncState::Stalled);
let mut peer_state = self.network_globals.sync_state.write(); let mut peer_state = self.network_globals.sync_state.write();
if new_state != *peer_state { if new_state != *peer_state {
info!(self.log, "Sync state updated"; "old_state" => format!("{}",peer_state), "new_state" => format!("{}",new_state)); info!(self.log, "Sync state updated"; "old_state" => %peer_state, "new_state" => %new_state);
if new_state == SyncState::Synced { if new_state == SyncState::Synced {
network.subscribe_core_topics(); network.subscribe_core_topics();
} }
@ -141,7 +144,7 @@ impl<T: BeaconChainTypes> ChainCollection<T> {
let new_state: SyncState = self.state.clone().into(); let new_state: SyncState = self.state.clone().into();
if *node_sync_state != new_state { if *node_sync_state != new_state {
// we are updating the state, inform the user // we are updating the state, inform the user
info!(self.log, "Sync state updated"; "old_state" => format!("{}",node_sync_state), "new_state" => format!("{}",new_state)); info!(self.log, "Sync state updated"; "old_state" => %node_sync_state, "new_state" => %new_state);
} }
*node_sync_state = new_state; *node_sync_state = new_state;
} }
@ -182,30 +185,67 @@ impl<T: BeaconChainTypes> ChainCollection<T> {
} }
} }
/// Finds any finalized chain if it exists. /// Calls `func` on every chain of the collection. If the result is
pub fn get_finalized_mut( /// `ProcessingResult::RemoveChain`, the chain is removed and returned.
&mut self, pub fn call_all<F>(&mut self, mut func: F) -> Vec<(SyncingChain<T>, RangeSyncType)>
target_head_root: Hash256, where
target_head_slot: Slot, F: FnMut(&mut SyncingChain<T>) -> ProcessingResult,
) -> Option<&mut SyncingChain<T>> { {
ChainCollection::get_chain( let mut to_remove = Vec::new();
self.finalized_chains.as_mut(),
target_head_root, for (id, chain) in self.finalized_chains.iter_mut() {
target_head_slot, if let ProcessingResult::RemoveChain = func(chain) {
) to_remove.push((*id, RangeSyncType::Finalized));
}
}
for (id, chain) in self.head_chains.iter_mut() {
if let ProcessingResult::RemoveChain = func(chain) {
to_remove.push((*id, RangeSyncType::Head));
}
}
let mut results = Vec::with_capacity(to_remove.len());
for (id, sync_type) in to_remove.into_iter() {
let chain = match sync_type {
RangeSyncType::Finalized => self.finalized_chains.remove(&id),
RangeSyncType::Head => self.head_chains.remove(&id),
};
results.push((chain.expect("Chain exits"), sync_type));
}
results
} }
/// Finds any finalized chain if it exists. /// Executes a function on the chain with the given id.
pub fn get_head_mut( ///
/// If the function returns `ProcessingResult::RemoveChain`, the chain is removed and returned.
/// If the chain is found, its syncing type is returned, or an error otherwise.
pub fn call_by_id<F>(
&mut self, &mut self,
target_head_root: Hash256, id: ChainId,
target_head_slot: Slot, func: F,
) -> Option<&mut SyncingChain<T>> { ) -> Result<(Option<SyncingChain<T>>, RangeSyncType), ()>
ChainCollection::get_chain( where
self.head_chains.as_mut(), F: FnOnce(&mut SyncingChain<T>) -> ProcessingResult,
target_head_root, {
target_head_slot, if let Entry::Occupied(mut entry) = self.finalized_chains.entry(id) {
) // Search in our finalized chains first
if let ProcessingResult::RemoveChain = func(entry.get_mut()) {
Ok((Some(entry.remove()), RangeSyncType::Finalized))
} else {
Ok((None, RangeSyncType::Finalized))
}
} else if let Entry::Occupied(mut entry) = self.head_chains.entry(id) {
// Search in our head chains next
if let ProcessingResult::RemoveChain = func(entry.get_mut()) {
Ok((Some(entry.remove()), RangeSyncType::Head))
} else {
Ok((None, RangeSyncType::Head))
}
} else {
// Chain was not found in the finalized collection, nor the head collection
Err(())
}
} }
/// Updates the state of the chain collection. /// Updates the state of the chain collection.
@ -214,9 +254,8 @@ impl<T: BeaconChainTypes> ChainCollection<T> {
/// updates the state of the collection. This starts head chains syncing if any are required to /// updates the state of the collection. This starts head chains syncing if any are required to
/// do so. /// do so.
pub fn update(&mut self, network: &mut SyncNetworkContext<T::EthSpec>) { pub fn update(&mut self, network: &mut SyncNetworkContext<T::EthSpec>) {
let local_epoch = { let (local_finalized_epoch, local_head_epoch) =
let local = match PeerSyncInfo::from_chain(&self.beacon_chain) { match PeerSyncInfo::from_chain(&self.beacon_chain) {
Some(local) => local,
None => { None => {
return error!( return error!(
self.log, self.log,
@ -224,20 +263,21 @@ impl<T: BeaconChainTypes> ChainCollection<T> {
"msg" => "likely due to head lock contention" "msg" => "likely due to head lock contention"
) )
} }
Some(local) => (
local.finalized_epoch,
local.head_slot.epoch(T::EthSpec::slots_per_epoch()),
),
}; };
local.finalized_epoch
};
// Remove any outdated finalized/head chains // Remove any outdated finalized/head chains
self.purge_outdated_chains(network); self.purge_outdated_chains(network);
// Choose the best finalized chain if one needs to be selected. // Choose the best finalized chain if one needs to be selected.
self.update_finalized_chains(network, local_epoch); self.update_finalized_chains(network, local_finalized_epoch, local_head_epoch);
if self.finalized_syncing_index().is_none() { if self.finalized_syncing_chain().is_none() {
// Handle head syncing chains if there are no finalized chains left. // Handle head syncing chains if there are no finalized chains left.
self.update_head_chains(network, local_epoch); self.update_head_chains(network, local_finalized_epoch, local_head_epoch);
} }
} }
@ -247,53 +287,57 @@ impl<T: BeaconChainTypes> ChainCollection<T> {
&mut self, &mut self,
network: &mut SyncNetworkContext<T::EthSpec>, network: &mut SyncNetworkContext<T::EthSpec>,
local_epoch: Epoch, local_epoch: Epoch,
local_head_epoch: Epoch,
) { ) {
// Check if any chains become the new syncing chain // Find the chain with most peers and check if it is already syncing
if let Some(index) = self.finalized_syncing_index() { if let Some((new_id, peers)) = self
// There is a current finalized chain syncing
let _syncing_chain_peer_count = self.finalized_chains[index].peer_pool.len();
// search for a chain with more peers
if let Some((new_index, chain)) =
self.finalized_chains
.iter_mut()
.enumerate()
.find(|(_iter_index, _chain)| {
false
// && *iter_index != index
// && chain.peer_pool.len() > syncing_chain_peer_count
})
{
// A chain has more peers. Swap the syncing chain
debug!(self.log, "Switching finalized chains to sync"; "new_target_root" => format!("{}", chain.target_head_root), "new_end_slot" => chain.target_head_slot, "new_start_epoch"=> local_epoch);
// update the state to a new finalized state
let state = RangeSyncState::Finalized {
start_slot: chain.start_epoch.start_slot(T::EthSpec::slots_per_epoch()),
head_slot: chain.target_head_slot,
head_root: chain.target_head_root,
};
self.state = state;
// Stop the current chain from syncing
self.finalized_chains[index].stop_syncing();
// Start the new chain
self.finalized_chains[new_index].start_syncing(network, local_epoch);
}
} else if let Some(chain) = self
.finalized_chains .finalized_chains
.iter_mut() .iter()
.max_by_key(|chain| chain.peer_pool.len()) .max_by_key(|(_, chain)| chain.available_peers())
.map(|(id, chain)| (*id, chain.available_peers()))
{ {
// There is no currently syncing finalization chain, starting the one with the most peers let old_id = self.finalized_syncing_chain().map(
debug!(self.log, "New finalized chain started syncing"; "new_target_root" => format!("{}", chain.target_head_root), "new_end_slot" => chain.target_head_slot, "new_start_epoch"=> chain.start_epoch); |(currently_syncing_id, currently_syncing_chain)| {
chain.start_syncing(network, local_epoch); if *currently_syncing_id != new_id
&& peers > currently_syncing_chain.available_peers()
{
currently_syncing_chain.stop_syncing();
// we stop this chain and start syncing the one with more peers
Some(*currently_syncing_id)
} else {
// the best chain is already the syncing chain, advance it if possible
None
}
},
);
let chain = self
.finalized_chains
.get_mut(&new_id)
.expect("Chain exists");
match old_id {
Some(Some(old_id)) => debug!(self.log, "Switching finalized chains";
"old_id" => old_id, &chain),
None => debug!(self.log, "Syncing new chain"; &chain),
Some(None) => trace!(self.log, "Advancing currently syncing chain"),
// this is the same chain. We try to advance it.
}
// update the state to a new finalized state
let state = RangeSyncState::Finalized { let state = RangeSyncState::Finalized {
start_slot: chain.start_epoch.start_slot(T::EthSpec::slots_per_epoch()), start_slot: chain.start_epoch.start_slot(T::EthSpec::slots_per_epoch()),
head_slot: chain.target_head_slot, head_slot: chain.target_head_slot,
head_root: chain.target_head_root, head_root: chain.target_head_root,
}; };
self.state = state; self.state = state;
if let ProcessingResult::RemoveChain =
chain.start_syncing(network, local_epoch, local_head_epoch)
{
// this happens only if sending a batch over the `network` fails a lot
error!(self.log, "Chain removed while switching chains");
self.finalized_chains.remove(&new_id);
}
} }
} }
@ -302,6 +346,7 @@ impl<T: BeaconChainTypes> ChainCollection<T> {
&mut self, &mut self,
network: &mut SyncNetworkContext<T::EthSpec>, network: &mut SyncNetworkContext<T::EthSpec>,
local_epoch: Epoch, local_epoch: Epoch,
local_head_epoch: Epoch,
) { ) {
// There are no finalized chains, update the state. // There are no finalized chains, update the state.
if self.head_chains.is_empty() { if self.head_chains.is_empty() {
@ -311,42 +356,41 @@ impl<T: BeaconChainTypes> ChainCollection<T> {
let mut currently_syncing = self let mut currently_syncing = self
.head_chains .head_chains
.iter() .values()
.filter(|chain| chain.is_syncing()) .filter(|chain| chain.is_syncing())
.count(); .count();
let mut not_syncing = self.head_chains.len() - currently_syncing; let mut not_syncing = self.head_chains.len() - currently_syncing;
// Find all head chains that are not currently syncing ordered by peer count. // Find all head chains that are not currently syncing ordered by peer count.
while currently_syncing <= PARALLEL_HEAD_CHAINS && not_syncing > 0 { while currently_syncing <= PARALLEL_HEAD_CHAINS && not_syncing > 0 {
// Find the chain with the most peers and start syncing // Find the chain with the most peers and start syncing
if let Some((_index, chain)) = self if let Some((_id, chain)) = self
.head_chains .head_chains
.iter_mut() .iter_mut()
.filter(|chain| !chain.is_syncing()) .filter(|(_id, chain)| !chain.is_syncing())
.enumerate() .max_by_key(|(_id, chain)| chain.available_peers())
.max_by_key(|(_index, chain)| chain.peer_pool.len())
{ {
// start syncing this chain // start syncing this chain
debug!(self.log, "New head chain started syncing"; "new_target_root" => format!("{}", chain.target_head_root), "new_end_slot" => chain.target_head_slot, "new_start_epoch"=> chain.start_epoch); debug!(self.log, "New head chain started syncing"; &chain);
chain.start_syncing(network, local_epoch); if let ProcessingResult::RemoveChain =
chain.start_syncing(network, local_epoch, local_head_epoch)
{
error!(self.log, "Chain removed while switching head chains")
}
} }
// update variables // update variables
currently_syncing = self currently_syncing = self
.head_chains .head_chains
.iter() .iter()
.filter(|chain| chain.is_syncing()) .filter(|(_id, chain)| chain.is_syncing())
.count(); .count();
not_syncing = self.head_chains.len() - currently_syncing; not_syncing = self.head_chains.len() - currently_syncing;
} }
// Start // Start
// for the syncing API, we find the minimal start_slot and the maximum // for the syncing API, we find the minimal start_slot and the maximum
// target_slot of all head chains to report back. // target_slot of all head chains to report back.
let (min_epoch, max_slot) = self let (min_epoch, max_slot) = self
.head_chains .head_chains
.iter() .values()
.filter(|chain| chain.is_syncing()) .filter(|chain| chain.is_syncing())
.fold( .fold(
(Epoch::from(0u64), Slot::from(0u64)), (Epoch::from(0u64), Slot::from(0u64)),
@ -368,10 +412,9 @@ impl<T: BeaconChainTypes> ChainCollection<T> {
/// chains and re-status their peers. /// chains and re-status their peers.
pub fn clear_head_chains(&mut self, network: &mut SyncNetworkContext<T::EthSpec>) { pub fn clear_head_chains(&mut self, network: &mut SyncNetworkContext<T::EthSpec>) {
let log_ref = &self.log; let log_ref = &self.log;
self.head_chains.retain(|chain| { self.head_chains.retain(|_id, chain| {
if !chain.is_syncing() if !chain.is_syncing() {
{ debug!(log_ref, "Removing old head chain"; &chain);
debug!(log_ref, "Removing old head chain"; "start_epoch" => chain.start_epoch, "end_slot" => chain.target_head_slot);
chain.status_peers(network); chain.status_peers(network);
false false
} else { } else {
@ -380,140 +423,20 @@ impl<T: BeaconChainTypes> ChainCollection<T> {
}); });
} }
/// Add a new finalized chain to the collection.
pub fn new_finalized_chain(
&mut self,
local_finalized_epoch: Epoch,
target_head: Hash256,
target_slot: Slot,
peer_id: PeerId,
beacon_processor_send: mpsc::Sender<BeaconWorkEvent<T::EthSpec>>,
) {
let chain_id = rand::random();
self.finalized_chains.push(SyncingChain::new(
chain_id,
local_finalized_epoch,
target_slot,
target_head,
peer_id,
beacon_processor_send,
self.beacon_chain.clone(),
self.log.new(o!("chain" => chain_id)),
));
}
/// Add a new finalized chain to the collection and starts syncing it.
#[allow(clippy::too_many_arguments)]
pub fn new_head_chain(
&mut self,
remote_finalized_epoch: Epoch,
target_head: Hash256,
target_slot: Slot,
peer_id: PeerId,
beacon_processor_send: mpsc::Sender<BeaconWorkEvent<T::EthSpec>>,
) {
// remove the peer from any other head chains
self.head_chains.iter_mut().for_each(|chain| {
chain.peer_pool.remove(&peer_id);
});
self.head_chains.retain(|chain| !chain.peer_pool.is_empty());
let chain_id = rand::random();
let new_head_chain = SyncingChain::new(
chain_id,
remote_finalized_epoch,
target_slot,
target_head,
peer_id,
beacon_processor_send,
self.beacon_chain.clone(),
self.log.clone(),
);
self.head_chains.push(new_head_chain);
}
/// Returns if `true` if any finalized chains exist, `false` otherwise. /// Returns if `true` if any finalized chains exist, `false` otherwise.
pub fn is_finalizing_sync(&self) -> bool { pub fn is_finalizing_sync(&self) -> bool {
!self.finalized_chains.is_empty() !self.finalized_chains.is_empty()
} }
/// Given a chain iterator, runs a given function on each chain until the function returns
/// `Some`. This allows the `RangeSync` struct to loop over chains and optionally remove the
/// chain from the collection if the function results in completing the chain.
fn request_function<'a, F, I, U>(chain: I, mut func: F) -> Option<(usize, U)>
where
I: Iterator<Item = &'a mut SyncingChain<T>>,
F: FnMut(&'a mut SyncingChain<T>) -> Option<U>,
{
chain
.enumerate()
.find_map(|(index, chain)| Some((index, func(chain)?)))
}
/// Given a chain iterator, runs a given function on each chain and return all `Some` results.
fn request_function_all<'a, F, I, U>(chain: I, mut func: F) -> Vec<(usize, U)>
where
I: Iterator<Item = &'a mut SyncingChain<T>>,
F: FnMut(&'a mut SyncingChain<T>) -> Option<U>,
{
chain
.enumerate()
.filter_map(|(index, chain)| Some((index, func(chain)?)))
.collect()
}
/// Runs a function on finalized chains until we get the first `Some` result from `F`.
pub fn finalized_request<F, U>(&mut self, func: F) -> Option<(usize, U)>
where
F: FnMut(&mut SyncingChain<T>) -> Option<U>,
{
ChainCollection::request_function(self.finalized_chains.iter_mut(), func)
}
/// Runs a function on head chains until we get the first `Some` result from `F`.
pub fn head_request<F, U>(&mut self, func: F) -> Option<(usize, U)>
where
F: FnMut(&mut SyncingChain<T>) -> Option<U>,
{
ChainCollection::request_function(self.head_chains.iter_mut(), func)
}
/// Runs a function on finalized and head chains until we get the first `Some` result from `F`.
pub fn head_finalized_request<F, U>(&mut self, func: F) -> Option<(usize, U)>
where
F: FnMut(&mut SyncingChain<T>) -> Option<U>,
{
ChainCollection::request_function(
self.finalized_chains
.iter_mut()
.chain(self.head_chains.iter_mut()),
func,
)
}
/// Runs a function on all finalized and head chains and collects all `Some` results from `F`.
pub fn head_finalized_request_all<F, U>(&mut self, func: F) -> Vec<(usize, U)>
where
F: FnMut(&mut SyncingChain<T>) -> Option<U>,
{
ChainCollection::request_function_all(
self.finalized_chains
.iter_mut()
.chain(self.head_chains.iter_mut()),
func,
)
}
/// Removes any outdated finalized or head chains. /// Removes any outdated finalized or head chains.
///
/// This removes chains with no peers, or chains whose start block slot is less than our current /// This removes chains with no peers, or chains whose start block slot is less than our current
/// finalized block slot. /// finalized block slot.
pub fn purge_outdated_chains(&mut self, network: &mut SyncNetworkContext<T::EthSpec>) { pub fn purge_outdated_chains(&mut self, network: &mut SyncNetworkContext<T::EthSpec>) {
// Remove any chains that have no peers // Remove any chains that have no peers
self.finalized_chains self.finalized_chains
.retain(|chain| !chain.peer_pool.is_empty()); .retain(|_id, chain| chain.available_peers() > 0);
self.head_chains.retain(|chain| !chain.peer_pool.is_empty()); self.head_chains
.retain(|_id, chain| chain.available_peers() > 0);
let local_info = match PeerSyncInfo::from_chain(&self.beacon_chain) { let local_info = match PeerSyncInfo::from_chain(&self.beacon_chain) {
Some(local) => local, Some(local) => local,
@@ -533,28 +456,28 @@ impl<T: BeaconChainTypes> ChainCollection<T> {
    let beacon_chain = &self.beacon_chain;
    let log_ref = &self.log;

    // Remove chains that are out-dated and re-status their peers
    self.finalized_chains.retain(|_id, chain| {
        if chain.target_head_slot <= local_finalized_slot
            || beacon_chain
                .fork_choice
                .read()
                .contains_block(&chain.target_head_root)
        {
            debug!(log_ref, "Purging out of finalized chain"; &chain);
            chain.status_peers(network);
            false
        } else {
            true
        }
    });
    self.head_chains.retain(|_id, chain| {
        if chain.target_head_slot <= local_finalized_slot
            || beacon_chain
                .fork_choice
                .read()
                .contains_block(&chain.target_head_root)
        {
            debug!(log_ref, "Purging out of date head chain"; &chain);
            chain.status_peers(network);
            false
        } else {
            true
        }
    });
}
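The two-argument `retain` closures above are the map form: with chains now keyed by id, `retain` hands the closure `(&key, &mut value)`. A minimal sketch, with a toy chain type and `HashMap` standing in for whatever map backs the collection:

```rust
use std::collections::HashMap;

struct Chain {
    peers: Vec<u64>,
}

impl Chain {
    fn available_peers(&self) -> usize {
        self.peers.len()
    }
}

fn main() {
    let mut chains: HashMap<u64, Chain> = HashMap::new();
    chains.insert(1, Chain { peers: vec![7, 8] });
    chains.insert(2, Chain { peers: vec![] });
    // Drop chains with no peers, mirroring `purge_outdated_chains`.
    chains.retain(|_id, chain| chain.available_peers() > 0);
    assert_eq!(chains.len(), 1);
    assert!(chains.contains_key(&1));
}
```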
/// Adds a peer to a chain with the given target, or creates a new syncing chain if it doesn't
/// exist.
#[allow(clippy::too_many_arguments)]
pub fn add_peer_or_create_chain(
    &mut self,
    start_epoch: Epoch,
    target_head_root: Hash256,
    target_head_slot: Slot,
    peer: PeerId,
    sync_type: RangeSyncType,
    beacon_processor_send: &mpsc::Sender<BeaconWorkEvent<T::EthSpec>>,
    network: &mut SyncNetworkContext<T::EthSpec>,
) {
    let id = SyncingChain::<T>::id(&target_head_root, &target_head_slot);
    let collection = if let RangeSyncType::Finalized = sync_type {
        if let Some(chain) = self.head_chains.get(&id) {
            // sanity verification for chain duplication / purging issues
            crit!(self.log, "Adding known head chain as finalized chain"; chain);
        }
        &mut self.finalized_chains
    } else {
        if let Some(chain) = self.finalized_chains.get(&id) {
            // sanity verification for chain duplication / purging issues
            crit!(self.log, "Adding known finalized chain as head chain"; chain);
        }
        &mut self.head_chains
    };
    match collection.entry(id) {
        Entry::Occupied(mut entry) => {
            let chain = entry.get_mut();
            debug!(self.log, "Adding peer to known chain"; "peer_id" => %peer, "sync_type" => ?sync_type, &chain);
            assert_eq!(chain.target_head_root, target_head_root);
            assert_eq!(chain.target_head_slot, target_head_slot);
            if let ProcessingResult::RemoveChain = chain.add_peer(network, peer) {
                debug!(self.log, "Chain removed after adding peer"; "chain" => id);
                entry.remove();
            }
        }
        Entry::Vacant(entry) => {
            let peer_rpr = peer.to_string();
            let new_chain = SyncingChain::new(
                start_epoch,
                target_head_slot,
                target_head_root,
                peer,
                beacon_processor_send.clone(),
                self.beacon_chain.clone(),
                &self.log,
            );
            assert_eq!(new_chain.get_id(), id);
            debug!(self.log, "New chain added to sync"; "peer_id" => peer_rpr, "sync_type" => ?sync_type, &new_chain);
            entry.insert(new_chain);
        }
    }
}
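The occupied/vacant split above is the standard map entry API: a single lookup decides between "add the peer to a known chain" and "insert a brand-new chain". A toy version of that flow (the types and names here are illustrative):

```rust
use std::collections::hash_map::Entry;
use std::collections::HashMap;

struct Chain {
    peers: Vec<String>,
}

fn add_peer_or_create(chains: &mut HashMap<u64, Chain>, id: u64, peer: String) {
    match chains.entry(id) {
        Entry::Occupied(mut entry) => {
            // Known chain: register the peer. If adding it signalled removal
            // (the `ProcessingResult::RemoveChain` case above), `entry.remove()`
            // would drop the chain without a second lookup.
            entry.get_mut().peers.push(peer);
        }
        Entry::Vacant(entry) => {
            // Unknown chain: build it from the identifying fields and insert.
            entry.insert(Chain { peers: vec![peer] });
        }
    }
}

fn main() {
    let mut chains = HashMap::new();
    add_peer_or_create(&mut chains, 42, "peer_a".into());
    add_peer_or_create(&mut chains, 42, "peer_b".into());
    assert_eq!(chains[&42].peers.len(), 2);
}
```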
/// Returns the finalized chain that is currently syncing. Returns `None` if no
/// finalized chain is currently syncing.
fn finalized_syncing_chain(&mut self) -> Option<(&ChainId, &mut SyncingChain<T>)> {
    self.finalized_chains.iter_mut().find_map(|(id, chain)| {
        if chain.state == ChainSyncingState::Syncing {
            Some((id, chain))
        } else {
            None
        }
    })
}
}
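`SyncingChain::id`, used above to key both collections, derives the id from the chain's identifying fields, so a given `(target_root, target_slot)` target can never yield two distinct chains and the asserts in `add_peer_or_create_chain` hold by construction. A plausible sketch; the hasher choice is an assumption:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

type ChainId = u64;

/// Derive a stable id from the fields that identify a chain.
fn chain_id(target_root: &[u8; 32], target_slot: u64) -> ChainId {
    let mut hasher = DefaultHasher::new();
    target_root.hash(&mut hasher);
    target_slot.hash(&mut hasher);
    hasher.finish()
}

fn main() {
    let root = [0xab; 32];
    // Deterministic: the same target always maps to the same chain id.
    assert_eq!(chain_id(&root, 64), chain_id(&root, 64));
    assert_ne!(chain_id(&root, 64), chain_id(&root, 96));
}
```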

@@ -7,6 +7,6 @@ mod chain_collection;
mod range;
mod sync_type;
pub use batch::BatchInfo;
pub use chain::{BatchId, ChainId, EPOCHS_PER_BATCH};
pub use range::RangeSync;

@@ -39,7 +39,7 @@
//! Each chain is downloaded in batches of blocks. The batched blocks are processed sequentially
//! and further batches are requested as current blocks are being processed.
use super::chain::ChainId;
use super::chain_collection::{ChainCollection, RangeSyncState};
use super::sync_type::RangeSyncType;
use crate::beacon_processor::WorkEvent as BeaconWorkEvent;
@@ -49,7 +49,7 @@ use crate::sync::PeerSyncInfo;
use crate::sync::RequestId;
use beacon_chain::{BeaconChain, BeaconChainTypes};
use eth2_libp2p::{NetworkGlobals, PeerId};
use slog::{debug, error, trace, warn};
use std::collections::HashSet;
use std::sync::Arc;
use tokio::sync::mpsc;
@@ -121,21 +121,15 @@ impl<T: BeaconChainTypes> RangeSync<T> {
    let local_info = match PeerSyncInfo::from_chain(&self.beacon_chain) {
        Some(local) => local,
        None => {
            return error!(self.log, "Failed to get peer sync info";
                "msg" => "likely due to head lock contention")
        }
    };

    // convenience variable
    let remote_finalized_slot = remote_info
        .finalized_epoch
        .start_slot(T::EthSpec::slots_per_epoch());

    // NOTE: A peer that has been re-status'd may now exist in multiple finalized chains.
@@ -146,7 +140,7 @@ impl<T: BeaconChainTypes> RangeSync<T> {
    match RangeSyncType::new(&self.beacon_chain, &local_info, &remote_info) {
        RangeSyncType::Finalized => {
            // Finalized chain search
            debug!(self.log, "Finalization sync peer joined"; "peer_id" => %peer_id);

            // remove the peer from the awaiting_head_peers list if it exists
            self.awaiting_head_peers.remove(&peer_id);
@@ -154,37 +148,19 @@ impl<T: BeaconChainTypes> RangeSync<T> {
            // Note: We keep current head chains. These can continue syncing whilst we complete
            // this new finalized chain.
            self.chains.add_peer_or_create_chain(
                local_info.finalized_epoch,
                remote_info.finalized_root,
                remote_finalized_slot,
                peer_id,
                RangeSyncType::Finalized,
                &self.beacon_processor_send,
                network,
            );
            self.chains.update(network);
            // update the global sync state
            self.chains.update_sync_state(network);
        }
        RangeSyncType::Head => {
            // This peer requires a head chain sync
@@ -192,7 +168,7 @@ impl<T: BeaconChainTypes> RangeSync<T> {
            if self.chains.is_finalizing_sync() {
                // If there are finalized chains to sync, finish these first, before syncing head
                // chains. This allows us to re-sync all known peers
                trace!(self.log, "Waiting for finalized sync to complete"; "peer_id" => %peer_id);
                // store the peer to re-status after all finalized chains complete
                self.awaiting_head_peers.insert(peer_id);
                return;
@@ -203,31 +179,18 @@ impl<T: BeaconChainTypes> RangeSync<T> {
            // The new peer has the same finalized (earlier filters should prevent a peer with an
            // earlier finalized chain from reaching here).
            let start_epoch = std::cmp::min(local_info.head_slot, remote_finalized_slot)
                .epoch(T::EthSpec::slots_per_epoch());
            self.chains.add_peer_or_create_chain(
                start_epoch,
                remote_info.head_root,
                remote_info.head_slot,
                peer_id,
                RangeSyncType::Head,
                &self.beacon_processor_send,
                network,
            );
            self.chains.update(network);
            self.chains.update_sync_state(network);
        }
@@ -245,23 +208,27 @@ impl<T: BeaconChainTypes> RangeSync<T> {
    request_id: RequestId,
    beacon_block: Option<SignedBeaconBlock<T::EthSpec>>,
) {
    // get the chain and batch for which this response belongs
    if let Some((chain_id, batch_id)) =
        network.blocks_by_range_response(request_id, beacon_block.is_none())
    {
        // check if this chunk removes the chain
        match self.chains.call_by_id(chain_id, |chain| {
            chain.on_block_response(network, batch_id, peer_id, beacon_block)
        }) {
            Ok((removed_chain, sync_type)) => {
                if let Some(removed_chain) = removed_chain {
                    debug!(self.log, "Chain removed after block response"; "sync_type" => ?sync_type, "chain_id" => chain_id);
                    removed_chain.status_peers(network);
                    // TODO: update & update_sync_state?
                }
            }
            Err(_) => {
                debug!(self.log, "BlocksByRange response for removed chain"; "chain" => chain_id)
            }
        }
    } else {
        warn!(self.log, "Response/Error for non registered request"; "request_id" => request_id)
    }
}
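For context, `blocks_by_range_response` resolves a request id back to the `(chain, batch)` that issued it, which is what lets block responses skip the O(n) search over chains that the removed code performed. A minimal sketch of such a registry; the struct and method internals are illustrative, not the actual `SyncNetworkContext` implementation:

```rust
use std::collections::HashMap;

type RequestId = usize;
type ChainId = u64;
type BatchId = u64; // the epoch the batch starts at

#[derive(Default)]
struct RequestRegistry {
    range_requests: HashMap<RequestId, (ChainId, BatchId)>,
}

impl RequestRegistry {
    /// Called when a batch request is sent out.
    fn register(&mut self, request_id: RequestId, chain: ChainId, batch: BatchId) {
        self.range_requests.insert(request_id, (chain, batch));
    }

    /// Called on each response chunk. Only the final chunk of the stream
    /// (`remove == true`) clears the entry; earlier chunks just look it up.
    fn blocks_by_range_response(
        &mut self,
        request_id: RequestId,
        remove: bool,
    ) -> Option<(ChainId, BatchId)> {
        if remove {
            self.range_requests.remove(&request_id)
        } else {
            self.range_requests.get(&request_id).copied()
        }
    }
}

fn main() {
    let mut reg = RequestRegistry::default();
    reg.register(10, 3, 96);
    // Mid-stream chunk: the entry stays for the remaining blocks.
    assert_eq!(reg.blocks_by_range_response(10, false), Some((3, 96)));
    // Stream termination: the entry is consumed.
    assert_eq!(reg.blocks_by_range_response(10, true), Some((3, 96)));
    assert_eq!(reg.blocks_by_range_response(10, true), None);
}
```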
@@ -269,76 +236,57 @@ impl<T: BeaconChainTypes> RangeSync<T> {
    &mut self,
    network: &mut SyncNetworkContext<T::EthSpec>,
    chain_id: ChainId,
    batch_id: Epoch,
    result: BatchProcessResult,
) {
    // check if this response removes the chain
    match self.chains.call_by_id(chain_id, |chain| {
        chain.on_batch_process_result(network, batch_id, &result)
    }) {
        Ok((None, _sync_type)) => {
            // Chain was found and not removed
        }
        Ok((Some(removed_chain), sync_type)) => {
            debug!(self.log, "Chain removed after processing result"; "chain" => chain_id, "sync_type" => ?sync_type);
            // Chain ended, re-status its peers
            removed_chain.status_peers(network);
            match sync_type {
                RangeSyncType::Finalized => {
                    // update the state of the collection
                    self.chains.update(network);
                    // set the state to a head sync if there are no finalized chains, to inform
                    // the manager that we are awaiting a head chain.
                    self.chains.set_head_sync();
                    // Update the global variables
                    self.chains.update_sync_state(network);
                    // if there are no more finalized chains, re-status all known peers
                    // awaiting a head sync
                    match self.chains.state() {
                        RangeSyncState::Idle | RangeSyncState::Head { .. } => {
                            network.status_peers(
                                self.beacon_chain.clone(),
                                self.awaiting_head_peers.drain(),
                            );
                        }
                        RangeSyncState::Finalized { .. } => {} // Have more finalized chains to complete
                    }
                }
                RangeSyncType::Head => {
                    // Remove non-syncing head chains and re-status the peers. This removes a
                    // build-up of potentially duplicate head chains. Any legitimate head
                    // chains will be re-established
                    self.chains.clear_head_chains(network);
                    // update the state of the collection
                    self.chains.update(network);
                    // update the global state and log any change
                    self.chains.update_sync_state(network);
                }
            }
        }
        Err(_) => {
            debug!(self.log, "BlocksByRange response for removed chain"; "chain" => chain_id)
        }
    }
}
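The `ProcessingResult` driving these branches carries the `must_use` decoration described in the PR summary, so a caller that ignores a `RemoveChain` verdict gets a compiler warning instead of a silently stalled chain. A toy demonstration; the enum body here is a sketch:

```rust
#[must_use = "chains that request removal must be removed by the caller"]
#[allow(dead_code)]
#[derive(Debug, PartialEq)]
enum ProcessingResult {
    KeepChain,
    RemoveChain,
}

fn poll_chain() -> ProcessingResult {
    ProcessingResult::RemoveChain
}

fn main() {
    // Warns: unused `ProcessingResult` that must be used. A forgotten
    // `RemoveChain` can no longer be dropped on the floor unnoticed.
    poll_chain();

    // The intended pattern, as in the handlers above:
    if let ProcessingResult::RemoveChain = poll_chain() {
        // remove the chain and re-status its peers
    }
}
```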
@@ -352,7 +300,7 @@ impl<T: BeaconChainTypes> RangeSync<T> {
    // if the peer is in the awaiting head mapping, remove it
    self.awaiting_head_peers.remove(peer_id);

    // remove the peer from any peer pool, failing its batches
    self.remove_peer(network, peer_id);

    // update the state of the collection
@@ -361,30 +309,17 @@ impl<T: BeaconChainTypes> RangeSync<T> {
    self.chains.update_sync_state(network);
}
/// When a peer gets removed, both the head and finalized chains need to be searched to check
/// which pool the peer is in. The chain may also have a batch or batches awaiting
/// this peer. If so, we mark the batch as failed. The batch may then hit its maximum
/// retries. In this case, we need to remove the chain and re-status all the peers.
fn remove_peer(&mut self, network: &mut SyncNetworkContext<T::EthSpec>, peer_id: &PeerId) {
    for (removed_chain, sync_type) in self
        .chains
        .call_all(|chain| chain.remove_peer(peer_id, network))
    {
        debug!(self.log, "Chain removed after removing peer"; "sync_type" => ?sync_type, "chain" => removed_chain.get_id());
        // TODO: anything else to do?
    }
}
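For context, `call_by_id` and `call_all` (used throughout these handlers) run a closure against one or all chains and hand back any chain that asked for removal, so the caller can re-status its peers. A rough, self-contained sketch; the real signatures also carry the chain's `RangeSyncType`:

```rust
use std::collections::HashMap;

type ChainId = u64;

enum ProcessingResult {
    KeepChain,
    RemoveChain,
}

struct Chain; // stand-in for SyncingChain<T>

#[derive(Default)]
struct ChainCollection {
    chains: HashMap<ChainId, Chain>,
}

impl ChainCollection {
    /// Runs `func` on the chain with this id. `Err(())` means the chain is
    /// unknown (e.g. already purged); `Ok(Some(chain))` hands back the chain
    /// that requested removal.
    fn call_by_id(
        &mut self,
        id: ChainId,
        func: impl FnOnce(&mut Chain) -> ProcessingResult,
    ) -> Result<Option<Chain>, ()> {
        let chain = self.chains.get_mut(&id).ok_or(())?;
        match func(chain) {
            ProcessingResult::KeepChain => Ok(None),
            ProcessingResult::RemoveChain => Ok(self.chains.remove(&id)),
        }
    }

    /// Runs `func` on every chain, draining those that requested removal.
    fn call_all(&mut self, mut func: impl FnMut(&mut Chain) -> ProcessingResult) -> Vec<Chain> {
        let remove: Vec<ChainId> = self
            .chains
            .iter_mut()
            .filter_map(|(id, chain)| match func(chain) {
                ProcessingResult::RemoveChain => Some(*id),
                ProcessingResult::KeepChain => None,
            })
            .collect();
        remove
            .into_iter()
            .filter_map(|id| self.chains.remove(&id))
            .collect()
    }
}

fn main() {
    let mut collection = ChainCollection::default();
    collection.chains.insert(1, Chain);
    collection.chains.insert(2, Chain);
    // Unknown ids surface as `Err`, e.g. a response for a purged chain.
    assert!(collection
        .call_by_id(99, |_| ProcessingResult::KeepChain)
        .is_err());
    // Remove every chain, as `remove_peer` might when batches fail.
    let removed = collection.call_all(|_| ProcessingResult::RemoveChain);
    assert_eq!(removed.len(), 2);
    assert!(collection.chains.is_empty());
}
```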
@@ -398,17 +333,25 @@ impl<T: BeaconChainTypes> RangeSync<T> {
    peer_id: PeerId,
    request_id: RequestId,
) {
    // get the chain and batch for which this response belongs
    if let Some((chain_id, batch_id)) = network.blocks_by_range_response(request_id, true) {
        // check that this request is pending
        match self.chains.call_by_id(chain_id, |chain| {
            chain.inject_error(network, batch_id, peer_id)
        }) {
            Ok((removed_chain, sync_type)) => {
                if let Some(removed_chain) = removed_chain {
                    debug!(self.log, "Chain removed on rpc error"; "sync_type" => ?sync_type, "chain" => removed_chain.get_id());
                    removed_chain.status_peers(network);
                    // TODO: update & update_sync_state?
                }
            }
            Err(_) => {
                debug!(self.log, "BlocksByRange response for removed chain"; "chain" => chain_id)
            }
        }
    } else {
        warn!(self.log, "Response/Error for non registered request"; "request_id" => request_id)
    }
}
}
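A side note on the logging syntax used throughout these handlers: `%value` logs via `Display` and `?value` via `Debug`, which is why `RangeSyncType` gains `#[derive(Debug)]` in the file below. A minimal sketch, assuming only the `slog` crate with a discard drain:

```rust
use slog::{debug, o, Logger};

#[allow(dead_code)]
#[derive(Debug)]
enum SyncType {
    Finalized,
    Head,
}

fn main() {
    // A discard drain keeps the example dependency-free beyond `slog`.
    let log = Logger::root(slog::Discard, o!());
    let peer_id = "peer_a".to_string();
    let sync_type = SyncType::Head;
    // `%` formats with Display, `?` with Debug.
    debug!(log, "Peer joined"; "peer_id" => %peer_id, "sync_type" => ?sync_type);
}
```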

@@ -6,6 +6,7 @@ use beacon_chain::{BeaconChain, BeaconChainTypes};
use std::sync::Arc;

/// The type of Range sync that should be done relative to our current state.
#[derive(Debug)]
pub enum RangeSyncType {
    /// A finalized chain sync should be started with this peer.
    Finalized,