lighthouse/beacon_node/lighthouse_network/src/rpc/handler.rs

1021 lines
46 KiB
Rust
Raw Normal View History

#![allow(clippy::type_complexity)]
#![allow(clippy::cognitive_complexity)]
use super::methods::{
GoodbyeReason, RPCCodedResponse, RPCResponseErrorCode, RequestId, ResponseTermination,
};
use super::outbound::OutboundRequestContainer;
use super::protocol::{InboundRequest, Protocol, RPCError, RPCProtocol};
use super::{RPCReceived, RPCSend};
use crate::rpc::outbound::{OutboundFramed, OutboundRequest};
use crate::rpc::protocol::InboundFramed;
2019-07-09 05:44:23 +00:00
use fnv::FnvHashMap;
use futures::prelude::*;
Update to tokio 1.1 (#2172) ## Issue Addressed resolves #2129 resolves #2099 addresses some of #1712 unblocks #2076 unblocks #2153 ## Proposed Changes - Updates all the dependencies mentioned in #2129, except for web3. They haven't merged their tokio 1.0 update because they are waiting on some dependencies of their own. Since we only use web3 in tests, I think updating it in a separate issue is fine. If they are able to merge soon though, I can update in this PR. - Updates `tokio_util` to 0.6.2 and `bytes` to 1.0.1. - We haven't made a discv5 release since merging tokio 1.0 updates so I'm using a commit rather than release atm. **Edit:** I think we should merge an update of `tokio_util` to 0.6.2 into discv5 before this release because it has panic fixes in `DelayQueue` --> PR in discv5: https://github.com/sigp/discv5/pull/58 ## Additional Info tokio 1.0 changes that required some changes in lighthouse: - `interval.next().await.is_some()` -> `interval.tick().await` - `sleep` future is now `!Unpin` -> https://github.com/tokio-rs/tokio/issues/3028 - `try_recv` has been temporarily removed from `mpsc` -> https://github.com/tokio-rs/tokio/issues/3350 - stream features have moved to `tokio-stream` and `broadcast::Receiver::into_stream()` has been temporarily removed -> `https://github.com/tokio-rs/tokio/issues/2870 - I've copied over the `BroadcastStream` wrapper from this PR, but can update to use `tokio-stream` once it's merged https://github.com/tokio-rs/tokio/pull/3384 Co-authored-by: realbigsean <seananderson33@gmail.com>
2021-02-10 23:29:49 +00:00
use futures::{Sink, SinkExt};
use libp2p::core::upgrade::{
InboundUpgrade, NegotiationError, OutboundUpgrade, ProtocolError, UpgradeError,
};
use libp2p::swarm::protocols_handler::{
2019-07-09 05:44:23 +00:00
KeepAlive, ProtocolsHandler, ProtocolsHandlerEvent, ProtocolsHandlerUpgrErr, SubstreamProtocol,
};
Stable futures (#879) * Port eth1 lib to use stable futures * Port eth1_test_rig to stable futures * Port eth1 tests to stable futures * Port genesis service to stable futures * Port genesis tests to stable futures * Port beacon_chain to stable futures * Port lcli to stable futures * Fix eth1_test_rig (#1014) * Fix lcli * Port timer to stable futures * Fix timer * Port websocket_server to stable futures * Port notifier to stable futures * Add TODOS * Update hashmap hashset to stable futures * Adds panic test to hashset delay * Port remote_beacon_node to stable futures * Fix lcli merge conflicts * Non rpc stuff compiles * protocol.rs compiles * Port websockets, timer and notifier to stable futures (#1035) * Fix lcli * Port timer to stable futures * Fix timer * Port websocket_server to stable futures * Port notifier to stable futures * Add TODOS * Port remote_beacon_node to stable futures * Partial eth2-libp2p stable future upgrade * Finished first round of fighting RPC types * Further progress towards porting eth2-libp2p adds caching to discovery * Update behaviour * RPC handler to stable futures * Update RPC to master libp2p * Network service additions * Fix the fallback transport construction (#1102) * Correct warning * Remove hashmap delay * Compiling version of eth2-libp2p * Update all crates versions * Fix conversion function and add tests (#1113) * Port validator_client to stable futures (#1114) * Add PH & MS slot clock changes * Account for genesis time * Add progress on duties refactor * Add simple is_aggregator bool to val subscription * Start work on attestation_verification.rs * Add progress on ObservedAttestations * Progress with ObservedAttestations * Fix tests * Add observed attestations to the beacon chain * Add attestation observation to processing code * Add progress on attestation verification * Add first draft of ObservedAttesters * Add more tests * Add observed attesters to beacon chain * Add observers to attestation processing * Add more attestation verification * Create ObservedAggregators map * Remove commented-out code * Add observed aggregators into chain * Add progress * Finish adding features to attestation verification * Ensure beacon chain compiles * Link attn verification into chain * Integrate new attn verification in chain * Remove old attestation processing code * Start trying to fix beacon_chain tests * Split adding into pools into two functions * Add aggregation to harness * Get test harness working again * Adjust the number of aggregators for test harness * Fix edge-case in harness * Integrate new attn processing in network * Fix compile bug in validator_client * Update validator API endpoints * Fix aggreagation in test harness * Fix enum thing * Fix attestation observation bug: * Patch failing API tests * Start adding comments to attestation verification * Remove unused attestation field * Unify "is block known" logic * Update comments * Supress fork choice errors for network processing * Add todos * Tidy * Add gossip attn tests * Disallow test harness to produce old attns * Comment out in-progress tests * Partially address pruning tests * Fix failing store test * Add aggregate tests * Add comments about which spec conditions we check * Dont re-aggregate * Split apart test harness attn production * Fix compile error in network * Make progress on commented-out test * Fix skipping attestation test * Add fork choice verification tests * Tidy attn tests, remove dead code * Remove some accidentally added code * Fix clippy lint * Rename test file * Add block tests, add cheap block proposer check * Rename block testing file * Add observed_block_producers * Tidy * Switch around block signature verification * Finish block testing * Remove gossip from signature tests * First pass of self review * Fix deviation in spec * Update test spec tags * Start moving over to hashset * Finish moving observed attesters to hashmap * Move aggregation pool over to hashmap * Make fc attn borrow again * Fix rest_api compile error * Fix missing comments * Fix monster test * Uncomment increasing slots test * Address remaining comments * Remove unsafe, use cfg test * Remove cfg test flag * Fix dodgy comment * Revert "Update hashmap hashset to stable futures" This reverts commit d432378a3cc5cd67fc29c0b15b96b886c1323554. * Revert "Adds panic test to hashset delay" This reverts commit 281502396fc5b90d9c421a309c2c056982c9525b. * Ported attestation_service * Ported duties_service * Ported fork_service * More ports * Port block_service * Minor fixes * VC compiles * Update TODOS * Borrow self where possible * Ignore aggregates that are already known. * Unify aggregator modulo logic * Fix typo in logs * Refactor validator subscription logic * Avoid reproducing selection proof * Skip HTTP call if no subscriptions * Rename DutyAndState -> DutyAndProof * Tidy logs * Print root as dbg * Fix compile errors in tests * Fix compile error in test * Re-Fix attestation and duties service * Minor fixes Co-authored-by: Paul Hauner <paul@paulhauner.com> * Network crate update to stable futures * Port account_manager to stable futures (#1121) * Port account_manager to stable futures * Run async fns in tokio environment * Port rest_api crate to stable futures (#1118) * Port rest_api lib to stable futures * Reduce tokio features * Update notifier to stable futures * Builder update * Further updates * Convert self referential async functions * stable futures fixes (#1124) * Fix eth1 update functions * Fix genesis and client * Fix beacon node lib * Return appropriate runtimes from environment * Fix test rig * Refactor eth1 service update * Upgrade simulator to stable futures * Lighthouse compiles on stable futures * Remove println debugging statement * Update libp2p service, start rpc test upgrade * Update network crate for new libp2p * Update tokio::codec to futures_codec (#1128) * Further work towards RPC corrections * Correct http timeout and network service select * Use tokio runtime for libp2p * Revert "Update tokio::codec to futures_codec (#1128)" This reverts commit e57aea924acf5cbabdcea18895ac07e38a425ed7. * Upgrade RPC libp2p tests * Upgrade secio fallback test * Upgrade gossipsub examples * Clean up RPC protocol * Test fixes (#1133) * Correct websocket timeout and run on os thread * Fix network test * Clean up PR * Correct tokio tcp move attestation service tests * Upgrade attestation service tests * Correct network test * Correct genesis test * Test corrections * Log info when block is received * Modify logs and update attester service events * Stable futures: fixes to vc, eth1 and account manager (#1142) * Add local testnet scripts * Remove whiteblock script * Rename local testnet script * Move spawns onto handle * Fix VC panic * Initial fix to block production issue * Tidy block producer fix * Tidy further * Add local testnet clean script * Run cargo fmt * Tidy duties service * Tidy fork service * Tidy ForkService * Tidy AttestationService * Tidy notifier * Ensure await is not suppressed in eth1 * Ensure await is not suppressed in account_manager * Use .ok() instead of .unwrap_or(()) * RPC decoding test for proto * Update discv5 and eth2-libp2p deps * Fix lcli double runtime issue (#1144) * Handle stream termination and dialing peer errors * Correct peer_info variant types * Remove unnecessary warnings * Handle subnet unsubscription removal and improve logigng * Add logs around ping * Upgrade discv5 and improve logging * Handle peer connection status for multiple connections * Improve network service logging * Improve logging around peer manager * Upgrade swarm poll centralise peer management * Identify clients on error * Fix `remove_peer` in sync (#1150) * remove_peer removes from all chains * Remove logs * Fix early return from loop * Improved logging, fix panic * Partially correct tests * Stable futures: Vc sync (#1149) * Improve syncing heuristic * Add comments * Use safer method for tolerance * Fix tests * Stable futures: Fix VC bug, update agg pool, add more metrics (#1151) * Expose epoch processing summary * Expose participation metrics to prometheus * Switch to f64 * Reduce precision * Change precision * Expose observed attesters metrics * Add metrics for agg/unagg attn counts * Add metrics for gossip rx * Add metrics for gossip tx * Adds ignored attns to prom * Add attestation timing * Add timer for aggregation pool sig agg * Add write lock timer for agg pool * Add more metrics to agg pool * Change map lock code * Add extra metric to agg pool * Change lock handling in agg pool * Change .write() to .read() * Add another agg pool timer * Fix for is_aggregator * Fix pruning bug Co-authored-by: pawan <pawandhananjay@gmail.com> Co-authored-by: Paul Hauner <paul@paulhauner.com>
2020-05-17 11:16:48 +00:00
use libp2p::swarm::NegotiatedSubstream;
use slog::{crit, debug, trace, warn};
use smallvec::SmallVec;
Stable futures (#879) * Port eth1 lib to use stable futures * Port eth1_test_rig to stable futures * Port eth1 tests to stable futures * Port genesis service to stable futures * Port genesis tests to stable futures * Port beacon_chain to stable futures * Port lcli to stable futures * Fix eth1_test_rig (#1014) * Fix lcli * Port timer to stable futures * Fix timer * Port websocket_server to stable futures * Port notifier to stable futures * Add TODOS * Update hashmap hashset to stable futures * Adds panic test to hashset delay * Port remote_beacon_node to stable futures * Fix lcli merge conflicts * Non rpc stuff compiles * protocol.rs compiles * Port websockets, timer and notifier to stable futures (#1035) * Fix lcli * Port timer to stable futures * Fix timer * Port websocket_server to stable futures * Port notifier to stable futures * Add TODOS * Port remote_beacon_node to stable futures * Partial eth2-libp2p stable future upgrade * Finished first round of fighting RPC types * Further progress towards porting eth2-libp2p adds caching to discovery * Update behaviour * RPC handler to stable futures * Update RPC to master libp2p * Network service additions * Fix the fallback transport construction (#1102) * Correct warning * Remove hashmap delay * Compiling version of eth2-libp2p * Update all crates versions * Fix conversion function and add tests (#1113) * Port validator_client to stable futures (#1114) * Add PH & MS slot clock changes * Account for genesis time * Add progress on duties refactor * Add simple is_aggregator bool to val subscription * Start work on attestation_verification.rs * Add progress on ObservedAttestations * Progress with ObservedAttestations * Fix tests * Add observed attestations to the beacon chain * Add attestation observation to processing code * Add progress on attestation verification * Add first draft of ObservedAttesters * Add more tests * Add observed attesters to beacon chain * Add observers to attestation processing * Add more attestation verification * Create ObservedAggregators map * Remove commented-out code * Add observed aggregators into chain * Add progress * Finish adding features to attestation verification * Ensure beacon chain compiles * Link attn verification into chain * Integrate new attn verification in chain * Remove old attestation processing code * Start trying to fix beacon_chain tests * Split adding into pools into two functions * Add aggregation to harness * Get test harness working again * Adjust the number of aggregators for test harness * Fix edge-case in harness * Integrate new attn processing in network * Fix compile bug in validator_client * Update validator API endpoints * Fix aggreagation in test harness * Fix enum thing * Fix attestation observation bug: * Patch failing API tests * Start adding comments to attestation verification * Remove unused attestation field * Unify "is block known" logic * Update comments * Supress fork choice errors for network processing * Add todos * Tidy * Add gossip attn tests * Disallow test harness to produce old attns * Comment out in-progress tests * Partially address pruning tests * Fix failing store test * Add aggregate tests * Add comments about which spec conditions we check * Dont re-aggregate * Split apart test harness attn production * Fix compile error in network * Make progress on commented-out test * Fix skipping attestation test * Add fork choice verification tests * Tidy attn tests, remove dead code * Remove some accidentally added code * Fix clippy lint * Rename test file * Add block tests, add cheap block proposer check * Rename block testing file * Add observed_block_producers * Tidy * Switch around block signature verification * Finish block testing * Remove gossip from signature tests * First pass of self review * Fix deviation in spec * Update test spec tags * Start moving over to hashset * Finish moving observed attesters to hashmap * Move aggregation pool over to hashmap * Make fc attn borrow again * Fix rest_api compile error * Fix missing comments * Fix monster test * Uncomment increasing slots test * Address remaining comments * Remove unsafe, use cfg test * Remove cfg test flag * Fix dodgy comment * Revert "Update hashmap hashset to stable futures" This reverts commit d432378a3cc5cd67fc29c0b15b96b886c1323554. * Revert "Adds panic test to hashset delay" This reverts commit 281502396fc5b90d9c421a309c2c056982c9525b. * Ported attestation_service * Ported duties_service * Ported fork_service * More ports * Port block_service * Minor fixes * VC compiles * Update TODOS * Borrow self where possible * Ignore aggregates that are already known. * Unify aggregator modulo logic * Fix typo in logs * Refactor validator subscription logic * Avoid reproducing selection proof * Skip HTTP call if no subscriptions * Rename DutyAndState -> DutyAndProof * Tidy logs * Print root as dbg * Fix compile errors in tests * Fix compile error in test * Re-Fix attestation and duties service * Minor fixes Co-authored-by: Paul Hauner <paul@paulhauner.com> * Network crate update to stable futures * Port account_manager to stable futures (#1121) * Port account_manager to stable futures * Run async fns in tokio environment * Port rest_api crate to stable futures (#1118) * Port rest_api lib to stable futures * Reduce tokio features * Update notifier to stable futures * Builder update * Further updates * Convert self referential async functions * stable futures fixes (#1124) * Fix eth1 update functions * Fix genesis and client * Fix beacon node lib * Return appropriate runtimes from environment * Fix test rig * Refactor eth1 service update * Upgrade simulator to stable futures * Lighthouse compiles on stable futures * Remove println debugging statement * Update libp2p service, start rpc test upgrade * Update network crate for new libp2p * Update tokio::codec to futures_codec (#1128) * Further work towards RPC corrections * Correct http timeout and network service select * Use tokio runtime for libp2p * Revert "Update tokio::codec to futures_codec (#1128)" This reverts commit e57aea924acf5cbabdcea18895ac07e38a425ed7. * Upgrade RPC libp2p tests * Upgrade secio fallback test * Upgrade gossipsub examples * Clean up RPC protocol * Test fixes (#1133) * Correct websocket timeout and run on os thread * Fix network test * Clean up PR * Correct tokio tcp move attestation service tests * Upgrade attestation service tests * Correct network test * Correct genesis test * Test corrections * Log info when block is received * Modify logs and update attester service events * Stable futures: fixes to vc, eth1 and account manager (#1142) * Add local testnet scripts * Remove whiteblock script * Rename local testnet script * Move spawns onto handle * Fix VC panic * Initial fix to block production issue * Tidy block producer fix * Tidy further * Add local testnet clean script * Run cargo fmt * Tidy duties service * Tidy fork service * Tidy ForkService * Tidy AttestationService * Tidy notifier * Ensure await is not suppressed in eth1 * Ensure await is not suppressed in account_manager * Use .ok() instead of .unwrap_or(()) * RPC decoding test for proto * Update discv5 and eth2-libp2p deps * Fix lcli double runtime issue (#1144) * Handle stream termination and dialing peer errors * Correct peer_info variant types * Remove unnecessary warnings * Handle subnet unsubscription removal and improve logigng * Add logs around ping * Upgrade discv5 and improve logging * Handle peer connection status for multiple connections * Improve network service logging * Improve logging around peer manager * Upgrade swarm poll centralise peer management * Identify clients on error * Fix `remove_peer` in sync (#1150) * remove_peer removes from all chains * Remove logs * Fix early return from loop * Improved logging, fix panic * Partially correct tests * Stable futures: Vc sync (#1149) * Improve syncing heuristic * Add comments * Use safer method for tolerance * Fix tests * Stable futures: Fix VC bug, update agg pool, add more metrics (#1151) * Expose epoch processing summary * Expose participation metrics to prometheus * Switch to f64 * Reduce precision * Change precision * Expose observed attesters metrics * Add metrics for agg/unagg attn counts * Add metrics for gossip rx * Add metrics for gossip tx * Adds ignored attns to prom * Add attestation timing * Add timer for aggregation pool sig agg * Add write lock timer for agg pool * Add more metrics to agg pool * Change map lock code * Add extra metric to agg pool * Change lock handling in agg pool * Change .write() to .read() * Add another agg pool timer * Fix for is_aggregator * Fix pruning bug Co-authored-by: pawan <pawandhananjay@gmail.com> Co-authored-by: Paul Hauner <paul@paulhauner.com>
2020-05-17 11:16:48 +00:00
use std::{
Investigate and correct RPC Response Timeouts (#2804) RPC Responses are for some reason not removing their timeout when they are completing. As an example: ``` Nov 09 01:18:20.256 DEBG Received BlocksByRange Request step: 1, start_slot: 728465, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:20.263 DEBG Received BlocksByRange Request step: 1, start_slot: 728593, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:20.483 DEBG BlocksByRange Response sent returned: 63, requested: 64, current_slot: 2466389, start_slot: 728465, msg: Failed to return all requested blocks, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:20.500 DEBG BlocksByRange Response sent returned: 64, requested: 64, current_slot: 2466389, start_slot: 728593, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:21.068 DEBG Received BlocksByRange Request step: 1, start_slot: 728529, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:21.272 DEBG BlocksByRange Response sent returned: 63, requested: 64, current_slot: 2466389, start_slot: 728529, msg: Failed to return all requested blocks, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:23.434 DEBG Received BlocksByRange Request step: 1, start_slot: 728657, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:23.665 DEBG BlocksByRange Response sent returned: 64, requested: 64, current_slot: 2466390, start_slot: 728657, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:25.851 DEBG Received BlocksByRange Request step: 1, start_slot: 728337, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:25.851 DEBG Received BlocksByRange Request step: 1, start_slot: 728401, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:26.094 DEBG BlocksByRange Response sent returned: 62, requested: 64, current_slot: 2466390, start_slot: 728401, msg: Failed to return all requested blocks, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:26.100 DEBG BlocksByRange Response sent returned: 63, requested: 64, current_slot: 2466390, start_slot: 728337, msg: Failed to return all requested blocks, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:31.070 DEBG RPC Error direction: Incoming, score: 0, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, client: Prysm: version: a80b1c252a9b4773493b41999769bf3134ac373f, os_version: unknown, err: Stream Timeout, protocol: beacon_blocks_by_range, service: libp2p Nov 09 01:18:31.070 WARN Timed out to a peer's request. Likely insufficient resources, reduce peer count, service: libp2p Nov 09 01:18:31.085 DEBG RPC Error direction: Incoming, score: 0, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, client: Prysm: version: a80b1c252a9b4773493b41999769bf3134ac373f, os_version: unknown, err: Stream Timeout, protocol: beacon_blocks_by_range, service: libp2p Nov 09 01:18:31.085 WARN Timed out to a peer's request. Likely insufficient resources, reduce peer count, service: libp2p Nov 09 01:18:31.459 DEBG RPC Error direction: Incoming, score: 0, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, client: Prysm: version: a80b1c252a9b4773493b41999769bf3134ac373f, os_version: unknown, err: Stream Timeout, protocol: beacon_blocks_by_range, service: libp2p Nov 09 01:18:31.459 WARN Timed out to a peer's request. Likely insufficient resources, reduce peer count, service: libp2p Nov 09 01:18:34.129 DEBG RPC Error direction: Incoming, score: 0, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, client: Prysm: version: a80b1c252a9b4773493b41999769bf3134ac373f, os_version: unknown, err: Stream Timeout, protocol: beacon_blocks_by_range, service: libp2p Nov 09 01:18:34.130 WARN Timed out to a peer's request. Likely insufficient resources, reduce peer count, service: libp2p Nov 09 01:18:35.686 DEBG Peer Manager disconnecting peer reason: Too many peers, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, service: libp2p ``` This PR is to investigate and correct the issue. ~~My current thoughts are that for some reason we are not closing the streams correctly, or fast enough, or the executor is not registering the closes and waking up.~~ - Pretty sure this is not the case, see message below for a more accurate reason. ~~I've currently added a timeout to stream closures in an attempt to force streams to close and the future to always complete.~~ I removed this
2021-11-16 03:42:25 +00:00
collections::{hash_map::Entry, VecDeque},
Stable futures (#879) * Port eth1 lib to use stable futures * Port eth1_test_rig to stable futures * Port eth1 tests to stable futures * Port genesis service to stable futures * Port genesis tests to stable futures * Port beacon_chain to stable futures * Port lcli to stable futures * Fix eth1_test_rig (#1014) * Fix lcli * Port timer to stable futures * Fix timer * Port websocket_server to stable futures * Port notifier to stable futures * Add TODOS * Update hashmap hashset to stable futures * Adds panic test to hashset delay * Port remote_beacon_node to stable futures * Fix lcli merge conflicts * Non rpc stuff compiles * protocol.rs compiles * Port websockets, timer and notifier to stable futures (#1035) * Fix lcli * Port timer to stable futures * Fix timer * Port websocket_server to stable futures * Port notifier to stable futures * Add TODOS * Port remote_beacon_node to stable futures * Partial eth2-libp2p stable future upgrade * Finished first round of fighting RPC types * Further progress towards porting eth2-libp2p adds caching to discovery * Update behaviour * RPC handler to stable futures * Update RPC to master libp2p * Network service additions * Fix the fallback transport construction (#1102) * Correct warning * Remove hashmap delay * Compiling version of eth2-libp2p * Update all crates versions * Fix conversion function and add tests (#1113) * Port validator_client to stable futures (#1114) * Add PH & MS slot clock changes * Account for genesis time * Add progress on duties refactor * Add simple is_aggregator bool to val subscription * Start work on attestation_verification.rs * Add progress on ObservedAttestations * Progress with ObservedAttestations * Fix tests * Add observed attestations to the beacon chain * Add attestation observation to processing code * Add progress on attestation verification * Add first draft of ObservedAttesters * Add more tests * Add observed attesters to beacon chain * Add observers to attestation processing * Add more attestation verification * Create ObservedAggregators map * Remove commented-out code * Add observed aggregators into chain * Add progress * Finish adding features to attestation verification * Ensure beacon chain compiles * Link attn verification into chain * Integrate new attn verification in chain * Remove old attestation processing code * Start trying to fix beacon_chain tests * Split adding into pools into two functions * Add aggregation to harness * Get test harness working again * Adjust the number of aggregators for test harness * Fix edge-case in harness * Integrate new attn processing in network * Fix compile bug in validator_client * Update validator API endpoints * Fix aggreagation in test harness * Fix enum thing * Fix attestation observation bug: * Patch failing API tests * Start adding comments to attestation verification * Remove unused attestation field * Unify "is block known" logic * Update comments * Supress fork choice errors for network processing * Add todos * Tidy * Add gossip attn tests * Disallow test harness to produce old attns * Comment out in-progress tests * Partially address pruning tests * Fix failing store test * Add aggregate tests * Add comments about which spec conditions we check * Dont re-aggregate * Split apart test harness attn production * Fix compile error in network * Make progress on commented-out test * Fix skipping attestation test * Add fork choice verification tests * Tidy attn tests, remove dead code * Remove some accidentally added code * Fix clippy lint * Rename test file * Add block tests, add cheap block proposer check * Rename block testing file * Add observed_block_producers * Tidy * Switch around block signature verification * Finish block testing * Remove gossip from signature tests * First pass of self review * Fix deviation in spec * Update test spec tags * Start moving over to hashset * Finish moving observed attesters to hashmap * Move aggregation pool over to hashmap * Make fc attn borrow again * Fix rest_api compile error * Fix missing comments * Fix monster test * Uncomment increasing slots test * Address remaining comments * Remove unsafe, use cfg test * Remove cfg test flag * Fix dodgy comment * Revert "Update hashmap hashset to stable futures" This reverts commit d432378a3cc5cd67fc29c0b15b96b886c1323554. * Revert "Adds panic test to hashset delay" This reverts commit 281502396fc5b90d9c421a309c2c056982c9525b. * Ported attestation_service * Ported duties_service * Ported fork_service * More ports * Port block_service * Minor fixes * VC compiles * Update TODOS * Borrow self where possible * Ignore aggregates that are already known. * Unify aggregator modulo logic * Fix typo in logs * Refactor validator subscription logic * Avoid reproducing selection proof * Skip HTTP call if no subscriptions * Rename DutyAndState -> DutyAndProof * Tidy logs * Print root as dbg * Fix compile errors in tests * Fix compile error in test * Re-Fix attestation and duties service * Minor fixes Co-authored-by: Paul Hauner <paul@paulhauner.com> * Network crate update to stable futures * Port account_manager to stable futures (#1121) * Port account_manager to stable futures * Run async fns in tokio environment * Port rest_api crate to stable futures (#1118) * Port rest_api lib to stable futures * Reduce tokio features * Update notifier to stable futures * Builder update * Further updates * Convert self referential async functions * stable futures fixes (#1124) * Fix eth1 update functions * Fix genesis and client * Fix beacon node lib * Return appropriate runtimes from environment * Fix test rig * Refactor eth1 service update * Upgrade simulator to stable futures * Lighthouse compiles on stable futures * Remove println debugging statement * Update libp2p service, start rpc test upgrade * Update network crate for new libp2p * Update tokio::codec to futures_codec (#1128) * Further work towards RPC corrections * Correct http timeout and network service select * Use tokio runtime for libp2p * Revert "Update tokio::codec to futures_codec (#1128)" This reverts commit e57aea924acf5cbabdcea18895ac07e38a425ed7. * Upgrade RPC libp2p tests * Upgrade secio fallback test * Upgrade gossipsub examples * Clean up RPC protocol * Test fixes (#1133) * Correct websocket timeout and run on os thread * Fix network test * Clean up PR * Correct tokio tcp move attestation service tests * Upgrade attestation service tests * Correct network test * Correct genesis test * Test corrections * Log info when block is received * Modify logs and update attester service events * Stable futures: fixes to vc, eth1 and account manager (#1142) * Add local testnet scripts * Remove whiteblock script * Rename local testnet script * Move spawns onto handle * Fix VC panic * Initial fix to block production issue * Tidy block producer fix * Tidy further * Add local testnet clean script * Run cargo fmt * Tidy duties service * Tidy fork service * Tidy ForkService * Tidy AttestationService * Tidy notifier * Ensure await is not suppressed in eth1 * Ensure await is not suppressed in account_manager * Use .ok() instead of .unwrap_or(()) * RPC decoding test for proto * Update discv5 and eth2-libp2p deps * Fix lcli double runtime issue (#1144) * Handle stream termination and dialing peer errors * Correct peer_info variant types * Remove unnecessary warnings * Handle subnet unsubscription removal and improve logigng * Add logs around ping * Upgrade discv5 and improve logging * Handle peer connection status for multiple connections * Improve network service logging * Improve logging around peer manager * Upgrade swarm poll centralise peer management * Identify clients on error * Fix `remove_peer` in sync (#1150) * remove_peer removes from all chains * Remove logs * Fix early return from loop * Improved logging, fix panic * Partially correct tests * Stable futures: Vc sync (#1149) * Improve syncing heuristic * Add comments * Use safer method for tolerance * Fix tests * Stable futures: Fix VC bug, update agg pool, add more metrics (#1151) * Expose epoch processing summary * Expose participation metrics to prometheus * Switch to f64 * Reduce precision * Change precision * Expose observed attesters metrics * Add metrics for agg/unagg attn counts * Add metrics for gossip rx * Add metrics for gossip tx * Adds ignored attns to prom * Add attestation timing * Add timer for aggregation pool sig agg * Add write lock timer for agg pool * Add more metrics to agg pool * Change map lock code * Add extra metric to agg pool * Change lock handling in agg pool * Change .write() to .read() * Add another agg pool timer * Fix for is_aggregator * Fix pruning bug Co-authored-by: pawan <pawandhananjay@gmail.com> Co-authored-by: Paul Hauner <paul@paulhauner.com>
2020-05-17 11:16:48 +00:00
pin::Pin,
sync::Arc,
Stable futures (#879) * Port eth1 lib to use stable futures * Port eth1_test_rig to stable futures * Port eth1 tests to stable futures * Port genesis service to stable futures * Port genesis tests to stable futures * Port beacon_chain to stable futures * Port lcli to stable futures * Fix eth1_test_rig (#1014) * Fix lcli * Port timer to stable futures * Fix timer * Port websocket_server to stable futures * Port notifier to stable futures * Add TODOS * Update hashmap hashset to stable futures * Adds panic test to hashset delay * Port remote_beacon_node to stable futures * Fix lcli merge conflicts * Non rpc stuff compiles * protocol.rs compiles * Port websockets, timer and notifier to stable futures (#1035) * Fix lcli * Port timer to stable futures * Fix timer * Port websocket_server to stable futures * Port notifier to stable futures * Add TODOS * Port remote_beacon_node to stable futures * Partial eth2-libp2p stable future upgrade * Finished first round of fighting RPC types * Further progress towards porting eth2-libp2p adds caching to discovery * Update behaviour * RPC handler to stable futures * Update RPC to master libp2p * Network service additions * Fix the fallback transport construction (#1102) * Correct warning * Remove hashmap delay * Compiling version of eth2-libp2p * Update all crates versions * Fix conversion function and add tests (#1113) * Port validator_client to stable futures (#1114) * Add PH & MS slot clock changes * Account for genesis time * Add progress on duties refactor * Add simple is_aggregator bool to val subscription * Start work on attestation_verification.rs * Add progress on ObservedAttestations * Progress with ObservedAttestations * Fix tests * Add observed attestations to the beacon chain * Add attestation observation to processing code * Add progress on attestation verification * Add first draft of ObservedAttesters * Add more tests * Add observed attesters to beacon chain * Add observers to attestation processing * Add more attestation verification * Create ObservedAggregators map * Remove commented-out code * Add observed aggregators into chain * Add progress * Finish adding features to attestation verification * Ensure beacon chain compiles * Link attn verification into chain * Integrate new attn verification in chain * Remove old attestation processing code * Start trying to fix beacon_chain tests * Split adding into pools into two functions * Add aggregation to harness * Get test harness working again * Adjust the number of aggregators for test harness * Fix edge-case in harness * Integrate new attn processing in network * Fix compile bug in validator_client * Update validator API endpoints * Fix aggreagation in test harness * Fix enum thing * Fix attestation observation bug: * Patch failing API tests * Start adding comments to attestation verification * Remove unused attestation field * Unify "is block known" logic * Update comments * Supress fork choice errors for network processing * Add todos * Tidy * Add gossip attn tests * Disallow test harness to produce old attns * Comment out in-progress tests * Partially address pruning tests * Fix failing store test * Add aggregate tests * Add comments about which spec conditions we check * Dont re-aggregate * Split apart test harness attn production * Fix compile error in network * Make progress on commented-out test * Fix skipping attestation test * Add fork choice verification tests * Tidy attn tests, remove dead code * Remove some accidentally added code * Fix clippy lint * Rename test file * Add block tests, add cheap block proposer check * Rename block testing file * Add observed_block_producers * Tidy * Switch around block signature verification * Finish block testing * Remove gossip from signature tests * First pass of self review * Fix deviation in spec * Update test spec tags * Start moving over to hashset * Finish moving observed attesters to hashmap * Move aggregation pool over to hashmap * Make fc attn borrow again * Fix rest_api compile error * Fix missing comments * Fix monster test * Uncomment increasing slots test * Address remaining comments * Remove unsafe, use cfg test * Remove cfg test flag * Fix dodgy comment * Revert "Update hashmap hashset to stable futures" This reverts commit d432378a3cc5cd67fc29c0b15b96b886c1323554. * Revert "Adds panic test to hashset delay" This reverts commit 281502396fc5b90d9c421a309c2c056982c9525b. * Ported attestation_service * Ported duties_service * Ported fork_service * More ports * Port block_service * Minor fixes * VC compiles * Update TODOS * Borrow self where possible * Ignore aggregates that are already known. * Unify aggregator modulo logic * Fix typo in logs * Refactor validator subscription logic * Avoid reproducing selection proof * Skip HTTP call if no subscriptions * Rename DutyAndState -> DutyAndProof * Tidy logs * Print root as dbg * Fix compile errors in tests * Fix compile error in test * Re-Fix attestation and duties service * Minor fixes Co-authored-by: Paul Hauner <paul@paulhauner.com> * Network crate update to stable futures * Port account_manager to stable futures (#1121) * Port account_manager to stable futures * Run async fns in tokio environment * Port rest_api crate to stable futures (#1118) * Port rest_api lib to stable futures * Reduce tokio features * Update notifier to stable futures * Builder update * Further updates * Convert self referential async functions * stable futures fixes (#1124) * Fix eth1 update functions * Fix genesis and client * Fix beacon node lib * Return appropriate runtimes from environment * Fix test rig * Refactor eth1 service update * Upgrade simulator to stable futures * Lighthouse compiles on stable futures * Remove println debugging statement * Update libp2p service, start rpc test upgrade * Update network crate for new libp2p * Update tokio::codec to futures_codec (#1128) * Further work towards RPC corrections * Correct http timeout and network service select * Use tokio runtime for libp2p * Revert "Update tokio::codec to futures_codec (#1128)" This reverts commit e57aea924acf5cbabdcea18895ac07e38a425ed7. * Upgrade RPC libp2p tests * Upgrade secio fallback test * Upgrade gossipsub examples * Clean up RPC protocol * Test fixes (#1133) * Correct websocket timeout and run on os thread * Fix network test * Clean up PR * Correct tokio tcp move attestation service tests * Upgrade attestation service tests * Correct network test * Correct genesis test * Test corrections * Log info when block is received * Modify logs and update attester service events * Stable futures: fixes to vc, eth1 and account manager (#1142) * Add local testnet scripts * Remove whiteblock script * Rename local testnet script * Move spawns onto handle * Fix VC panic * Initial fix to block production issue * Tidy block producer fix * Tidy further * Add local testnet clean script * Run cargo fmt * Tidy duties service * Tidy fork service * Tidy ForkService * Tidy AttestationService * Tidy notifier * Ensure await is not suppressed in eth1 * Ensure await is not suppressed in account_manager * Use .ok() instead of .unwrap_or(()) * RPC decoding test for proto * Update discv5 and eth2-libp2p deps * Fix lcli double runtime issue (#1144) * Handle stream termination and dialing peer errors * Correct peer_info variant types * Remove unnecessary warnings * Handle subnet unsubscription removal and improve logigng * Add logs around ping * Upgrade discv5 and improve logging * Handle peer connection status for multiple connections * Improve network service logging * Improve logging around peer manager * Upgrade swarm poll centralise peer management * Identify clients on error * Fix `remove_peer` in sync (#1150) * remove_peer removes from all chains * Remove logs * Fix early return from loop * Improved logging, fix panic * Partially correct tests * Stable futures: Vc sync (#1149) * Improve syncing heuristic * Add comments * Use safer method for tolerance * Fix tests * Stable futures: Fix VC bug, update agg pool, add more metrics (#1151) * Expose epoch processing summary * Expose participation metrics to prometheus * Switch to f64 * Reduce precision * Change precision * Expose observed attesters metrics * Add metrics for agg/unagg attn counts * Add metrics for gossip rx * Add metrics for gossip tx * Adds ignored attns to prom * Add attestation timing * Add timer for aggregation pool sig agg * Add write lock timer for agg pool * Add more metrics to agg pool * Change map lock code * Add extra metric to agg pool * Change lock handling in agg pool * Change .write() to .read() * Add another agg pool timer * Fix for is_aggregator * Fix pruning bug Co-authored-by: pawan <pawandhananjay@gmail.com> Co-authored-by: Paul Hauner <paul@paulhauner.com>
2020-05-17 11:16:48 +00:00
task::{Context, Poll},
Investigate and correct RPC Response Timeouts (#2804) RPC Responses are for some reason not removing their timeout when they are completing. As an example: ``` Nov 09 01:18:20.256 DEBG Received BlocksByRange Request step: 1, start_slot: 728465, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:20.263 DEBG Received BlocksByRange Request step: 1, start_slot: 728593, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:20.483 DEBG BlocksByRange Response sent returned: 63, requested: 64, current_slot: 2466389, start_slot: 728465, msg: Failed to return all requested blocks, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:20.500 DEBG BlocksByRange Response sent returned: 64, requested: 64, current_slot: 2466389, start_slot: 728593, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:21.068 DEBG Received BlocksByRange Request step: 1, start_slot: 728529, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:21.272 DEBG BlocksByRange Response sent returned: 63, requested: 64, current_slot: 2466389, start_slot: 728529, msg: Failed to return all requested blocks, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:23.434 DEBG Received BlocksByRange Request step: 1, start_slot: 728657, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:23.665 DEBG BlocksByRange Response sent returned: 64, requested: 64, current_slot: 2466390, start_slot: 728657, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:25.851 DEBG Received BlocksByRange Request step: 1, start_slot: 728337, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:25.851 DEBG Received BlocksByRange Request step: 1, start_slot: 728401, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:26.094 DEBG BlocksByRange Response sent returned: 62, requested: 64, current_slot: 2466390, start_slot: 728401, msg: Failed to return all requested blocks, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:26.100 DEBG BlocksByRange Response sent returned: 63, requested: 64, current_slot: 2466390, start_slot: 728337, msg: Failed to return all requested blocks, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:31.070 DEBG RPC Error direction: Incoming, score: 0, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, client: Prysm: version: a80b1c252a9b4773493b41999769bf3134ac373f, os_version: unknown, err: Stream Timeout, protocol: beacon_blocks_by_range, service: libp2p Nov 09 01:18:31.070 WARN Timed out to a peer's request. Likely insufficient resources, reduce peer count, service: libp2p Nov 09 01:18:31.085 DEBG RPC Error direction: Incoming, score: 0, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, client: Prysm: version: a80b1c252a9b4773493b41999769bf3134ac373f, os_version: unknown, err: Stream Timeout, protocol: beacon_blocks_by_range, service: libp2p Nov 09 01:18:31.085 WARN Timed out to a peer's request. Likely insufficient resources, reduce peer count, service: libp2p Nov 09 01:18:31.459 DEBG RPC Error direction: Incoming, score: 0, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, client: Prysm: version: a80b1c252a9b4773493b41999769bf3134ac373f, os_version: unknown, err: Stream Timeout, protocol: beacon_blocks_by_range, service: libp2p Nov 09 01:18:31.459 WARN Timed out to a peer's request. Likely insufficient resources, reduce peer count, service: libp2p Nov 09 01:18:34.129 DEBG RPC Error direction: Incoming, score: 0, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, client: Prysm: version: a80b1c252a9b4773493b41999769bf3134ac373f, os_version: unknown, err: Stream Timeout, protocol: beacon_blocks_by_range, service: libp2p Nov 09 01:18:34.130 WARN Timed out to a peer's request. Likely insufficient resources, reduce peer count, service: libp2p Nov 09 01:18:35.686 DEBG Peer Manager disconnecting peer reason: Too many peers, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, service: libp2p ``` This PR is to investigate and correct the issue. ~~My current thoughts are that for some reason we are not closing the streams correctly, or fast enough, or the executor is not registering the closes and waking up.~~ - Pretty sure this is not the case, see message below for a more accurate reason. ~~I've currently added a timeout to stream closures in an attempt to force streams to close and the future to always complete.~~ I removed this
2021-11-16 03:42:25 +00:00
time::{Duration, Instant},
Stable futures (#879) * Port eth1 lib to use stable futures * Port eth1_test_rig to stable futures * Port eth1 tests to stable futures * Port genesis service to stable futures * Port genesis tests to stable futures * Port beacon_chain to stable futures * Port lcli to stable futures * Fix eth1_test_rig (#1014) * Fix lcli * Port timer to stable futures * Fix timer * Port websocket_server to stable futures * Port notifier to stable futures * Add TODOS * Update hashmap hashset to stable futures * Adds panic test to hashset delay * Port remote_beacon_node to stable futures * Fix lcli merge conflicts * Non rpc stuff compiles * protocol.rs compiles * Port websockets, timer and notifier to stable futures (#1035) * Fix lcli * Port timer to stable futures * Fix timer * Port websocket_server to stable futures * Port notifier to stable futures * Add TODOS * Port remote_beacon_node to stable futures * Partial eth2-libp2p stable future upgrade * Finished first round of fighting RPC types * Further progress towards porting eth2-libp2p adds caching to discovery * Update behaviour * RPC handler to stable futures * Update RPC to master libp2p * Network service additions * Fix the fallback transport construction (#1102) * Correct warning * Remove hashmap delay * Compiling version of eth2-libp2p * Update all crates versions * Fix conversion function and add tests (#1113) * Port validator_client to stable futures (#1114) * Add PH & MS slot clock changes * Account for genesis time * Add progress on duties refactor * Add simple is_aggregator bool to val subscription * Start work on attestation_verification.rs * Add progress on ObservedAttestations * Progress with ObservedAttestations * Fix tests * Add observed attestations to the beacon chain * Add attestation observation to processing code * Add progress on attestation verification * Add first draft of ObservedAttesters * Add more tests * Add observed attesters to beacon chain * Add observers to attestation processing * Add more attestation verification * Create ObservedAggregators map * Remove commented-out code * Add observed aggregators into chain * Add progress * Finish adding features to attestation verification * Ensure beacon chain compiles * Link attn verification into chain * Integrate new attn verification in chain * Remove old attestation processing code * Start trying to fix beacon_chain tests * Split adding into pools into two functions * Add aggregation to harness * Get test harness working again * Adjust the number of aggregators for test harness * Fix edge-case in harness * Integrate new attn processing in network * Fix compile bug in validator_client * Update validator API endpoints * Fix aggreagation in test harness * Fix enum thing * Fix attestation observation bug: * Patch failing API tests * Start adding comments to attestation verification * Remove unused attestation field * Unify "is block known" logic * Update comments * Supress fork choice errors for network processing * Add todos * Tidy * Add gossip attn tests * Disallow test harness to produce old attns * Comment out in-progress tests * Partially address pruning tests * Fix failing store test * Add aggregate tests * Add comments about which spec conditions we check * Dont re-aggregate * Split apart test harness attn production * Fix compile error in network * Make progress on commented-out test * Fix skipping attestation test * Add fork choice verification tests * Tidy attn tests, remove dead code * Remove some accidentally added code * Fix clippy lint * Rename test file * Add block tests, add cheap block proposer check * Rename block testing file * Add observed_block_producers * Tidy * Switch around block signature verification * Finish block testing * Remove gossip from signature tests * First pass of self review * Fix deviation in spec * Update test spec tags * Start moving over to hashset * Finish moving observed attesters to hashmap * Move aggregation pool over to hashmap * Make fc attn borrow again * Fix rest_api compile error * Fix missing comments * Fix monster test * Uncomment increasing slots test * Address remaining comments * Remove unsafe, use cfg test * Remove cfg test flag * Fix dodgy comment * Revert "Update hashmap hashset to stable futures" This reverts commit d432378a3cc5cd67fc29c0b15b96b886c1323554. * Revert "Adds panic test to hashset delay" This reverts commit 281502396fc5b90d9c421a309c2c056982c9525b. * Ported attestation_service * Ported duties_service * Ported fork_service * More ports * Port block_service * Minor fixes * VC compiles * Update TODOS * Borrow self where possible * Ignore aggregates that are already known. * Unify aggregator modulo logic * Fix typo in logs * Refactor validator subscription logic * Avoid reproducing selection proof * Skip HTTP call if no subscriptions * Rename DutyAndState -> DutyAndProof * Tidy logs * Print root as dbg * Fix compile errors in tests * Fix compile error in test * Re-Fix attestation and duties service * Minor fixes Co-authored-by: Paul Hauner <paul@paulhauner.com> * Network crate update to stable futures * Port account_manager to stable futures (#1121) * Port account_manager to stable futures * Run async fns in tokio environment * Port rest_api crate to stable futures (#1118) * Port rest_api lib to stable futures * Reduce tokio features * Update notifier to stable futures * Builder update * Further updates * Convert self referential async functions * stable futures fixes (#1124) * Fix eth1 update functions * Fix genesis and client * Fix beacon node lib * Return appropriate runtimes from environment * Fix test rig * Refactor eth1 service update * Upgrade simulator to stable futures * Lighthouse compiles on stable futures * Remove println debugging statement * Update libp2p service, start rpc test upgrade * Update network crate for new libp2p * Update tokio::codec to futures_codec (#1128) * Further work towards RPC corrections * Correct http timeout and network service select * Use tokio runtime for libp2p * Revert "Update tokio::codec to futures_codec (#1128)" This reverts commit e57aea924acf5cbabdcea18895ac07e38a425ed7. * Upgrade RPC libp2p tests * Upgrade secio fallback test * Upgrade gossipsub examples * Clean up RPC protocol * Test fixes (#1133) * Correct websocket timeout and run on os thread * Fix network test * Clean up PR * Correct tokio tcp move attestation service tests * Upgrade attestation service tests * Correct network test * Correct genesis test * Test corrections * Log info when block is received * Modify logs and update attester service events * Stable futures: fixes to vc, eth1 and account manager (#1142) * Add local testnet scripts * Remove whiteblock script * Rename local testnet script * Move spawns onto handle * Fix VC panic * Initial fix to block production issue * Tidy block producer fix * Tidy further * Add local testnet clean script * Run cargo fmt * Tidy duties service * Tidy fork service * Tidy ForkService * Tidy AttestationService * Tidy notifier * Ensure await is not suppressed in eth1 * Ensure await is not suppressed in account_manager * Use .ok() instead of .unwrap_or(()) * RPC decoding test for proto * Update discv5 and eth2-libp2p deps * Fix lcli double runtime issue (#1144) * Handle stream termination and dialing peer errors * Correct peer_info variant types * Remove unnecessary warnings * Handle subnet unsubscription removal and improve logigng * Add logs around ping * Upgrade discv5 and improve logging * Handle peer connection status for multiple connections * Improve network service logging * Improve logging around peer manager * Upgrade swarm poll centralise peer management * Identify clients on error * Fix `remove_peer` in sync (#1150) * remove_peer removes from all chains * Remove logs * Fix early return from loop * Improved logging, fix panic * Partially correct tests * Stable futures: Vc sync (#1149) * Improve syncing heuristic * Add comments * Use safer method for tolerance * Fix tests * Stable futures: Fix VC bug, update agg pool, add more metrics (#1151) * Expose epoch processing summary * Expose participation metrics to prometheus * Switch to f64 * Reduce precision * Change precision * Expose observed attesters metrics * Add metrics for agg/unagg attn counts * Add metrics for gossip rx * Add metrics for gossip tx * Adds ignored attns to prom * Add attestation timing * Add timer for aggregation pool sig agg * Add write lock timer for agg pool * Add more metrics to agg pool * Change map lock code * Add extra metric to agg pool * Change lock handling in agg pool * Change .write() to .read() * Add another agg pool timer * Fix for is_aggregator * Fix pruning bug Co-authored-by: pawan <pawandhananjay@gmail.com> Co-authored-by: Paul Hauner <paul@paulhauner.com>
2020-05-17 11:16:48 +00:00
};
use tokio::time::{sleep_until, Instant as TInstant, Sleep};
use tokio_util::time::{delay_queue, DelayQueue};
use types::{EthSpec, ForkContext};
Testnet compatible network upgrade (#587) * Create libp2p instance * Change logger to stdlog * test_connection initial commit * Add gossipsub test * Delete tests in network crate * Add test module * Clean tests * Remove dependency on discovery * Working publish between 2 nodes TODO: Publish should be called just once * Working 2 peer gossipsub test with additional events * Cleanup test * Add rpc test * Star topology discovery WIP * build_nodes builds and connects n nodes. Increase nodes in gossipsub test * Add unsubscribe method and expose reference to gossipsub object for gossipsub tests * Add gossipsub message forwarding test * Fix gossipsub forward test * Test improvements * Remove discovery tests * Simplify gossipsub forward test topology * Add helper functions for topology building * Clean up tests * Update naming to new network spec * Correct ssz encoding of protocol names * Further additions to network upgrade * Initial network spec update WIP * Temp commit * Builds one side of the streamed RPC responses * Temporary commit * Propagates streaming changes up into message handler * Intermediate network update * Partial update in upgrading to the new network spec * Update dependencies, remove redundant deps * Correct sync manager for block stream handling * Re-write of RPC handler, improves efficiency and corrects bugs * Stream termination update * Completed refactor of rpc handler * Remove crates * Correct compile issues associated with test merge * Build basic tests and testing structure for eth2-libp2p * Enhance RPC tests and add logging * Complete RPC testing framework and STATUS test * Decoding bug fixes, log improvements, stream test * Clean up RPC handler logging * Decoder bug fix, empty block stream test * Add BlocksByRoot RPC test * Add Goodbye RPC test * Syncing and stream handling bug fixes and performance improvements * Applies discv5 bug fixes * Adds DHT IP filtering for lighthouse - currently disabled * Adds randomized network propagation as a CLI arg * Add disconnect functionality * Adds attestation handling and parent lookup * Adds RPC error handling for the sync manager * Allow parent's blocks to be already processed * Update github workflow * Adds reviewer suggestions
2019-11-27 01:47:46 +00:00
/// The time (in seconds) before a substream that is awaiting a response from the user times out.
pub const RESPONSE_TIMEOUT: u64 = 10;
/// The number of times to retry an outbound upgrade in the case of IO errors.
const IO_ERROR_RETRIES: u8 = 3;
/// Maximum time given to the handler to perform shutdown operations.
const SHUTDOWN_TIMEOUT_SECS: u8 = 15;
/// Identifier of inbound and outbound substreams from the handler's perspective.
#[derive(Debug, Clone, Copy, Hash, Eq, PartialEq)]
pub struct SubstreamId(usize);
type InboundSubstream<TSpec> = InboundFramed<NegotiatedSubstream, TSpec>;
More metrics + RPC tweaks (#2041) ## Issue Addressed NA ## Proposed Changes This was mostly done to find the reason why LH was dropping peers from Nimbus. It proved to be useful so I think it's worth it. But there is also some functional stuff here - Add metrics for rpc errors per client, error type and direction - Add metrics for downscoring events per source type, client and penalty type - Add metrics for gossip validation results per client for non-accepted messages - Make the RPC handler return errors and requests/responses in the order we see them - Allow a small burst for the Ping rate limit, from 1 every 5 seconds to 2 every 10 seconds - Send rate limiting errors with a particular code and use that same code to identify them. I picked something different to 128 since that is most likely what other clients are using for their own errors - Remove some unused code in the `PeerAction` and the rpc handler - Remove the unused variant `RateLimited`. tTis was never produced directly, since the only way to get the request's protocol is via de handler. The handler upon receiving from LH a response with an error (rate limited in this case) emits this event with the missing info (It was always like this, just pointing out that we do downscore rate limiting errors regardless of the change) Metrics for Nimbus looked like this: Downscoring events: `increase(libp2p_peer_actions_per_client{client="Nimbus"}[5m])` ![image](https://user-images.githubusercontent.com/26765164/101210880-862bf280-3676-11eb-94c0-399f0bf5aa2e.png) RPC Errors: `increase(libp2p_rpc_errors_per_client{client="Nimbus"}[5m])` ![image](https://user-images.githubusercontent.com/26765164/101210997-ba071800-3676-11eb-847a-f32405ede002.png) Unaccepted gossip message: `increase(gossipsub_unaccepted_messages_per_client{client="Nimbus"}[5m])` ![image](https://user-images.githubusercontent.com/26765164/101211124-f470b500-3676-11eb-9459-132ecff058ec.png)
2020-12-08 03:55:50 +00:00
/// Events the handler emits to the behaviour.
type HandlerEvent<T> = Result<RPCReceived<T>, HandlerErr>;
/// An error encountered by the handler.
#[derive(Debug)]
pub enum HandlerErr {
/// An error occurred for this peer's request. This can occur during protocol negotiation,
/// message passing, or if the handler identifies that we are sending an error response to the peer.
Inbound {
/// Id of the peer's request for which an error occurred.
id: SubstreamId,
/// Information of the negotiated protocol.
proto: Protocol,
/// The error that occurred.
error: RPCError,
},
/// An error occurred for this request. Such error can occur during protocol negotiation,
/// message passing, or if we successfully received a response from the peer, but this response
/// indicates an error.
Outbound {
/// Application-given Id of the request for which an error occurred.
id: RequestId,
/// Information of the protocol.
proto: Protocol,
/// The error that occurred.
error: RPCError,
},
}
Testnet compatible network upgrade (#587) * Create libp2p instance * Change logger to stdlog * test_connection initial commit * Add gossipsub test * Delete tests in network crate * Add test module * Clean tests * Remove dependency on discovery * Working publish between 2 nodes TODO: Publish should be called just once * Working 2 peer gossipsub test with additional events * Cleanup test * Add rpc test * Star topology discovery WIP * build_nodes builds and connects n nodes. Increase nodes in gossipsub test * Add unsubscribe method and expose reference to gossipsub object for gossipsub tests * Add gossipsub message forwarding test * Fix gossipsub forward test * Test improvements * Remove discovery tests * Simplify gossipsub forward test topology * Add helper functions for topology building * Clean up tests * Update naming to new network spec * Correct ssz encoding of protocol names * Further additions to network upgrade * Initial network spec update WIP * Temp commit * Builds one side of the streamed RPC responses * Temporary commit * Propagates streaming changes up into message handler * Intermediate network update * Partial update in upgrading to the new network spec * Update dependencies, remove redundant deps * Correct sync manager for block stream handling * Re-write of RPC handler, improves efficiency and corrects bugs * Stream termination update * Completed refactor of rpc handler * Remove crates * Correct compile issues associated with test merge * Build basic tests and testing structure for eth2-libp2p * Enhance RPC tests and add logging * Complete RPC testing framework and STATUS test * Decoding bug fixes, log improvements, stream test * Clean up RPC handler logging * Decoder bug fix, empty block stream test * Add BlocksByRoot RPC test * Add Goodbye RPC test * Syncing and stream handling bug fixes and performance improvements * Applies discv5 bug fixes * Adds DHT IP filtering for lighthouse - currently disabled * Adds randomized network propagation as a CLI arg * Add disconnect functionality * Adds attestation handling and parent lookup * Adds RPC error handling for the sync manager * Allow parent's blocks to be already processed * Update github workflow * Adds reviewer suggestions
2019-11-27 01:47:46 +00:00
/// Implementation of `ProtocolsHandler` for the RPC protocol.
Stable futures (#879) * Port eth1 lib to use stable futures * Port eth1_test_rig to stable futures * Port eth1 tests to stable futures * Port genesis service to stable futures * Port genesis tests to stable futures * Port beacon_chain to stable futures * Port lcli to stable futures * Fix eth1_test_rig (#1014) * Fix lcli * Port timer to stable futures * Fix timer * Port websocket_server to stable futures * Port notifier to stable futures * Add TODOS * Update hashmap hashset to stable futures * Adds panic test to hashset delay * Port remote_beacon_node to stable futures * Fix lcli merge conflicts * Non rpc stuff compiles * protocol.rs compiles * Port websockets, timer and notifier to stable futures (#1035) * Fix lcli * Port timer to stable futures * Fix timer * Port websocket_server to stable futures * Port notifier to stable futures * Add TODOS * Port remote_beacon_node to stable futures * Partial eth2-libp2p stable future upgrade * Finished first round of fighting RPC types * Further progress towards porting eth2-libp2p adds caching to discovery * Update behaviour * RPC handler to stable futures * Update RPC to master libp2p * Network service additions * Fix the fallback transport construction (#1102) * Correct warning * Remove hashmap delay * Compiling version of eth2-libp2p * Update all crates versions * Fix conversion function and add tests (#1113) * Port validator_client to stable futures (#1114) * Add PH & MS slot clock changes * Account for genesis time * Add progress on duties refactor * Add simple is_aggregator bool to val subscription * Start work on attestation_verification.rs * Add progress on ObservedAttestations * Progress with ObservedAttestations * Fix tests * Add observed attestations to the beacon chain * Add attestation observation to processing code * Add progress on attestation verification * Add first draft of ObservedAttesters * Add more tests * Add observed attesters to beacon chain * Add observers to attestation processing * Add more attestation verification * Create ObservedAggregators map * Remove commented-out code * Add observed aggregators into chain * Add progress * Finish adding features to attestation verification * Ensure beacon chain compiles * Link attn verification into chain * Integrate new attn verification in chain * Remove old attestation processing code * Start trying to fix beacon_chain tests * Split adding into pools into two functions * Add aggregation to harness * Get test harness working again * Adjust the number of aggregators for test harness * Fix edge-case in harness * Integrate new attn processing in network * Fix compile bug in validator_client * Update validator API endpoints * Fix aggreagation in test harness * Fix enum thing * Fix attestation observation bug: * Patch failing API tests * Start adding comments to attestation verification * Remove unused attestation field * Unify "is block known" logic * Update comments * Supress fork choice errors for network processing * Add todos * Tidy * Add gossip attn tests * Disallow test harness to produce old attns * Comment out in-progress tests * Partially address pruning tests * Fix failing store test * Add aggregate tests * Add comments about which spec conditions we check * Dont re-aggregate * Split apart test harness attn production * Fix compile error in network * Make progress on commented-out test * Fix skipping attestation test * Add fork choice verification tests * Tidy attn tests, remove dead code * Remove some accidentally added code * Fix clippy lint * Rename test file * Add block tests, add cheap block proposer check * Rename block testing file * Add observed_block_producers * Tidy * Switch around block signature verification * Finish block testing * Remove gossip from signature tests * First pass of self review * Fix deviation in spec * Update test spec tags * Start moving over to hashset * Finish moving observed attesters to hashmap * Move aggregation pool over to hashmap * Make fc attn borrow again * Fix rest_api compile error * Fix missing comments * Fix monster test * Uncomment increasing slots test * Address remaining comments * Remove unsafe, use cfg test * Remove cfg test flag * Fix dodgy comment * Revert "Update hashmap hashset to stable futures" This reverts commit d432378a3cc5cd67fc29c0b15b96b886c1323554. * Revert "Adds panic test to hashset delay" This reverts commit 281502396fc5b90d9c421a309c2c056982c9525b. * Ported attestation_service * Ported duties_service * Ported fork_service * More ports * Port block_service * Minor fixes * VC compiles * Update TODOS * Borrow self where possible * Ignore aggregates that are already known. * Unify aggregator modulo logic * Fix typo in logs * Refactor validator subscription logic * Avoid reproducing selection proof * Skip HTTP call if no subscriptions * Rename DutyAndState -> DutyAndProof * Tidy logs * Print root as dbg * Fix compile errors in tests * Fix compile error in test * Re-Fix attestation and duties service * Minor fixes Co-authored-by: Paul Hauner <paul@paulhauner.com> * Network crate update to stable futures * Port account_manager to stable futures (#1121) * Port account_manager to stable futures * Run async fns in tokio environment * Port rest_api crate to stable futures (#1118) * Port rest_api lib to stable futures * Reduce tokio features * Update notifier to stable futures * Builder update * Further updates * Convert self referential async functions * stable futures fixes (#1124) * Fix eth1 update functions * Fix genesis and client * Fix beacon node lib * Return appropriate runtimes from environment * Fix test rig * Refactor eth1 service update * Upgrade simulator to stable futures * Lighthouse compiles on stable futures * Remove println debugging statement * Update libp2p service, start rpc test upgrade * Update network crate for new libp2p * Update tokio::codec to futures_codec (#1128) * Further work towards RPC corrections * Correct http timeout and network service select * Use tokio runtime for libp2p * Revert "Update tokio::codec to futures_codec (#1128)" This reverts commit e57aea924acf5cbabdcea18895ac07e38a425ed7. * Upgrade RPC libp2p tests * Upgrade secio fallback test * Upgrade gossipsub examples * Clean up RPC protocol * Test fixes (#1133) * Correct websocket timeout and run on os thread * Fix network test * Clean up PR * Correct tokio tcp move attestation service tests * Upgrade attestation service tests * Correct network test * Correct genesis test * Test corrections * Log info when block is received * Modify logs and update attester service events * Stable futures: fixes to vc, eth1 and account manager (#1142) * Add local testnet scripts * Remove whiteblock script * Rename local testnet script * Move spawns onto handle * Fix VC panic * Initial fix to block production issue * Tidy block producer fix * Tidy further * Add local testnet clean script * Run cargo fmt * Tidy duties service * Tidy fork service * Tidy ForkService * Tidy AttestationService * Tidy notifier * Ensure await is not suppressed in eth1 * Ensure await is not suppressed in account_manager * Use .ok() instead of .unwrap_or(()) * RPC decoding test for proto * Update discv5 and eth2-libp2p deps * Fix lcli double runtime issue (#1144) * Handle stream termination and dialing peer errors * Correct peer_info variant types * Remove unnecessary warnings * Handle subnet unsubscription removal and improve logigng * Add logs around ping * Upgrade discv5 and improve logging * Handle peer connection status for multiple connections * Improve network service logging * Improve logging around peer manager * Upgrade swarm poll centralise peer management * Identify clients on error * Fix `remove_peer` in sync (#1150) * remove_peer removes from all chains * Remove logs * Fix early return from loop * Improved logging, fix panic * Partially correct tests * Stable futures: Vc sync (#1149) * Improve syncing heuristic * Add comments * Use safer method for tolerance * Fix tests * Stable futures: Fix VC bug, update agg pool, add more metrics (#1151) * Expose epoch processing summary * Expose participation metrics to prometheus * Switch to f64 * Reduce precision * Change precision * Expose observed attesters metrics * Add metrics for agg/unagg attn counts * Add metrics for gossip rx * Add metrics for gossip tx * Adds ignored attns to prom * Add attestation timing * Add timer for aggregation pool sig agg * Add write lock timer for agg pool * Add more metrics to agg pool * Change map lock code * Add extra metric to agg pool * Change lock handling in agg pool * Change .write() to .read() * Add another agg pool timer * Fix for is_aggregator * Fix pruning bug Co-authored-by: pawan <pawandhananjay@gmail.com> Co-authored-by: Paul Hauner <paul@paulhauner.com>
2020-05-17 11:16:48 +00:00
pub struct RPCHandler<TSpec>
where
Initial work towards v0.2.0 (#924) * Remove ping protocol * Initial renaming of network services * Correct rebasing relative to latest master * Start updating types * Adds HashMapDelay struct to utils * Initial network restructure * Network restructure. Adds new types for v0.2.0 * Removes build artefacts * Shift validation to beacon chain * Temporarily remove gossip validation This is to be updated to match current optimisation efforts. * Adds AggregateAndProof * Begin rebuilding pubsub encoding/decoding * Signature hacking * Shift gossipsup decoding into eth2_libp2p * Existing EF tests passing with fake_crypto * Shifts block encoding/decoding into RPC * Delete outdated API spec * All release tests passing bar genesis state parsing * Update and test YamlConfig * Update to spec v0.10 compatible BLS * Updates to BLS EF tests * Add EF test for AggregateVerify And delete unused hash2curve tests for uncompressed points * Update EF tests to v0.10.1 * Use optional block root correctly in block proc * Use genesis fork in deposit domain. All tests pass * Fast aggregate verify test * Update REST API docs * Fix unused import * Bump spec tags to v0.10.1 * Add `seconds_per_eth1_block` to chainspec * Update to timestamp based eth1 voting scheme * Return None from `get_votes_to_consider` if block cache is empty * Handle overflows in `is_candidate_block` * Revert to failing tests * Fix eth1 data sets test * Choose default vote according to spec * Fix collect_valid_votes tests * Fix `get_votes_to_consider` to choose all eligible blocks * Uncomment winning_vote tests * Add comments; remove unused code * Reduce seconds_per_eth1_block for simulation * Addressed review comments * Add test for default vote case * Fix logs * Remove unused functions * Meter default eth1 votes * Fix comments * Progress on attestation service * Address review comments; remove unused dependency * Initial work on removing libp2p lock * Add LRU caches to store (rollup) * Update attestation validation for DB changes (WIP) * Initial version of should_forward_block * Scaffold * Progress on attestation validation Also, consolidate prod+testing slot clocks so that they share much of the same implementation and can both handle sub-slot time changes. * Removes lock from libp2p service * Completed network lock removal * Finish(?) attestation processing * Correct network termination future * Add slot check to block check * Correct fmt issues * Remove Drop implementation for network service * Add first attempt at attestation proc. re-write * Add version 2 of attestation processing * Minor fixes * Add validator pubkey cache * Make get_indexed_attestation take a committee * Link signature processing into new attn verification * First working version * Ensure pubkey cache is updated * Add more metrics, slight optimizations * Clone committee cache during attestation processing * Update shuffling cache during block processing * Remove old commented-out code * Fix shuffling cache insert bug * Used indexed attestation in fork choice * Restructure attn processing, add metrics * Add more detailed metrics * Tidy, fix failing tests * Fix failing tests, tidy * Address reviewers suggestions * Disable/delete two outdated tests * Modification of validator for subscriptions * Add slot signing to validator client * Further progress on validation subscription * Adds necessary validator subscription functionality * Add new Pubkeys struct to signature_sets * Refactor with functional approach * Update beacon chain * Clean up validator <-> beacon node http types * Add aggregator status to ValidatorDuty * Impl Clone for manual slot clock * Fix minor errors * Further progress validator client subscription * Initial subscription and aggregation handling * Remove decompressed member from pubkey bytes * Progress to modifying val client for attestation aggregation * First draft of validator client upgrade for aggregate attestations * Add hashmap for indices lookup * Add state cache, remove store cache * Only build the head committee cache * Removes lock on a network channel * Partially implement beacon node subscription http api * Correct compilation issues * Change `get_attesting_indices` to use Vec * Fix failing test * Partial implementation of timer * Adds timer, removes exit_future, http api to op pool * Partial multiple aggregate attestation handling * Permits bulk messages accross gossipsub network channel * Correct compile issues * Improve gosispsub messaging and correct rest api helpers * Added global gossipsub subscriptions * Update validator subscriptions data structs * Tidy * Re-structure validator subscriptions * Initial handling of subscriptions * Re-structure network service * Add pubkey cache persistence file * Add more comments * Integrate persistence file into builder * Add pubkey cache tests * Add HashSetDelay and introduce into attestation service * Handles validator subscriptions * Add data_dir to beacon chain builder * Remove Option in pubkey cache persistence file * Ensure consistency between datadir/data_dir * Fix failing network test * Peer subnet discovery gets queued for future subscriptions * Reorganise attestation service functions * Initial wiring of attestation service * First draft of attestation service timing logic * Correct minor typos * Tidy * Fix todos * Improve tests * Add PeerInfo to connected peers mapping * Fix compile error * Fix compile error from merge * Split up block processing metrics * Tidy * Refactor get_pubkey_from_state * Remove commented-out code * Rename state_cache -> checkpoint_cache * Rename Checkpoint -> Snapshot * Tidy, add comments * Tidy up find_head function * Change some checkpoint -> snapshot * Add tests * Expose max_len * Remove dead code * Tidy * Fix bug * Add sync-speed metric * Add first attempt at VerifiableBlock * Start integrating into beacon chain * Integrate VerifiableBlock * Rename VerifableBlock -> PartialBlockVerification * Add start of typed methods * Add progress * Add further progress * Rename structs * Add full block verification to block_processing.rs * Further beacon chain integration * Update checks for gossip * Add todo * Start adding segement verification * Add passing chain segement test * Initial integration with batch sync * Minor changes * Tidy, add more error checking * Start adding chain_segment tests * Finish invalid signature tests * Include single and gossip verified blocks in tests * Add gossip verification tests * Start adding docs * Finish adding comments to block_processing.rs * Rename block_processing.rs -> block_verification * Start removing old block processing code * Fixes beacon_chain compilation * Fix project-wide compile errors * Remove old code * Correct code to pass all tests * Fix bug with beacon proposer index * Fix shim for BlockProcessingError * Only process one epoch at a time * Fix loop in chain segment processing * Correct tests from master merge * Add caching for state.eth1_data_votes * Add BeaconChain::validator_pubkey * Revert "Add caching for state.eth1_data_votes" This reverts commit cd73dcd6434fb8d8e6bf30c5356355598ea7b78e. Co-authored-by: Grant Wuerker <gwuerker@gmail.com> Co-authored-by: Michael Sproul <michael@sigmaprime.io> Co-authored-by: Michael Sproul <micsproul@gmail.com> Co-authored-by: pawan <pawandhananjay@gmail.com> Co-authored-by: Paul Hauner <paul@paulhauner.com>
2020-03-17 06:24:44 +00:00
TSpec: EthSpec,
{
/// The upgrade for inbound substreams.
listen_protocol: SubstreamProtocol<RPCProtocol<TSpec>, ()>,
/// Queue of events to produce in `poll()`.
More metrics + RPC tweaks (#2041) ## Issue Addressed NA ## Proposed Changes This was mostly done to find the reason why LH was dropping peers from Nimbus. It proved to be useful so I think it's worth it. But there is also some functional stuff here - Add metrics for rpc errors per client, error type and direction - Add metrics for downscoring events per source type, client and penalty type - Add metrics for gossip validation results per client for non-accepted messages - Make the RPC handler return errors and requests/responses in the order we see them - Allow a small burst for the Ping rate limit, from 1 every 5 seconds to 2 every 10 seconds - Send rate limiting errors with a particular code and use that same code to identify them. I picked something different to 128 since that is most likely what other clients are using for their own errors - Remove some unused code in the `PeerAction` and the rpc handler - Remove the unused variant `RateLimited`. tTis was never produced directly, since the only way to get the request's protocol is via de handler. The handler upon receiving from LH a response with an error (rate limited in this case) emits this event with the missing info (It was always like this, just pointing out that we do downscore rate limiting errors regardless of the change) Metrics for Nimbus looked like this: Downscoring events: `increase(libp2p_peer_actions_per_client{client="Nimbus"}[5m])` ![image](https://user-images.githubusercontent.com/26765164/101210880-862bf280-3676-11eb-94c0-399f0bf5aa2e.png) RPC Errors: `increase(libp2p_rpc_errors_per_client{client="Nimbus"}[5m])` ![image](https://user-images.githubusercontent.com/26765164/101210997-ba071800-3676-11eb-847a-f32405ede002.png) Unaccepted gossip message: `increase(gossipsub_unaccepted_messages_per_client{client="Nimbus"}[5m])` ![image](https://user-images.githubusercontent.com/26765164/101211124-f470b500-3676-11eb-9459-132ecff058ec.png)
2020-12-08 03:55:50 +00:00
events_out: SmallVec<[HandlerEvent<TSpec>; 4]>,
/// Queue of outbound substreams to open.
dial_queue: SmallVec<[(RequestId, OutboundRequest<TSpec>); 4]>,
/// Current number of concurrent outbound substreams being opened.
dial_negotiated: u32,
Testnet compatible network upgrade (#587) * Create libp2p instance * Change logger to stdlog * test_connection initial commit * Add gossipsub test * Delete tests in network crate * Add test module * Clean tests * Remove dependency on discovery * Working publish between 2 nodes TODO: Publish should be called just once * Working 2 peer gossipsub test with additional events * Cleanup test * Add rpc test * Star topology discovery WIP * build_nodes builds and connects n nodes. Increase nodes in gossipsub test * Add unsubscribe method and expose reference to gossipsub object for gossipsub tests * Add gossipsub message forwarding test * Fix gossipsub forward test * Test improvements * Remove discovery tests * Simplify gossipsub forward test topology * Add helper functions for topology building * Clean up tests * Update naming to new network spec * Correct ssz encoding of protocol names * Further additions to network upgrade * Initial network spec update WIP * Temp commit * Builds one side of the streamed RPC responses * Temporary commit * Propagates streaming changes up into message handler * Intermediate network update * Partial update in upgrading to the new network spec * Update dependencies, remove redundant deps * Correct sync manager for block stream handling * Re-write of RPC handler, improves efficiency and corrects bugs * Stream termination update * Completed refactor of rpc handler * Remove crates * Correct compile issues associated with test merge * Build basic tests and testing structure for eth2-libp2p * Enhance RPC tests and add logging * Complete RPC testing framework and STATUS test * Decoding bug fixes, log improvements, stream test * Clean up RPC handler logging * Decoder bug fix, empty block stream test * Add BlocksByRoot RPC test * Add Goodbye RPC test * Syncing and stream handling bug fixes and performance improvements * Applies discv5 bug fixes * Adds DHT IP filtering for lighthouse - currently disabled * Adds randomized network propagation as a CLI arg * Add disconnect functionality * Adds attestation handling and parent lookup * Adds RPC error handling for the sync manager * Allow parent's blocks to be already processed * Update github workflow * Adds reviewer suggestions
2019-11-27 01:47:46 +00:00
/// Current inbound substreams awaiting processing.
inbound_substreams: FnvHashMap<SubstreamId, InboundInfo<TSpec>>,
Testnet compatible network upgrade (#587) * Create libp2p instance * Change logger to stdlog * test_connection initial commit * Add gossipsub test * Delete tests in network crate * Add test module * Clean tests * Remove dependency on discovery * Working publish between 2 nodes TODO: Publish should be called just once * Working 2 peer gossipsub test with additional events * Cleanup test * Add rpc test * Star topology discovery WIP * build_nodes builds and connects n nodes. Increase nodes in gossipsub test * Add unsubscribe method and expose reference to gossipsub object for gossipsub tests * Add gossipsub message forwarding test * Fix gossipsub forward test * Test improvements * Remove discovery tests * Simplify gossipsub forward test topology * Add helper functions for topology building * Clean up tests * Update naming to new network spec * Correct ssz encoding of protocol names * Further additions to network upgrade * Initial network spec update WIP * Temp commit * Builds one side of the streamed RPC responses * Temporary commit * Propagates streaming changes up into message handler * Intermediate network update * Partial update in upgrading to the new network spec * Update dependencies, remove redundant deps * Correct sync manager for block stream handling * Re-write of RPC handler, improves efficiency and corrects bugs * Stream termination update * Completed refactor of rpc handler * Remove crates * Correct compile issues associated with test merge * Build basic tests and testing structure for eth2-libp2p * Enhance RPC tests and add logging * Complete RPC testing framework and STATUS test * Decoding bug fixes, log improvements, stream test * Clean up RPC handler logging * Decoder bug fix, empty block stream test * Add BlocksByRoot RPC test * Add Goodbye RPC test * Syncing and stream handling bug fixes and performance improvements * Applies discv5 bug fixes * Adds DHT IP filtering for lighthouse - currently disabled * Adds randomized network propagation as a CLI arg * Add disconnect functionality * Adds attestation handling and parent lookup * Adds RPC error handling for the sync manager * Allow parent's blocks to be already processed * Update github workflow * Adds reviewer suggestions
2019-11-27 01:47:46 +00:00
/// Inbound substream `DelayQueue` which keeps track of when an inbound substream will timeout.
inbound_substreams_delay: DelayQueue<SubstreamId>,
Testnet compatible network upgrade (#587) * Create libp2p instance * Change logger to stdlog * test_connection initial commit * Add gossipsub test * Delete tests in network crate * Add test module * Clean tests * Remove dependency on discovery * Working publish between 2 nodes TODO: Publish should be called just once * Working 2 peer gossipsub test with additional events * Cleanup test * Add rpc test * Star topology discovery WIP * build_nodes builds and connects n nodes. Increase nodes in gossipsub test * Add unsubscribe method and expose reference to gossipsub object for gossipsub tests * Add gossipsub message forwarding test * Fix gossipsub forward test * Test improvements * Remove discovery tests * Simplify gossipsub forward test topology * Add helper functions for topology building * Clean up tests * Update naming to new network spec * Correct ssz encoding of protocol names * Further additions to network upgrade * Initial network spec update WIP * Temp commit * Builds one side of the streamed RPC responses * Temporary commit * Propagates streaming changes up into message handler * Intermediate network update * Partial update in upgrading to the new network spec * Update dependencies, remove redundant deps * Correct sync manager for block stream handling * Re-write of RPC handler, improves efficiency and corrects bugs * Stream termination update * Completed refactor of rpc handler * Remove crates * Correct compile issues associated with test merge * Build basic tests and testing structure for eth2-libp2p * Enhance RPC tests and add logging * Complete RPC testing framework and STATUS test * Decoding bug fixes, log improvements, stream test * Clean up RPC handler logging * Decoder bug fix, empty block stream test * Add BlocksByRoot RPC test * Add Goodbye RPC test * Syncing and stream handling bug fixes and performance improvements * Applies discv5 bug fixes * Adds DHT IP filtering for lighthouse - currently disabled * Adds randomized network propagation as a CLI arg * Add disconnect functionality * Adds attestation handling and parent lookup * Adds RPC error handling for the sync manager * Allow parent's blocks to be already processed * Update github workflow * Adds reviewer suggestions
2019-11-27 01:47:46 +00:00
/// Map of outbound substreams that need to be driven to completion.
outbound_substreams: FnvHashMap<SubstreamId, OutboundInfo<TSpec>>,
Testnet compatible network upgrade (#587) * Create libp2p instance * Change logger to stdlog * test_connection initial commit * Add gossipsub test * Delete tests in network crate * Add test module * Clean tests * Remove dependency on discovery * Working publish between 2 nodes TODO: Publish should be called just once * Working 2 peer gossipsub test with additional events * Cleanup test * Add rpc test * Star topology discovery WIP * build_nodes builds and connects n nodes. Increase nodes in gossipsub test * Add unsubscribe method and expose reference to gossipsub object for gossipsub tests * Add gossipsub message forwarding test * Fix gossipsub forward test * Test improvements * Remove discovery tests * Simplify gossipsub forward test topology * Add helper functions for topology building * Clean up tests * Update naming to new network spec * Correct ssz encoding of protocol names * Further additions to network upgrade * Initial network spec update WIP * Temp commit * Builds one side of the streamed RPC responses * Temporary commit * Propagates streaming changes up into message handler * Intermediate network update * Partial update in upgrading to the new network spec * Update dependencies, remove redundant deps * Correct sync manager for block stream handling * Re-write of RPC handler, improves efficiency and corrects bugs * Stream termination update * Completed refactor of rpc handler * Remove crates * Correct compile issues associated with test merge * Build basic tests and testing structure for eth2-libp2p * Enhance RPC tests and add logging * Complete RPC testing framework and STATUS test * Decoding bug fixes, log improvements, stream test * Clean up RPC handler logging * Decoder bug fix, empty block stream test * Add BlocksByRoot RPC test * Add Goodbye RPC test * Syncing and stream handling bug fixes and performance improvements * Applies discv5 bug fixes * Adds DHT IP filtering for lighthouse - currently disabled * Adds randomized network propagation as a CLI arg * Add disconnect functionality * Adds attestation handling and parent lookup * Adds RPC error handling for the sync manager * Allow parent's blocks to be already processed * Update github workflow * Adds reviewer suggestions
2019-11-27 01:47:46 +00:00
/// Inbound substream `DelayQueue` which keeps track of when an inbound substream will timeout.
outbound_substreams_delay: DelayQueue<SubstreamId>,
Testnet compatible network upgrade (#587) * Create libp2p instance * Change logger to stdlog * test_connection initial commit * Add gossipsub test * Delete tests in network crate * Add test module * Clean tests * Remove dependency on discovery * Working publish between 2 nodes TODO: Publish should be called just once * Working 2 peer gossipsub test with additional events * Cleanup test * Add rpc test * Star topology discovery WIP * build_nodes builds and connects n nodes. Increase nodes in gossipsub test * Add unsubscribe method and expose reference to gossipsub object for gossipsub tests * Add gossipsub message forwarding test * Fix gossipsub forward test * Test improvements * Remove discovery tests * Simplify gossipsub forward test topology * Add helper functions for topology building * Clean up tests * Update naming to new network spec * Correct ssz encoding of protocol names * Further additions to network upgrade * Initial network spec update WIP * Temp commit * Builds one side of the streamed RPC responses * Temporary commit * Propagates streaming changes up into message handler * Intermediate network update * Partial update in upgrading to the new network spec * Update dependencies, remove redundant deps * Correct sync manager for block stream handling * Re-write of RPC handler, improves efficiency and corrects bugs * Stream termination update * Completed refactor of rpc handler * Remove crates * Correct compile issues associated with test merge * Build basic tests and testing structure for eth2-libp2p * Enhance RPC tests and add logging * Complete RPC testing framework and STATUS test * Decoding bug fixes, log improvements, stream test * Clean up RPC handler logging * Decoder bug fix, empty block stream test * Add BlocksByRoot RPC test * Add Goodbye RPC test * Syncing and stream handling bug fixes and performance improvements * Applies discv5 bug fixes * Adds DHT IP filtering for lighthouse - currently disabled * Adds randomized network propagation as a CLI arg * Add disconnect functionality * Adds attestation handling and parent lookup * Adds RPC error handling for the sync manager * Allow parent's blocks to be already processed * Update github workflow * Adds reviewer suggestions
2019-11-27 01:47:46 +00:00
/// Sequential ID for waiting substreams. For inbound substreams, this is also the inbound request ID.
current_inbound_substream_id: SubstreamId,
/// Sequential ID for outbound substreams.
current_outbound_substream_id: SubstreamId,
/// Maximum number of concurrent outbound substreams being opened. Value is never modified.
max_dial_negotiated: u32,
/// State of the handler.
state: HandlerState,
/// Try to negotiate the outbound upgrade a few times if there is an IO error before reporting the request as failed.
/// This keeps track of the number of attempts.
outbound_io_error_retries: u8,
/// Fork specific info.
fork_context: Arc<ForkContext>,
/// Waker, to be sure the handler gets polled when needed.
waker: Option<std::task::Waker>,
Testnet compatible network upgrade (#587) * Create libp2p instance * Change logger to stdlog * test_connection initial commit * Add gossipsub test * Delete tests in network crate * Add test module * Clean tests * Remove dependency on discovery * Working publish between 2 nodes TODO: Publish should be called just once * Working 2 peer gossipsub test with additional events * Cleanup test * Add rpc test * Star topology discovery WIP * build_nodes builds and connects n nodes. Increase nodes in gossipsub test * Add unsubscribe method and expose reference to gossipsub object for gossipsub tests * Add gossipsub message forwarding test * Fix gossipsub forward test * Test improvements * Remove discovery tests * Simplify gossipsub forward test topology * Add helper functions for topology building * Clean up tests * Update naming to new network spec * Correct ssz encoding of protocol names * Further additions to network upgrade * Initial network spec update WIP * Temp commit * Builds one side of the streamed RPC responses * Temporary commit * Propagates streaming changes up into message handler * Intermediate network update * Partial update in upgrading to the new network spec * Update dependencies, remove redundant deps * Correct sync manager for block stream handling * Re-write of RPC handler, improves efficiency and corrects bugs * Stream termination update * Completed refactor of rpc handler * Remove crates * Correct compile issues associated with test merge * Build basic tests and testing structure for eth2-libp2p * Enhance RPC tests and add logging * Complete RPC testing framework and STATUS test * Decoding bug fixes, log improvements, stream test * Clean up RPC handler logging * Decoder bug fix, empty block stream test * Add BlocksByRoot RPC test * Add Goodbye RPC test * Syncing and stream handling bug fixes and performance improvements * Applies discv5 bug fixes * Adds DHT IP filtering for lighthouse - currently disabled * Adds randomized network propagation as a CLI arg * Add disconnect functionality * Adds attestation handling and parent lookup * Adds RPC error handling for the sync manager * Allow parent's blocks to be already processed * Update github workflow * Adds reviewer suggestions
2019-11-27 01:47:46 +00:00
/// Logger for handling RPC streams
log: slog::Logger,
}
enum HandlerState {
/// The handler is active. All messages are sent and received.
Active,
/// The handler is shutting_down.
///
/// While in this state the handler rejects new requests but tries to finish existing ones.
/// Once the timer expires, all messages are killed.
Update to tokio 1.1 (#2172) ## Issue Addressed resolves #2129 resolves #2099 addresses some of #1712 unblocks #2076 unblocks #2153 ## Proposed Changes - Updates all the dependencies mentioned in #2129, except for web3. They haven't merged their tokio 1.0 update because they are waiting on some dependencies of their own. Since we only use web3 in tests, I think updating it in a separate issue is fine. If they are able to merge soon though, I can update in this PR. - Updates `tokio_util` to 0.6.2 and `bytes` to 1.0.1. - We haven't made a discv5 release since merging tokio 1.0 updates so I'm using a commit rather than release atm. **Edit:** I think we should merge an update of `tokio_util` to 0.6.2 into discv5 before this release because it has panic fixes in `DelayQueue` --> PR in discv5: https://github.com/sigp/discv5/pull/58 ## Additional Info tokio 1.0 changes that required some changes in lighthouse: - `interval.next().await.is_some()` -> `interval.tick().await` - `sleep` future is now `!Unpin` -> https://github.com/tokio-rs/tokio/issues/3028 - `try_recv` has been temporarily removed from `mpsc` -> https://github.com/tokio-rs/tokio/issues/3350 - stream features have moved to `tokio-stream` and `broadcast::Receiver::into_stream()` has been temporarily removed -> `https://github.com/tokio-rs/tokio/issues/2870 - I've copied over the `BroadcastStream` wrapper from this PR, but can update to use `tokio-stream` once it's merged https://github.com/tokio-rs/tokio/pull/3384 Co-authored-by: realbigsean <seananderson33@gmail.com>
2021-02-10 23:29:49 +00:00
ShuttingDown(Box<Sleep>),
/// The handler is deactivated. A goodbye has been sent and no more messages are sent or
/// received.
Deactivated,
}
/// Contains the information the handler keeps on established inbound substreams.
struct InboundInfo<TSpec: EthSpec> {
/// State of the substream.
state: InboundState<TSpec>,
/// Responses queued for sending.
Investigate and correct RPC Response Timeouts (#2804) RPC Responses are for some reason not removing their timeout when they are completing. As an example: ``` Nov 09 01:18:20.256 DEBG Received BlocksByRange Request step: 1, start_slot: 728465, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:20.263 DEBG Received BlocksByRange Request step: 1, start_slot: 728593, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:20.483 DEBG BlocksByRange Response sent returned: 63, requested: 64, current_slot: 2466389, start_slot: 728465, msg: Failed to return all requested blocks, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:20.500 DEBG BlocksByRange Response sent returned: 64, requested: 64, current_slot: 2466389, start_slot: 728593, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:21.068 DEBG Received BlocksByRange Request step: 1, start_slot: 728529, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:21.272 DEBG BlocksByRange Response sent returned: 63, requested: 64, current_slot: 2466389, start_slot: 728529, msg: Failed to return all requested blocks, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:23.434 DEBG Received BlocksByRange Request step: 1, start_slot: 728657, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:23.665 DEBG BlocksByRange Response sent returned: 64, requested: 64, current_slot: 2466390, start_slot: 728657, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:25.851 DEBG Received BlocksByRange Request step: 1, start_slot: 728337, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:25.851 DEBG Received BlocksByRange Request step: 1, start_slot: 728401, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:26.094 DEBG BlocksByRange Response sent returned: 62, requested: 64, current_slot: 2466390, start_slot: 728401, msg: Failed to return all requested blocks, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:26.100 DEBG BlocksByRange Response sent returned: 63, requested: 64, current_slot: 2466390, start_slot: 728337, msg: Failed to return all requested blocks, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:31.070 DEBG RPC Error direction: Incoming, score: 0, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, client: Prysm: version: a80b1c252a9b4773493b41999769bf3134ac373f, os_version: unknown, err: Stream Timeout, protocol: beacon_blocks_by_range, service: libp2p Nov 09 01:18:31.070 WARN Timed out to a peer's request. Likely insufficient resources, reduce peer count, service: libp2p Nov 09 01:18:31.085 DEBG RPC Error direction: Incoming, score: 0, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, client: Prysm: version: a80b1c252a9b4773493b41999769bf3134ac373f, os_version: unknown, err: Stream Timeout, protocol: beacon_blocks_by_range, service: libp2p Nov 09 01:18:31.085 WARN Timed out to a peer's request. Likely insufficient resources, reduce peer count, service: libp2p Nov 09 01:18:31.459 DEBG RPC Error direction: Incoming, score: 0, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, client: Prysm: version: a80b1c252a9b4773493b41999769bf3134ac373f, os_version: unknown, err: Stream Timeout, protocol: beacon_blocks_by_range, service: libp2p Nov 09 01:18:31.459 WARN Timed out to a peer's request. Likely insufficient resources, reduce peer count, service: libp2p Nov 09 01:18:34.129 DEBG RPC Error direction: Incoming, score: 0, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, client: Prysm: version: a80b1c252a9b4773493b41999769bf3134ac373f, os_version: unknown, err: Stream Timeout, protocol: beacon_blocks_by_range, service: libp2p Nov 09 01:18:34.130 WARN Timed out to a peer's request. Likely insufficient resources, reduce peer count, service: libp2p Nov 09 01:18:35.686 DEBG Peer Manager disconnecting peer reason: Too many peers, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, service: libp2p ``` This PR is to investigate and correct the issue. ~~My current thoughts are that for some reason we are not closing the streams correctly, or fast enough, or the executor is not registering the closes and waking up.~~ - Pretty sure this is not the case, see message below for a more accurate reason. ~~I've currently added a timeout to stream closures in an attempt to force streams to close and the future to always complete.~~ I removed this
2021-11-16 03:42:25 +00:00
pending_items: VecDeque<RPCCodedResponse<TSpec>>,
/// Protocol of the original request we received from the peer.
protocol: Protocol,
/// Responses that the peer is still expecting from us.
remaining_chunks: u64,
Investigate and correct RPC Response Timeouts (#2804) RPC Responses are for some reason not removing their timeout when they are completing. As an example: ``` Nov 09 01:18:20.256 DEBG Received BlocksByRange Request step: 1, start_slot: 728465, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:20.263 DEBG Received BlocksByRange Request step: 1, start_slot: 728593, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:20.483 DEBG BlocksByRange Response sent returned: 63, requested: 64, current_slot: 2466389, start_slot: 728465, msg: Failed to return all requested blocks, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:20.500 DEBG BlocksByRange Response sent returned: 64, requested: 64, current_slot: 2466389, start_slot: 728593, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:21.068 DEBG Received BlocksByRange Request step: 1, start_slot: 728529, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:21.272 DEBG BlocksByRange Response sent returned: 63, requested: 64, current_slot: 2466389, start_slot: 728529, msg: Failed to return all requested blocks, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:23.434 DEBG Received BlocksByRange Request step: 1, start_slot: 728657, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:23.665 DEBG BlocksByRange Response sent returned: 64, requested: 64, current_slot: 2466390, start_slot: 728657, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:25.851 DEBG Received BlocksByRange Request step: 1, start_slot: 728337, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:25.851 DEBG Received BlocksByRange Request step: 1, start_slot: 728401, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:26.094 DEBG BlocksByRange Response sent returned: 62, requested: 64, current_slot: 2466390, start_slot: 728401, msg: Failed to return all requested blocks, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:26.100 DEBG BlocksByRange Response sent returned: 63, requested: 64, current_slot: 2466390, start_slot: 728337, msg: Failed to return all requested blocks, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:31.070 DEBG RPC Error direction: Incoming, score: 0, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, client: Prysm: version: a80b1c252a9b4773493b41999769bf3134ac373f, os_version: unknown, err: Stream Timeout, protocol: beacon_blocks_by_range, service: libp2p Nov 09 01:18:31.070 WARN Timed out to a peer's request. Likely insufficient resources, reduce peer count, service: libp2p Nov 09 01:18:31.085 DEBG RPC Error direction: Incoming, score: 0, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, client: Prysm: version: a80b1c252a9b4773493b41999769bf3134ac373f, os_version: unknown, err: Stream Timeout, protocol: beacon_blocks_by_range, service: libp2p Nov 09 01:18:31.085 WARN Timed out to a peer's request. Likely insufficient resources, reduce peer count, service: libp2p Nov 09 01:18:31.459 DEBG RPC Error direction: Incoming, score: 0, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, client: Prysm: version: a80b1c252a9b4773493b41999769bf3134ac373f, os_version: unknown, err: Stream Timeout, protocol: beacon_blocks_by_range, service: libp2p Nov 09 01:18:31.459 WARN Timed out to a peer's request. Likely insufficient resources, reduce peer count, service: libp2p Nov 09 01:18:34.129 DEBG RPC Error direction: Incoming, score: 0, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, client: Prysm: version: a80b1c252a9b4773493b41999769bf3134ac373f, os_version: unknown, err: Stream Timeout, protocol: beacon_blocks_by_range, service: libp2p Nov 09 01:18:34.130 WARN Timed out to a peer's request. Likely insufficient resources, reduce peer count, service: libp2p Nov 09 01:18:35.686 DEBG Peer Manager disconnecting peer reason: Too many peers, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, service: libp2p ``` This PR is to investigate and correct the issue. ~~My current thoughts are that for some reason we are not closing the streams correctly, or fast enough, or the executor is not registering the closes and waking up.~~ - Pretty sure this is not the case, see message below for a more accurate reason. ~~I've currently added a timeout to stream closures in an attempt to force streams to close and the future to always complete.~~ I removed this
2021-11-16 03:42:25 +00:00
/// Useful to timing how long each request took to process. Currently only used by
/// BlocksByRange.
request_start_time: Instant,
/// Key to keep track of the substream's timeout via `self.inbound_substreams_delay`.
delay_key: Option<delay_queue::Key>,
}
/// Contains the information the handler keeps on established outbound substreams.
struct OutboundInfo<TSpec: EthSpec> {
/// State of the substream.
state: OutboundSubstreamState<TSpec>,
/// Key to keep track of the substream's timeout via `self.outbound_substreams_delay`.
delay_key: delay_queue::Key,
/// Info over the protocol this substream is handling.
proto: Protocol,
/// Number of chunks to be seen from the peer's response.
remaining_chunks: Option<u64>,
/// `RequestId` as given by the application that sent the request.
req_id: RequestId,
}
/// State of an inbound substream connection.
enum InboundState<TSpec: EthSpec> {
/// The underlying substream is not being used.
Idle(InboundSubstream<TSpec>),
/// The underlying substream is processing responses.
Investigate and correct RPC Response Timeouts (#2804) RPC Responses are for some reason not removing their timeout when they are completing. As an example: ``` Nov 09 01:18:20.256 DEBG Received BlocksByRange Request step: 1, start_slot: 728465, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:20.263 DEBG Received BlocksByRange Request step: 1, start_slot: 728593, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:20.483 DEBG BlocksByRange Response sent returned: 63, requested: 64, current_slot: 2466389, start_slot: 728465, msg: Failed to return all requested blocks, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:20.500 DEBG BlocksByRange Response sent returned: 64, requested: 64, current_slot: 2466389, start_slot: 728593, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:21.068 DEBG Received BlocksByRange Request step: 1, start_slot: 728529, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:21.272 DEBG BlocksByRange Response sent returned: 63, requested: 64, current_slot: 2466389, start_slot: 728529, msg: Failed to return all requested blocks, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:23.434 DEBG Received BlocksByRange Request step: 1, start_slot: 728657, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:23.665 DEBG BlocksByRange Response sent returned: 64, requested: 64, current_slot: 2466390, start_slot: 728657, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:25.851 DEBG Received BlocksByRange Request step: 1, start_slot: 728337, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:25.851 DEBG Received BlocksByRange Request step: 1, start_slot: 728401, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:26.094 DEBG BlocksByRange Response sent returned: 62, requested: 64, current_slot: 2466390, start_slot: 728401, msg: Failed to return all requested blocks, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:26.100 DEBG BlocksByRange Response sent returned: 63, requested: 64, current_slot: 2466390, start_slot: 728337, msg: Failed to return all requested blocks, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:31.070 DEBG RPC Error direction: Incoming, score: 0, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, client: Prysm: version: a80b1c252a9b4773493b41999769bf3134ac373f, os_version: unknown, err: Stream Timeout, protocol: beacon_blocks_by_range, service: libp2p Nov 09 01:18:31.070 WARN Timed out to a peer's request. Likely insufficient resources, reduce peer count, service: libp2p Nov 09 01:18:31.085 DEBG RPC Error direction: Incoming, score: 0, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, client: Prysm: version: a80b1c252a9b4773493b41999769bf3134ac373f, os_version: unknown, err: Stream Timeout, protocol: beacon_blocks_by_range, service: libp2p Nov 09 01:18:31.085 WARN Timed out to a peer's request. Likely insufficient resources, reduce peer count, service: libp2p Nov 09 01:18:31.459 DEBG RPC Error direction: Incoming, score: 0, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, client: Prysm: version: a80b1c252a9b4773493b41999769bf3134ac373f, os_version: unknown, err: Stream Timeout, protocol: beacon_blocks_by_range, service: libp2p Nov 09 01:18:31.459 WARN Timed out to a peer's request. Likely insufficient resources, reduce peer count, service: libp2p Nov 09 01:18:34.129 DEBG RPC Error direction: Incoming, score: 0, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, client: Prysm: version: a80b1c252a9b4773493b41999769bf3134ac373f, os_version: unknown, err: Stream Timeout, protocol: beacon_blocks_by_range, service: libp2p Nov 09 01:18:34.130 WARN Timed out to a peer's request. Likely insufficient resources, reduce peer count, service: libp2p Nov 09 01:18:35.686 DEBG Peer Manager disconnecting peer reason: Too many peers, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, service: libp2p ``` This PR is to investigate and correct the issue. ~~My current thoughts are that for some reason we are not closing the streams correctly, or fast enough, or the executor is not registering the closes and waking up.~~ - Pretty sure this is not the case, see message below for a more accurate reason. ~~I've currently added a timeout to stream closures in an attempt to force streams to close and the future to always complete.~~ I removed this
2021-11-16 03:42:25 +00:00
/// The return value of the future is (substream, stream_was_closed). The stream_was_closed boolean
/// indicates if the stream was closed due to an error or successfully completing a response.
Busy(Pin<Box<dyn Future<Output = Result<(InboundSubstream<TSpec>, bool), RPCError>> + Send>>),
Testnet compatible network upgrade (#587) * Create libp2p instance * Change logger to stdlog * test_connection initial commit * Add gossipsub test * Delete tests in network crate * Add test module * Clean tests * Remove dependency on discovery * Working publish between 2 nodes TODO: Publish should be called just once * Working 2 peer gossipsub test with additional events * Cleanup test * Add rpc test * Star topology discovery WIP * build_nodes builds and connects n nodes. Increase nodes in gossipsub test * Add unsubscribe method and expose reference to gossipsub object for gossipsub tests * Add gossipsub message forwarding test * Fix gossipsub forward test * Test improvements * Remove discovery tests * Simplify gossipsub forward test topology * Add helper functions for topology building * Clean up tests * Update naming to new network spec * Correct ssz encoding of protocol names * Further additions to network upgrade * Initial network spec update WIP * Temp commit * Builds one side of the streamed RPC responses * Temporary commit * Propagates streaming changes up into message handler * Intermediate network update * Partial update in upgrading to the new network spec * Update dependencies, remove redundant deps * Correct sync manager for block stream handling * Re-write of RPC handler, improves efficiency and corrects bugs * Stream termination update * Completed refactor of rpc handler * Remove crates * Correct compile issues associated with test merge * Build basic tests and testing structure for eth2-libp2p * Enhance RPC tests and add logging * Complete RPC testing framework and STATUS test * Decoding bug fixes, log improvements, stream test * Clean up RPC handler logging * Decoder bug fix, empty block stream test * Add BlocksByRoot RPC test * Add Goodbye RPC test * Syncing and stream handling bug fixes and performance improvements * Applies discv5 bug fixes * Adds DHT IP filtering for lighthouse - currently disabled * Adds randomized network propagation as a CLI arg * Add disconnect functionality * Adds attestation handling and parent lookup * Adds RPC error handling for the sync manager * Allow parent's blocks to be already processed * Update github workflow * Adds reviewer suggestions
2019-11-27 01:47:46 +00:00
/// Temporary state during processing
Poisoned,
}
Stable futures (#879) * Port eth1 lib to use stable futures * Port eth1_test_rig to stable futures * Port eth1 tests to stable futures * Port genesis service to stable futures * Port genesis tests to stable futures * Port beacon_chain to stable futures * Port lcli to stable futures * Fix eth1_test_rig (#1014) * Fix lcli * Port timer to stable futures * Fix timer * Port websocket_server to stable futures * Port notifier to stable futures * Add TODOS * Update hashmap hashset to stable futures * Adds panic test to hashset delay * Port remote_beacon_node to stable futures * Fix lcli merge conflicts * Non rpc stuff compiles * protocol.rs compiles * Port websockets, timer and notifier to stable futures (#1035) * Fix lcli * Port timer to stable futures * Fix timer * Port websocket_server to stable futures * Port notifier to stable futures * Add TODOS * Port remote_beacon_node to stable futures * Partial eth2-libp2p stable future upgrade * Finished first round of fighting RPC types * Further progress towards porting eth2-libp2p adds caching to discovery * Update behaviour * RPC handler to stable futures * Update RPC to master libp2p * Network service additions * Fix the fallback transport construction (#1102) * Correct warning * Remove hashmap delay * Compiling version of eth2-libp2p * Update all crates versions * Fix conversion function and add tests (#1113) * Port validator_client to stable futures (#1114) * Add PH & MS slot clock changes * Account for genesis time * Add progress on duties refactor * Add simple is_aggregator bool to val subscription * Start work on attestation_verification.rs * Add progress on ObservedAttestations * Progress with ObservedAttestations * Fix tests * Add observed attestations to the beacon chain * Add attestation observation to processing code * Add progress on attestation verification * Add first draft of ObservedAttesters * Add more tests * Add observed attesters to beacon chain * Add observers to attestation processing * Add more attestation verification * Create ObservedAggregators map * Remove commented-out code * Add observed aggregators into chain * Add progress * Finish adding features to attestation verification * Ensure beacon chain compiles * Link attn verification into chain * Integrate new attn verification in chain * Remove old attestation processing code * Start trying to fix beacon_chain tests * Split adding into pools into two functions * Add aggregation to harness * Get test harness working again * Adjust the number of aggregators for test harness * Fix edge-case in harness * Integrate new attn processing in network * Fix compile bug in validator_client * Update validator API endpoints * Fix aggreagation in test harness * Fix enum thing * Fix attestation observation bug: * Patch failing API tests * Start adding comments to attestation verification * Remove unused attestation field * Unify "is block known" logic * Update comments * Supress fork choice errors for network processing * Add todos * Tidy * Add gossip attn tests * Disallow test harness to produce old attns * Comment out in-progress tests * Partially address pruning tests * Fix failing store test * Add aggregate tests * Add comments about which spec conditions we check * Dont re-aggregate * Split apart test harness attn production * Fix compile error in network * Make progress on commented-out test * Fix skipping attestation test * Add fork choice verification tests * Tidy attn tests, remove dead code * Remove some accidentally added code * Fix clippy lint * Rename test file * Add block tests, add cheap block proposer check * Rename block testing file * Add observed_block_producers * Tidy * Switch around block signature verification * Finish block testing * Remove gossip from signature tests * First pass of self review * Fix deviation in spec * Update test spec tags * Start moving over to hashset * Finish moving observed attesters to hashmap * Move aggregation pool over to hashmap * Make fc attn borrow again * Fix rest_api compile error * Fix missing comments * Fix monster test * Uncomment increasing slots test * Address remaining comments * Remove unsafe, use cfg test * Remove cfg test flag * Fix dodgy comment * Revert "Update hashmap hashset to stable futures" This reverts commit d432378a3cc5cd67fc29c0b15b96b886c1323554. * Revert "Adds panic test to hashset delay" This reverts commit 281502396fc5b90d9c421a309c2c056982c9525b. * Ported attestation_service * Ported duties_service * Ported fork_service * More ports * Port block_service * Minor fixes * VC compiles * Update TODOS * Borrow self where possible * Ignore aggregates that are already known. * Unify aggregator modulo logic * Fix typo in logs * Refactor validator subscription logic * Avoid reproducing selection proof * Skip HTTP call if no subscriptions * Rename DutyAndState -> DutyAndProof * Tidy logs * Print root as dbg * Fix compile errors in tests * Fix compile error in test * Re-Fix attestation and duties service * Minor fixes Co-authored-by: Paul Hauner <paul@paulhauner.com> * Network crate update to stable futures * Port account_manager to stable futures (#1121) * Port account_manager to stable futures * Run async fns in tokio environment * Port rest_api crate to stable futures (#1118) * Port rest_api lib to stable futures * Reduce tokio features * Update notifier to stable futures * Builder update * Further updates * Convert self referential async functions * stable futures fixes (#1124) * Fix eth1 update functions * Fix genesis and client * Fix beacon node lib * Return appropriate runtimes from environment * Fix test rig * Refactor eth1 service update * Upgrade simulator to stable futures * Lighthouse compiles on stable futures * Remove println debugging statement * Update libp2p service, start rpc test upgrade * Update network crate for new libp2p * Update tokio::codec to futures_codec (#1128) * Further work towards RPC corrections * Correct http timeout and network service select * Use tokio runtime for libp2p * Revert "Update tokio::codec to futures_codec (#1128)" This reverts commit e57aea924acf5cbabdcea18895ac07e38a425ed7. * Upgrade RPC libp2p tests * Upgrade secio fallback test * Upgrade gossipsub examples * Clean up RPC protocol * Test fixes (#1133) * Correct websocket timeout and run on os thread * Fix network test * Clean up PR * Correct tokio tcp move attestation service tests * Upgrade attestation service tests * Correct network test * Correct genesis test * Test corrections * Log info when block is received * Modify logs and update attester service events * Stable futures: fixes to vc, eth1 and account manager (#1142) * Add local testnet scripts * Remove whiteblock script * Rename local testnet script * Move spawns onto handle * Fix VC panic * Initial fix to block production issue * Tidy block producer fix * Tidy further * Add local testnet clean script * Run cargo fmt * Tidy duties service * Tidy fork service * Tidy ForkService * Tidy AttestationService * Tidy notifier * Ensure await is not suppressed in eth1 * Ensure await is not suppressed in account_manager * Use .ok() instead of .unwrap_or(()) * RPC decoding test for proto * Update discv5 and eth2-libp2p deps * Fix lcli double runtime issue (#1144) * Handle stream termination and dialing peer errors * Correct peer_info variant types * Remove unnecessary warnings * Handle subnet unsubscription removal and improve logigng * Add logs around ping * Upgrade discv5 and improve logging * Handle peer connection status for multiple connections * Improve network service logging * Improve logging around peer manager * Upgrade swarm poll centralise peer management * Identify clients on error * Fix `remove_peer` in sync (#1150) * remove_peer removes from all chains * Remove logs * Fix early return from loop * Improved logging, fix panic * Partially correct tests * Stable futures: Vc sync (#1149) * Improve syncing heuristic * Add comments * Use safer method for tolerance * Fix tests * Stable futures: Fix VC bug, update agg pool, add more metrics (#1151) * Expose epoch processing summary * Expose participation metrics to prometheus * Switch to f64 * Reduce precision * Change precision * Expose observed attesters metrics * Add metrics for agg/unagg attn counts * Add metrics for gossip rx * Add metrics for gossip tx * Adds ignored attns to prom * Add attestation timing * Add timer for aggregation pool sig agg * Add write lock timer for agg pool * Add more metrics to agg pool * Change map lock code * Add extra metric to agg pool * Change lock handling in agg pool * Change .write() to .read() * Add another agg pool timer * Fix for is_aggregator * Fix pruning bug Co-authored-by: pawan <pawandhananjay@gmail.com> Co-authored-by: Paul Hauner <paul@paulhauner.com>
2020-05-17 11:16:48 +00:00
/// State of an outbound substream. Either waiting for a response, or in the process of sending.
pub enum OutboundSubstreamState<TSpec: EthSpec> {
/// A request has been sent, and we are awaiting a response. This future is driven in the
/// handler because GOODBYE requests can be handled and responses dropped instantly.
RequestPendingResponse {
/// The framed negotiated substream.
substream: Box<OutboundFramed<NegotiatedSubstream, TSpec>>,
Testnet compatible network upgrade (#587) * Create libp2p instance * Change logger to stdlog * test_connection initial commit * Add gossipsub test * Delete tests in network crate * Add test module * Clean tests * Remove dependency on discovery * Working publish between 2 nodes TODO: Publish should be called just once * Working 2 peer gossipsub test with additional events * Cleanup test * Add rpc test * Star topology discovery WIP * build_nodes builds and connects n nodes. Increase nodes in gossipsub test * Add unsubscribe method and expose reference to gossipsub object for gossipsub tests * Add gossipsub message forwarding test * Fix gossipsub forward test * Test improvements * Remove discovery tests * Simplify gossipsub forward test topology * Add helper functions for topology building * Clean up tests * Update naming to new network spec * Correct ssz encoding of protocol names * Further additions to network upgrade * Initial network spec update WIP * Temp commit * Builds one side of the streamed RPC responses * Temporary commit * Propagates streaming changes up into message handler * Intermediate network update * Partial update in upgrading to the new network spec * Update dependencies, remove redundant deps * Correct sync manager for block stream handling * Re-write of RPC handler, improves efficiency and corrects bugs * Stream termination update * Completed refactor of rpc handler * Remove crates * Correct compile issues associated with test merge * Build basic tests and testing structure for eth2-libp2p * Enhance RPC tests and add logging * Complete RPC testing framework and STATUS test * Decoding bug fixes, log improvements, stream test * Clean up RPC handler logging * Decoder bug fix, empty block stream test * Add BlocksByRoot RPC test * Add Goodbye RPC test * Syncing and stream handling bug fixes and performance improvements * Applies discv5 bug fixes * Adds DHT IP filtering for lighthouse - currently disabled * Adds randomized network propagation as a CLI arg * Add disconnect functionality * Adds attestation handling and parent lookup * Adds RPC error handling for the sync manager * Allow parent's blocks to be already processed * Update github workflow * Adds reviewer suggestions
2019-11-27 01:47:46 +00:00
/// Keeps track of the actual request sent.
request: OutboundRequest<TSpec>,
2019-07-09 05:44:23 +00:00
},
Testnet compatible network upgrade (#587) * Create libp2p instance * Change logger to stdlog * test_connection initial commit * Add gossipsub test * Delete tests in network crate * Add test module * Clean tests * Remove dependency on discovery * Working publish between 2 nodes TODO: Publish should be called just once * Working 2 peer gossipsub test with additional events * Cleanup test * Add rpc test * Star topology discovery WIP * build_nodes builds and connects n nodes. Increase nodes in gossipsub test * Add unsubscribe method and expose reference to gossipsub object for gossipsub tests * Add gossipsub message forwarding test * Fix gossipsub forward test * Test improvements * Remove discovery tests * Simplify gossipsub forward test topology * Add helper functions for topology building * Clean up tests * Update naming to new network spec * Correct ssz encoding of protocol names * Further additions to network upgrade * Initial network spec update WIP * Temp commit * Builds one side of the streamed RPC responses * Temporary commit * Propagates streaming changes up into message handler * Intermediate network update * Partial update in upgrading to the new network spec * Update dependencies, remove redundant deps * Correct sync manager for block stream handling * Re-write of RPC handler, improves efficiency and corrects bugs * Stream termination update * Completed refactor of rpc handler * Remove crates * Correct compile issues associated with test merge * Build basic tests and testing structure for eth2-libp2p * Enhance RPC tests and add logging * Complete RPC testing framework and STATUS test * Decoding bug fixes, log improvements, stream test * Clean up RPC handler logging * Decoder bug fix, empty block stream test * Add BlocksByRoot RPC test * Add Goodbye RPC test * Syncing and stream handling bug fixes and performance improvements * Applies discv5 bug fixes * Adds DHT IP filtering for lighthouse - currently disabled * Adds randomized network propagation as a CLI arg * Add disconnect functionality * Adds attestation handling and parent lookup * Adds RPC error handling for the sync manager * Allow parent's blocks to be already processed * Update github workflow * Adds reviewer suggestions
2019-11-27 01:47:46 +00:00
/// Closing an outbound substream>
Closing(Box<OutboundFramed<NegotiatedSubstream, TSpec>>),
Testnet compatible network upgrade (#587) * Create libp2p instance * Change logger to stdlog * test_connection initial commit * Add gossipsub test * Delete tests in network crate * Add test module * Clean tests * Remove dependency on discovery * Working publish between 2 nodes TODO: Publish should be called just once * Working 2 peer gossipsub test with additional events * Cleanup test * Add rpc test * Star topology discovery WIP * build_nodes builds and connects n nodes. Increase nodes in gossipsub test * Add unsubscribe method and expose reference to gossipsub object for gossipsub tests * Add gossipsub message forwarding test * Fix gossipsub forward test * Test improvements * Remove discovery tests * Simplify gossipsub forward test topology * Add helper functions for topology building * Clean up tests * Update naming to new network spec * Correct ssz encoding of protocol names * Further additions to network upgrade * Initial network spec update WIP * Temp commit * Builds one side of the streamed RPC responses * Temporary commit * Propagates streaming changes up into message handler * Intermediate network update * Partial update in upgrading to the new network spec * Update dependencies, remove redundant deps * Correct sync manager for block stream handling * Re-write of RPC handler, improves efficiency and corrects bugs * Stream termination update * Completed refactor of rpc handler * Remove crates * Correct compile issues associated with test merge * Build basic tests and testing structure for eth2-libp2p * Enhance RPC tests and add logging * Complete RPC testing framework and STATUS test * Decoding bug fixes, log improvements, stream test * Clean up RPC handler logging * Decoder bug fix, empty block stream test * Add BlocksByRoot RPC test * Add Goodbye RPC test * Syncing and stream handling bug fixes and performance improvements * Applies discv5 bug fixes * Adds DHT IP filtering for lighthouse - currently disabled * Adds randomized network propagation as a CLI arg * Add disconnect functionality * Adds attestation handling and parent lookup * Adds RPC error handling for the sync manager * Allow parent's blocks to be already processed * Update github workflow * Adds reviewer suggestions
2019-11-27 01:47:46 +00:00
/// Temporary state during processing
Poisoned,
}
Stable futures (#879) * Port eth1 lib to use stable futures * Port eth1_test_rig to stable futures * Port eth1 tests to stable futures * Port genesis service to stable futures * Port genesis tests to stable futures * Port beacon_chain to stable futures * Port lcli to stable futures * Fix eth1_test_rig (#1014) * Fix lcli * Port timer to stable futures * Fix timer * Port websocket_server to stable futures * Port notifier to stable futures * Add TODOS * Update hashmap hashset to stable futures * Adds panic test to hashset delay * Port remote_beacon_node to stable futures * Fix lcli merge conflicts * Non rpc stuff compiles * protocol.rs compiles * Port websockets, timer and notifier to stable futures (#1035) * Fix lcli * Port timer to stable futures * Fix timer * Port websocket_server to stable futures * Port notifier to stable futures * Add TODOS * Port remote_beacon_node to stable futures * Partial eth2-libp2p stable future upgrade * Finished first round of fighting RPC types * Further progress towards porting eth2-libp2p adds caching to discovery * Update behaviour * RPC handler to stable futures * Update RPC to master libp2p * Network service additions * Fix the fallback transport construction (#1102) * Correct warning * Remove hashmap delay * Compiling version of eth2-libp2p * Update all crates versions * Fix conversion function and add tests (#1113) * Port validator_client to stable futures (#1114) * Add PH & MS slot clock changes * Account for genesis time * Add progress on duties refactor * Add simple is_aggregator bool to val subscription * Start work on attestation_verification.rs * Add progress on ObservedAttestations * Progress with ObservedAttestations * Fix tests * Add observed attestations to the beacon chain * Add attestation observation to processing code * Add progress on attestation verification * Add first draft of ObservedAttesters * Add more tests * Add observed attesters to beacon chain * Add observers to attestation processing * Add more attestation verification * Create ObservedAggregators map * Remove commented-out code * Add observed aggregators into chain * Add progress * Finish adding features to attestation verification * Ensure beacon chain compiles * Link attn verification into chain * Integrate new attn verification in chain * Remove old attestation processing code * Start trying to fix beacon_chain tests * Split adding into pools into two functions * Add aggregation to harness * Get test harness working again * Adjust the number of aggregators for test harness * Fix edge-case in harness * Integrate new attn processing in network * Fix compile bug in validator_client * Update validator API endpoints * Fix aggreagation in test harness * Fix enum thing * Fix attestation observation bug: * Patch failing API tests * Start adding comments to attestation verification * Remove unused attestation field * Unify "is block known" logic * Update comments * Supress fork choice errors for network processing * Add todos * Tidy * Add gossip attn tests * Disallow test harness to produce old attns * Comment out in-progress tests * Partially address pruning tests * Fix failing store test * Add aggregate tests * Add comments about which spec conditions we check * Dont re-aggregate * Split apart test harness attn production * Fix compile error in network * Make progress on commented-out test * Fix skipping attestation test * Add fork choice verification tests * Tidy attn tests, remove dead code * Remove some accidentally added code * Fix clippy lint * Rename test file * Add block tests, add cheap block proposer check * Rename block testing file * Add observed_block_producers * Tidy * Switch around block signature verification * Finish block testing * Remove gossip from signature tests * First pass of self review * Fix deviation in spec * Update test spec tags * Start moving over to hashset * Finish moving observed attesters to hashmap * Move aggregation pool over to hashmap * Make fc attn borrow again * Fix rest_api compile error * Fix missing comments * Fix monster test * Uncomment increasing slots test * Address remaining comments * Remove unsafe, use cfg test * Remove cfg test flag * Fix dodgy comment * Revert "Update hashmap hashset to stable futures" This reverts commit d432378a3cc5cd67fc29c0b15b96b886c1323554. * Revert "Adds panic test to hashset delay" This reverts commit 281502396fc5b90d9c421a309c2c056982c9525b. * Ported attestation_service * Ported duties_service * Ported fork_service * More ports * Port block_service * Minor fixes * VC compiles * Update TODOS * Borrow self where possible * Ignore aggregates that are already known. * Unify aggregator modulo logic * Fix typo in logs * Refactor validator subscription logic * Avoid reproducing selection proof * Skip HTTP call if no subscriptions * Rename DutyAndState -> DutyAndProof * Tidy logs * Print root as dbg * Fix compile errors in tests * Fix compile error in test * Re-Fix attestation and duties service * Minor fixes Co-authored-by: Paul Hauner <paul@paulhauner.com> * Network crate update to stable futures * Port account_manager to stable futures (#1121) * Port account_manager to stable futures * Run async fns in tokio environment * Port rest_api crate to stable futures (#1118) * Port rest_api lib to stable futures * Reduce tokio features * Update notifier to stable futures * Builder update * Further updates * Convert self referential async functions * stable futures fixes (#1124) * Fix eth1 update functions * Fix genesis and client * Fix beacon node lib * Return appropriate runtimes from environment * Fix test rig * Refactor eth1 service update * Upgrade simulator to stable futures * Lighthouse compiles on stable futures * Remove println debugging statement * Update libp2p service, start rpc test upgrade * Update network crate for new libp2p * Update tokio::codec to futures_codec (#1128) * Further work towards RPC corrections * Correct http timeout and network service select * Use tokio runtime for libp2p * Revert "Update tokio::codec to futures_codec (#1128)" This reverts commit e57aea924acf5cbabdcea18895ac07e38a425ed7. * Upgrade RPC libp2p tests * Upgrade secio fallback test * Upgrade gossipsub examples * Clean up RPC protocol * Test fixes (#1133) * Correct websocket timeout and run on os thread * Fix network test * Clean up PR * Correct tokio tcp move attestation service tests * Upgrade attestation service tests * Correct network test * Correct genesis test * Test corrections * Log info when block is received * Modify logs and update attester service events * Stable futures: fixes to vc, eth1 and account manager (#1142) * Add local testnet scripts * Remove whiteblock script * Rename local testnet script * Move spawns onto handle * Fix VC panic * Initial fix to block production issue * Tidy block producer fix * Tidy further * Add local testnet clean script * Run cargo fmt * Tidy duties service * Tidy fork service * Tidy ForkService * Tidy AttestationService * Tidy notifier * Ensure await is not suppressed in eth1 * Ensure await is not suppressed in account_manager * Use .ok() instead of .unwrap_or(()) * RPC decoding test for proto * Update discv5 and eth2-libp2p deps * Fix lcli double runtime issue (#1144) * Handle stream termination and dialing peer errors * Correct peer_info variant types * Remove unnecessary warnings * Handle subnet unsubscription removal and improve logigng * Add logs around ping * Upgrade discv5 and improve logging * Handle peer connection status for multiple connections * Improve network service logging * Improve logging around peer manager * Upgrade swarm poll centralise peer management * Identify clients on error * Fix `remove_peer` in sync (#1150) * remove_peer removes from all chains * Remove logs * Fix early return from loop * Improved logging, fix panic * Partially correct tests * Stable futures: Vc sync (#1149) * Improve syncing heuristic * Add comments * Use safer method for tolerance * Fix tests * Stable futures: Fix VC bug, update agg pool, add more metrics (#1151) * Expose epoch processing summary * Expose participation metrics to prometheus * Switch to f64 * Reduce precision * Change precision * Expose observed attesters metrics * Add metrics for agg/unagg attn counts * Add metrics for gossip rx * Add metrics for gossip tx * Adds ignored attns to prom * Add attestation timing * Add timer for aggregation pool sig agg * Add write lock timer for agg pool * Add more metrics to agg pool * Change map lock code * Add extra metric to agg pool * Change lock handling in agg pool * Change .write() to .read() * Add another agg pool timer * Fix for is_aggregator * Fix pruning bug Co-authored-by: pawan <pawandhananjay@gmail.com> Co-authored-by: Paul Hauner <paul@paulhauner.com>
2020-05-17 11:16:48 +00:00
impl<TSpec> RPCHandler<TSpec>
where
Initial work towards v0.2.0 (#924) * Remove ping protocol * Initial renaming of network services * Correct rebasing relative to latest master * Start updating types * Adds HashMapDelay struct to utils * Initial network restructure * Network restructure. Adds new types for v0.2.0 * Removes build artefacts * Shift validation to beacon chain * Temporarily remove gossip validation This is to be updated to match current optimisation efforts. * Adds AggregateAndProof * Begin rebuilding pubsub encoding/decoding * Signature hacking * Shift gossipsup decoding into eth2_libp2p * Existing EF tests passing with fake_crypto * Shifts block encoding/decoding into RPC * Delete outdated API spec * All release tests passing bar genesis state parsing * Update and test YamlConfig * Update to spec v0.10 compatible BLS * Updates to BLS EF tests * Add EF test for AggregateVerify And delete unused hash2curve tests for uncompressed points * Update EF tests to v0.10.1 * Use optional block root correctly in block proc * Use genesis fork in deposit domain. All tests pass * Fast aggregate verify test * Update REST API docs * Fix unused import * Bump spec tags to v0.10.1 * Add `seconds_per_eth1_block` to chainspec * Update to timestamp based eth1 voting scheme * Return None from `get_votes_to_consider` if block cache is empty * Handle overflows in `is_candidate_block` * Revert to failing tests * Fix eth1 data sets test * Choose default vote according to spec * Fix collect_valid_votes tests * Fix `get_votes_to_consider` to choose all eligible blocks * Uncomment winning_vote tests * Add comments; remove unused code * Reduce seconds_per_eth1_block for simulation * Addressed review comments * Add test for default vote case * Fix logs * Remove unused functions * Meter default eth1 votes * Fix comments * Progress on attestation service * Address review comments; remove unused dependency * Initial work on removing libp2p lock * Add LRU caches to store (rollup) * Update attestation validation for DB changes (WIP) * Initial version of should_forward_block * Scaffold * Progress on attestation validation Also, consolidate prod+testing slot clocks so that they share much of the same implementation and can both handle sub-slot time changes. * Removes lock from libp2p service * Completed network lock removal * Finish(?) attestation processing * Correct network termination future * Add slot check to block check * Correct fmt issues * Remove Drop implementation for network service * Add first attempt at attestation proc. re-write * Add version 2 of attestation processing * Minor fixes * Add validator pubkey cache * Make get_indexed_attestation take a committee * Link signature processing into new attn verification * First working version * Ensure pubkey cache is updated * Add more metrics, slight optimizations * Clone committee cache during attestation processing * Update shuffling cache during block processing * Remove old commented-out code * Fix shuffling cache insert bug * Used indexed attestation in fork choice * Restructure attn processing, add metrics * Add more detailed metrics * Tidy, fix failing tests * Fix failing tests, tidy * Address reviewers suggestions * Disable/delete two outdated tests * Modification of validator for subscriptions * Add slot signing to validator client * Further progress on validation subscription * Adds necessary validator subscription functionality * Add new Pubkeys struct to signature_sets * Refactor with functional approach * Update beacon chain * Clean up validator <-> beacon node http types * Add aggregator status to ValidatorDuty * Impl Clone for manual slot clock * Fix minor errors * Further progress validator client subscription * Initial subscription and aggregation handling * Remove decompressed member from pubkey bytes * Progress to modifying val client for attestation aggregation * First draft of validator client upgrade for aggregate attestations * Add hashmap for indices lookup * Add state cache, remove store cache * Only build the head committee cache * Removes lock on a network channel * Partially implement beacon node subscription http api * Correct compilation issues * Change `get_attesting_indices` to use Vec * Fix failing test * Partial implementation of timer * Adds timer, removes exit_future, http api to op pool * Partial multiple aggregate attestation handling * Permits bulk messages accross gossipsub network channel * Correct compile issues * Improve gosispsub messaging and correct rest api helpers * Added global gossipsub subscriptions * Update validator subscriptions data structs * Tidy * Re-structure validator subscriptions * Initial handling of subscriptions * Re-structure network service * Add pubkey cache persistence file * Add more comments * Integrate persistence file into builder * Add pubkey cache tests * Add HashSetDelay and introduce into attestation service * Handles validator subscriptions * Add data_dir to beacon chain builder * Remove Option in pubkey cache persistence file * Ensure consistency between datadir/data_dir * Fix failing network test * Peer subnet discovery gets queued for future subscriptions * Reorganise attestation service functions * Initial wiring of attestation service * First draft of attestation service timing logic * Correct minor typos * Tidy * Fix todos * Improve tests * Add PeerInfo to connected peers mapping * Fix compile error * Fix compile error from merge * Split up block processing metrics * Tidy * Refactor get_pubkey_from_state * Remove commented-out code * Rename state_cache -> checkpoint_cache * Rename Checkpoint -> Snapshot * Tidy, add comments * Tidy up find_head function * Change some checkpoint -> snapshot * Add tests * Expose max_len * Remove dead code * Tidy * Fix bug * Add sync-speed metric * Add first attempt at VerifiableBlock * Start integrating into beacon chain * Integrate VerifiableBlock * Rename VerifableBlock -> PartialBlockVerification * Add start of typed methods * Add progress * Add further progress * Rename structs * Add full block verification to block_processing.rs * Further beacon chain integration * Update checks for gossip * Add todo * Start adding segement verification * Add passing chain segement test * Initial integration with batch sync * Minor changes * Tidy, add more error checking * Start adding chain_segment tests * Finish invalid signature tests * Include single and gossip verified blocks in tests * Add gossip verification tests * Start adding docs * Finish adding comments to block_processing.rs * Rename block_processing.rs -> block_verification * Start removing old block processing code * Fixes beacon_chain compilation * Fix project-wide compile errors * Remove old code * Correct code to pass all tests * Fix bug with beacon proposer index * Fix shim for BlockProcessingError * Only process one epoch at a time * Fix loop in chain segment processing * Correct tests from master merge * Add caching for state.eth1_data_votes * Add BeaconChain::validator_pubkey * Revert "Add caching for state.eth1_data_votes" This reverts commit cd73dcd6434fb8d8e6bf30c5356355598ea7b78e. Co-authored-by: Grant Wuerker <gwuerker@gmail.com> Co-authored-by: Michael Sproul <michael@sigmaprime.io> Co-authored-by: Michael Sproul <micsproul@gmail.com> Co-authored-by: pawan <pawandhananjay@gmail.com> Co-authored-by: Paul Hauner <paul@paulhauner.com>
2020-03-17 06:24:44 +00:00
TSpec: EthSpec,
{
pub fn new(
listen_protocol: SubstreamProtocol<RPCProtocol<TSpec>, ()>,
fork_context: Arc<ForkContext>,
log: &slog::Logger,
) -> Self {
RPCHandler {
listen_protocol,
events_out: SmallVec::new(),
dial_queue: SmallVec::new(),
dial_negotiated: 0,
Testnet compatible network upgrade (#587) * Create libp2p instance * Change logger to stdlog * test_connection initial commit * Add gossipsub test * Delete tests in network crate * Add test module * Clean tests * Remove dependency on discovery * Working publish between 2 nodes TODO: Publish should be called just once * Working 2 peer gossipsub test with additional events * Cleanup test * Add rpc test * Star topology discovery WIP * build_nodes builds and connects n nodes. Increase nodes in gossipsub test * Add unsubscribe method and expose reference to gossipsub object for gossipsub tests * Add gossipsub message forwarding test * Fix gossipsub forward test * Test improvements * Remove discovery tests * Simplify gossipsub forward test topology * Add helper functions for topology building * Clean up tests * Update naming to new network spec * Correct ssz encoding of protocol names * Further additions to network upgrade * Initial network spec update WIP * Temp commit * Builds one side of the streamed RPC responses * Temporary commit * Propagates streaming changes up into message handler * Intermediate network update * Partial update in upgrading to the new network spec * Update dependencies, remove redundant deps * Correct sync manager for block stream handling * Re-write of RPC handler, improves efficiency and corrects bugs * Stream termination update * Completed refactor of rpc handler * Remove crates * Correct compile issues associated with test merge * Build basic tests and testing structure for eth2-libp2p * Enhance RPC tests and add logging * Complete RPC testing framework and STATUS test * Decoding bug fixes, log improvements, stream test * Clean up RPC handler logging * Decoder bug fix, empty block stream test * Add BlocksByRoot RPC test * Add Goodbye RPC test * Syncing and stream handling bug fixes and performance improvements * Applies discv5 bug fixes * Adds DHT IP filtering for lighthouse - currently disabled * Adds randomized network propagation as a CLI arg * Add disconnect functionality * Adds attestation handling and parent lookup * Adds RPC error handling for the sync manager * Allow parent's blocks to be already processed * Update github workflow * Adds reviewer suggestions
2019-11-27 01:47:46 +00:00
inbound_substreams: FnvHashMap::default(),
outbound_substreams: FnvHashMap::default(),
inbound_substreams_delay: DelayQueue::new(),
outbound_substreams_delay: DelayQueue::new(),
current_inbound_substream_id: SubstreamId(0),
current_outbound_substream_id: SubstreamId(0),
state: HandlerState::Active,
max_dial_negotiated: 8,
outbound_io_error_retries: 0,
fork_context,
waker: None,
Testnet compatible network upgrade (#587) * Create libp2p instance * Change logger to stdlog * test_connection initial commit * Add gossipsub test * Delete tests in network crate * Add test module * Clean tests * Remove dependency on discovery * Working publish between 2 nodes TODO: Publish should be called just once * Working 2 peer gossipsub test with additional events * Cleanup test * Add rpc test * Star topology discovery WIP * build_nodes builds and connects n nodes. Increase nodes in gossipsub test * Add unsubscribe method and expose reference to gossipsub object for gossipsub tests * Add gossipsub message forwarding test * Fix gossipsub forward test * Test improvements * Remove discovery tests * Simplify gossipsub forward test topology * Add helper functions for topology building * Clean up tests * Update naming to new network spec * Correct ssz encoding of protocol names * Further additions to network upgrade * Initial network spec update WIP * Temp commit * Builds one side of the streamed RPC responses * Temporary commit * Propagates streaming changes up into message handler * Intermediate network update * Partial update in upgrading to the new network spec * Update dependencies, remove redundant deps * Correct sync manager for block stream handling * Re-write of RPC handler, improves efficiency and corrects bugs * Stream termination update * Completed refactor of rpc handler * Remove crates * Correct compile issues associated with test merge * Build basic tests and testing structure for eth2-libp2p * Enhance RPC tests and add logging * Complete RPC testing framework and STATUS test * Decoding bug fixes, log improvements, stream test * Clean up RPC handler logging * Decoder bug fix, empty block stream test * Add BlocksByRoot RPC test * Add Goodbye RPC test * Syncing and stream handling bug fixes and performance improvements * Applies discv5 bug fixes * Adds DHT IP filtering for lighthouse - currently disabled * Adds randomized network propagation as a CLI arg * Add disconnect functionality * Adds attestation handling and parent lookup * Adds RPC error handling for the sync manager * Allow parent's blocks to be already processed * Update github workflow * Adds reviewer suggestions
2019-11-27 01:47:46 +00:00
log: log.clone(),
}
}
/// Initiates the handler's shutdown process, sending an optional Goodbye message to the
/// peer.
fn shutdown(&mut self, goodbye_reason: Option<GoodbyeReason>) {
if matches!(self.state, HandlerState::Active) {
if !self.dial_queue.is_empty() {
debug!(self.log, "Starting handler shutdown"; "unsent_queued_requests" => self.dial_queue.len());
}
// We now drive to completion communications already dialed/established
while let Some((id, req)) = self.dial_queue.pop() {
More metrics + RPC tweaks (#2041) ## Issue Addressed NA ## Proposed Changes This was mostly done to find the reason why LH was dropping peers from Nimbus. It proved to be useful so I think it's worth it. But there is also some functional stuff here - Add metrics for rpc errors per client, error type and direction - Add metrics for downscoring events per source type, client and penalty type - Add metrics for gossip validation results per client for non-accepted messages - Make the RPC handler return errors and requests/responses in the order we see them - Allow a small burst for the Ping rate limit, from 1 every 5 seconds to 2 every 10 seconds - Send rate limiting errors with a particular code and use that same code to identify them. I picked something different to 128 since that is most likely what other clients are using for their own errors - Remove some unused code in the `PeerAction` and the rpc handler - Remove the unused variant `RateLimited`. tTis was never produced directly, since the only way to get the request's protocol is via de handler. The handler upon receiving from LH a response with an error (rate limited in this case) emits this event with the missing info (It was always like this, just pointing out that we do downscore rate limiting errors regardless of the change) Metrics for Nimbus looked like this: Downscoring events: `increase(libp2p_peer_actions_per_client{client="Nimbus"}[5m])` ![image](https://user-images.githubusercontent.com/26765164/101210880-862bf280-3676-11eb-94c0-399f0bf5aa2e.png) RPC Errors: `increase(libp2p_rpc_errors_per_client{client="Nimbus"}[5m])` ![image](https://user-images.githubusercontent.com/26765164/101210997-ba071800-3676-11eb-847a-f32405ede002.png) Unaccepted gossip message: `increase(gossipsub_unaccepted_messages_per_client{client="Nimbus"}[5m])` ![image](https://user-images.githubusercontent.com/26765164/101211124-f470b500-3676-11eb-9459-132ecff058ec.png)
2020-12-08 03:55:50 +00:00
self.events_out.push(Err(HandlerErr::Outbound {
error: RPCError::HandlerRejected,
More metrics + RPC tweaks (#2041) ## Issue Addressed NA ## Proposed Changes This was mostly done to find the reason why LH was dropping peers from Nimbus. It proved to be useful so I think it's worth it. But there is also some functional stuff here - Add metrics for rpc errors per client, error type and direction - Add metrics for downscoring events per source type, client and penalty type - Add metrics for gossip validation results per client for non-accepted messages - Make the RPC handler return errors and requests/responses in the order we see them - Allow a small burst for the Ping rate limit, from 1 every 5 seconds to 2 every 10 seconds - Send rate limiting errors with a particular code and use that same code to identify them. I picked something different to 128 since that is most likely what other clients are using for their own errors - Remove some unused code in the `PeerAction` and the rpc handler - Remove the unused variant `RateLimited`. tTis was never produced directly, since the only way to get the request's protocol is via de handler. The handler upon receiving from LH a response with an error (rate limited in this case) emits this event with the missing info (It was always like this, just pointing out that we do downscore rate limiting errors regardless of the change) Metrics for Nimbus looked like this: Downscoring events: `increase(libp2p_peer_actions_per_client{client="Nimbus"}[5m])` ![image](https://user-images.githubusercontent.com/26765164/101210880-862bf280-3676-11eb-94c0-399f0bf5aa2e.png) RPC Errors: `increase(libp2p_rpc_errors_per_client{client="Nimbus"}[5m])` ![image](https://user-images.githubusercontent.com/26765164/101210997-ba071800-3676-11eb-847a-f32405ede002.png) Unaccepted gossip message: `increase(gossipsub_unaccepted_messages_per_client{client="Nimbus"}[5m])` ![image](https://user-images.githubusercontent.com/26765164/101211124-f470b500-3676-11eb-9459-132ecff058ec.png)
2020-12-08 03:55:50 +00:00
proto: req.protocol(),
id,
}));
}
// Queue our goodbye message.
if let Some(reason) = goodbye_reason {
self.dial_queue
.push((RequestId::Router, OutboundRequest::Goodbye(reason)));
}
Update to tokio 1.1 (#2172) ## Issue Addressed resolves #2129 resolves #2099 addresses some of #1712 unblocks #2076 unblocks #2153 ## Proposed Changes - Updates all the dependencies mentioned in #2129, except for web3. They haven't merged their tokio 1.0 update because they are waiting on some dependencies of their own. Since we only use web3 in tests, I think updating it in a separate issue is fine. If they are able to merge soon though, I can update in this PR. - Updates `tokio_util` to 0.6.2 and `bytes` to 1.0.1. - We haven't made a discv5 release since merging tokio 1.0 updates so I'm using a commit rather than release atm. **Edit:** I think we should merge an update of `tokio_util` to 0.6.2 into discv5 before this release because it has panic fixes in `DelayQueue` --> PR in discv5: https://github.com/sigp/discv5/pull/58 ## Additional Info tokio 1.0 changes that required some changes in lighthouse: - `interval.next().await.is_some()` -> `interval.tick().await` - `sleep` future is now `!Unpin` -> https://github.com/tokio-rs/tokio/issues/3028 - `try_recv` has been temporarily removed from `mpsc` -> https://github.com/tokio-rs/tokio/issues/3350 - stream features have moved to `tokio-stream` and `broadcast::Receiver::into_stream()` has been temporarily removed -> `https://github.com/tokio-rs/tokio/issues/2870 - I've copied over the `BroadcastStream` wrapper from this PR, but can update to use `tokio-stream` once it's merged https://github.com/tokio-rs/tokio/pull/3384 Co-authored-by: realbigsean <seananderson33@gmail.com>
2021-02-10 23:29:49 +00:00
self.state = HandlerState::ShuttingDown(Box::new(sleep_until(
TInstant::now() + Duration::from_secs(SHUTDOWN_TIMEOUT_SECS as u64),
Update to tokio 1.1 (#2172) ## Issue Addressed resolves #2129 resolves #2099 addresses some of #1712 unblocks #2076 unblocks #2153 ## Proposed Changes - Updates all the dependencies mentioned in #2129, except for web3. They haven't merged their tokio 1.0 update because they are waiting on some dependencies of their own. Since we only use web3 in tests, I think updating it in a separate issue is fine. If they are able to merge soon though, I can update in this PR. - Updates `tokio_util` to 0.6.2 and `bytes` to 1.0.1. - We haven't made a discv5 release since merging tokio 1.0 updates so I'm using a commit rather than release atm. **Edit:** I think we should merge an update of `tokio_util` to 0.6.2 into discv5 before this release because it has panic fixes in `DelayQueue` --> PR in discv5: https://github.com/sigp/discv5/pull/58 ## Additional Info tokio 1.0 changes that required some changes in lighthouse: - `interval.next().await.is_some()` -> `interval.tick().await` - `sleep` future is now `!Unpin` -> https://github.com/tokio-rs/tokio/issues/3028 - `try_recv` has been temporarily removed from `mpsc` -> https://github.com/tokio-rs/tokio/issues/3350 - stream features have moved to `tokio-stream` and `broadcast::Receiver::into_stream()` has been temporarily removed -> `https://github.com/tokio-rs/tokio/issues/2870 - I've copied over the `BroadcastStream` wrapper from this PR, but can update to use `tokio-stream` once it's merged https://github.com/tokio-rs/tokio/pull/3384 Co-authored-by: realbigsean <seananderson33@gmail.com>
2021-02-10 23:29:49 +00:00
)));
}
}
/// Opens an outbound substream with a request.
fn send_request(&mut self, id: RequestId, req: OutboundRequest<TSpec>) {
match self.state {
HandlerState::Active => {
self.dial_queue.push((id, req));
}
More metrics + RPC tweaks (#2041) ## Issue Addressed NA ## Proposed Changes This was mostly done to find the reason why LH was dropping peers from Nimbus. It proved to be useful so I think it's worth it. But there is also some functional stuff here - Add metrics for rpc errors per client, error type and direction - Add metrics for downscoring events per source type, client and penalty type - Add metrics for gossip validation results per client for non-accepted messages - Make the RPC handler return errors and requests/responses in the order we see them - Allow a small burst for the Ping rate limit, from 1 every 5 seconds to 2 every 10 seconds - Send rate limiting errors with a particular code and use that same code to identify them. I picked something different to 128 since that is most likely what other clients are using for their own errors - Remove some unused code in the `PeerAction` and the rpc handler - Remove the unused variant `RateLimited`. tTis was never produced directly, since the only way to get the request's protocol is via de handler. The handler upon receiving from LH a response with an error (rate limited in this case) emits this event with the missing info (It was always like this, just pointing out that we do downscore rate limiting errors regardless of the change) Metrics for Nimbus looked like this: Downscoring events: `increase(libp2p_peer_actions_per_client{client="Nimbus"}[5m])` ![image](https://user-images.githubusercontent.com/26765164/101210880-862bf280-3676-11eb-94c0-399f0bf5aa2e.png) RPC Errors: `increase(libp2p_rpc_errors_per_client{client="Nimbus"}[5m])` ![image](https://user-images.githubusercontent.com/26765164/101210997-ba071800-3676-11eb-847a-f32405ede002.png) Unaccepted gossip message: `increase(gossipsub_unaccepted_messages_per_client{client="Nimbus"}[5m])` ![image](https://user-images.githubusercontent.com/26765164/101211124-f470b500-3676-11eb-9459-132ecff058ec.png)
2020-12-08 03:55:50 +00:00
_ => self.events_out.push(Err(HandlerErr::Outbound {
error: RPCError::HandlerRejected,
proto: req.protocol(),
id,
})),
}
}
/// Sends a response to a peer's request.
// NOTE: If the substream has closed due to inactivity, or the substream is in the
// wrong state a response will fail silently.
fn send_response(&mut self, inbound_id: SubstreamId, response: RPCCodedResponse<TSpec>) {
// check if the stream matching the response still exists
let inbound_info = if let Some(info) = self.inbound_substreams.get_mut(&inbound_id) {
info
} else {
if !matches!(response, RPCCodedResponse::StreamTermination(..)) {
// the stream is closed after sending the expected number of responses
trace!(self.log, "Inbound stream has expired, response not sent";
"response" => %response, "id" => inbound_id);
}
return;
};
// If the response we are sending is an error, report back for handling
if let RPCCodedResponse::Error(ref code, ref reason) = response {
More metrics + RPC tweaks (#2041) ## Issue Addressed NA ## Proposed Changes This was mostly done to find the reason why LH was dropping peers from Nimbus. It proved to be useful so I think it's worth it. But there is also some functional stuff here - Add metrics for rpc errors per client, error type and direction - Add metrics for downscoring events per source type, client and penalty type - Add metrics for gossip validation results per client for non-accepted messages - Make the RPC handler return errors and requests/responses in the order we see them - Allow a small burst for the Ping rate limit, from 1 every 5 seconds to 2 every 10 seconds - Send rate limiting errors with a particular code and use that same code to identify them. I picked something different to 128 since that is most likely what other clients are using for their own errors - Remove some unused code in the `PeerAction` and the rpc handler - Remove the unused variant `RateLimited`. tTis was never produced directly, since the only way to get the request's protocol is via de handler. The handler upon receiving from LH a response with an error (rate limited in this case) emits this event with the missing info (It was always like this, just pointing out that we do downscore rate limiting errors regardless of the change) Metrics for Nimbus looked like this: Downscoring events: `increase(libp2p_peer_actions_per_client{client="Nimbus"}[5m])` ![image](https://user-images.githubusercontent.com/26765164/101210880-862bf280-3676-11eb-94c0-399f0bf5aa2e.png) RPC Errors: `increase(libp2p_rpc_errors_per_client{client="Nimbus"}[5m])` ![image](https://user-images.githubusercontent.com/26765164/101210997-ba071800-3676-11eb-847a-f32405ede002.png) Unaccepted gossip message: `increase(gossipsub_unaccepted_messages_per_client{client="Nimbus"}[5m])` ![image](https://user-images.githubusercontent.com/26765164/101211124-f470b500-3676-11eb-9459-132ecff058ec.png)
2020-12-08 03:55:50 +00:00
self.events_out.push(Err(HandlerErr::Inbound {
error: RPCError::ErrorResponse(*code, reason.to_string()),
More metrics + RPC tweaks (#2041) ## Issue Addressed NA ## Proposed Changes This was mostly done to find the reason why LH was dropping peers from Nimbus. It proved to be useful so I think it's worth it. But there is also some functional stuff here - Add metrics for rpc errors per client, error type and direction - Add metrics for downscoring events per source type, client and penalty type - Add metrics for gossip validation results per client for non-accepted messages - Make the RPC handler return errors and requests/responses in the order we see them - Allow a small burst for the Ping rate limit, from 1 every 5 seconds to 2 every 10 seconds - Send rate limiting errors with a particular code and use that same code to identify them. I picked something different to 128 since that is most likely what other clients are using for their own errors - Remove some unused code in the `PeerAction` and the rpc handler - Remove the unused variant `RateLimited`. tTis was never produced directly, since the only way to get the request's protocol is via de handler. The handler upon receiving from LH a response with an error (rate limited in this case) emits this event with the missing info (It was always like this, just pointing out that we do downscore rate limiting errors regardless of the change) Metrics for Nimbus looked like this: Downscoring events: `increase(libp2p_peer_actions_per_client{client="Nimbus"}[5m])` ![image](https://user-images.githubusercontent.com/26765164/101210880-862bf280-3676-11eb-94c0-399f0bf5aa2e.png) RPC Errors: `increase(libp2p_rpc_errors_per_client{client="Nimbus"}[5m])` ![image](https://user-images.githubusercontent.com/26765164/101210997-ba071800-3676-11eb-847a-f32405ede002.png) Unaccepted gossip message: `increase(gossipsub_unaccepted_messages_per_client{client="Nimbus"}[5m])` ![image](https://user-images.githubusercontent.com/26765164/101211124-f470b500-3676-11eb-9459-132ecff058ec.png)
2020-12-08 03:55:50 +00:00
proto: inbound_info.protocol,
id: inbound_id,
}));
}
if matches!(self.state, HandlerState::Deactivated) {
// we no longer send responses after the handler is deactivated
debug!(self.log, "Response not sent. Deactivated handler";
"response" => %response, "id" => inbound_id);
return;
}
Investigate and correct RPC Response Timeouts (#2804) RPC Responses are for some reason not removing their timeout when they are completing. As an example: ``` Nov 09 01:18:20.256 DEBG Received BlocksByRange Request step: 1, start_slot: 728465, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:20.263 DEBG Received BlocksByRange Request step: 1, start_slot: 728593, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:20.483 DEBG BlocksByRange Response sent returned: 63, requested: 64, current_slot: 2466389, start_slot: 728465, msg: Failed to return all requested blocks, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:20.500 DEBG BlocksByRange Response sent returned: 64, requested: 64, current_slot: 2466389, start_slot: 728593, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:21.068 DEBG Received BlocksByRange Request step: 1, start_slot: 728529, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:21.272 DEBG BlocksByRange Response sent returned: 63, requested: 64, current_slot: 2466389, start_slot: 728529, msg: Failed to return all requested blocks, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:23.434 DEBG Received BlocksByRange Request step: 1, start_slot: 728657, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:23.665 DEBG BlocksByRange Response sent returned: 64, requested: 64, current_slot: 2466390, start_slot: 728657, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:25.851 DEBG Received BlocksByRange Request step: 1, start_slot: 728337, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:25.851 DEBG Received BlocksByRange Request step: 1, start_slot: 728401, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:26.094 DEBG BlocksByRange Response sent returned: 62, requested: 64, current_slot: 2466390, start_slot: 728401, msg: Failed to return all requested blocks, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:26.100 DEBG BlocksByRange Response sent returned: 63, requested: 64, current_slot: 2466390, start_slot: 728337, msg: Failed to return all requested blocks, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:31.070 DEBG RPC Error direction: Incoming, score: 0, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, client: Prysm: version: a80b1c252a9b4773493b41999769bf3134ac373f, os_version: unknown, err: Stream Timeout, protocol: beacon_blocks_by_range, service: libp2p Nov 09 01:18:31.070 WARN Timed out to a peer's request. Likely insufficient resources, reduce peer count, service: libp2p Nov 09 01:18:31.085 DEBG RPC Error direction: Incoming, score: 0, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, client: Prysm: version: a80b1c252a9b4773493b41999769bf3134ac373f, os_version: unknown, err: Stream Timeout, protocol: beacon_blocks_by_range, service: libp2p Nov 09 01:18:31.085 WARN Timed out to a peer's request. Likely insufficient resources, reduce peer count, service: libp2p Nov 09 01:18:31.459 DEBG RPC Error direction: Incoming, score: 0, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, client: Prysm: version: a80b1c252a9b4773493b41999769bf3134ac373f, os_version: unknown, err: Stream Timeout, protocol: beacon_blocks_by_range, service: libp2p Nov 09 01:18:31.459 WARN Timed out to a peer's request. Likely insufficient resources, reduce peer count, service: libp2p Nov 09 01:18:34.129 DEBG RPC Error direction: Incoming, score: 0, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, client: Prysm: version: a80b1c252a9b4773493b41999769bf3134ac373f, os_version: unknown, err: Stream Timeout, protocol: beacon_blocks_by_range, service: libp2p Nov 09 01:18:34.130 WARN Timed out to a peer's request. Likely insufficient resources, reduce peer count, service: libp2p Nov 09 01:18:35.686 DEBG Peer Manager disconnecting peer reason: Too many peers, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, service: libp2p ``` This PR is to investigate and correct the issue. ~~My current thoughts are that for some reason we are not closing the streams correctly, or fast enough, or the executor is not registering the closes and waking up.~~ - Pretty sure this is not the case, see message below for a more accurate reason. ~~I've currently added a timeout to stream closures in an attempt to force streams to close and the future to always complete.~~ I removed this
2021-11-16 03:42:25 +00:00
inbound_info.pending_items.push_back(response);
}
}
Stable futures (#879) * Port eth1 lib to use stable futures * Port eth1_test_rig to stable futures * Port eth1 tests to stable futures * Port genesis service to stable futures * Port genesis tests to stable futures * Port beacon_chain to stable futures * Port lcli to stable futures * Fix eth1_test_rig (#1014) * Fix lcli * Port timer to stable futures * Fix timer * Port websocket_server to stable futures * Port notifier to stable futures * Add TODOS * Update hashmap hashset to stable futures * Adds panic test to hashset delay * Port remote_beacon_node to stable futures * Fix lcli merge conflicts * Non rpc stuff compiles * protocol.rs compiles * Port websockets, timer and notifier to stable futures (#1035) * Fix lcli * Port timer to stable futures * Fix timer * Port websocket_server to stable futures * Port notifier to stable futures * Add TODOS * Port remote_beacon_node to stable futures * Partial eth2-libp2p stable future upgrade * Finished first round of fighting RPC types * Further progress towards porting eth2-libp2p adds caching to discovery * Update behaviour * RPC handler to stable futures * Update RPC to master libp2p * Network service additions * Fix the fallback transport construction (#1102) * Correct warning * Remove hashmap delay * Compiling version of eth2-libp2p * Update all crates versions * Fix conversion function and add tests (#1113) * Port validator_client to stable futures (#1114) * Add PH & MS slot clock changes * Account for genesis time * Add progress on duties refactor * Add simple is_aggregator bool to val subscription * Start work on attestation_verification.rs * Add progress on ObservedAttestations * Progress with ObservedAttestations * Fix tests * Add observed attestations to the beacon chain * Add attestation observation to processing code * Add progress on attestation verification * Add first draft of ObservedAttesters * Add more tests * Add observed attesters to beacon chain * Add observers to attestation processing * Add more attestation verification * Create ObservedAggregators map * Remove commented-out code * Add observed aggregators into chain * Add progress * Finish adding features to attestation verification * Ensure beacon chain compiles * Link attn verification into chain * Integrate new attn verification in chain * Remove old attestation processing code * Start trying to fix beacon_chain tests * Split adding into pools into two functions * Add aggregation to harness * Get test harness working again * Adjust the number of aggregators for test harness * Fix edge-case in harness * Integrate new attn processing in network * Fix compile bug in validator_client * Update validator API endpoints * Fix aggreagation in test harness * Fix enum thing * Fix attestation observation bug: * Patch failing API tests * Start adding comments to attestation verification * Remove unused attestation field * Unify "is block known" logic * Update comments * Supress fork choice errors for network processing * Add todos * Tidy * Add gossip attn tests * Disallow test harness to produce old attns * Comment out in-progress tests * Partially address pruning tests * Fix failing store test * Add aggregate tests * Add comments about which spec conditions we check * Dont re-aggregate * Split apart test harness attn production * Fix compile error in network * Make progress on commented-out test * Fix skipping attestation test * Add fork choice verification tests * Tidy attn tests, remove dead code * Remove some accidentally added code * Fix clippy lint * Rename test file * Add block tests, add cheap block proposer check * Rename block testing file * Add observed_block_producers * Tidy * Switch around block signature verification * Finish block testing * Remove gossip from signature tests * First pass of self review * Fix deviation in spec * Update test spec tags * Start moving over to hashset * Finish moving observed attesters to hashmap * Move aggregation pool over to hashmap * Make fc attn borrow again * Fix rest_api compile error * Fix missing comments * Fix monster test * Uncomment increasing slots test * Address remaining comments * Remove unsafe, use cfg test * Remove cfg test flag * Fix dodgy comment * Revert "Update hashmap hashset to stable futures" This reverts commit d432378a3cc5cd67fc29c0b15b96b886c1323554. * Revert "Adds panic test to hashset delay" This reverts commit 281502396fc5b90d9c421a309c2c056982c9525b. * Ported attestation_service * Ported duties_service * Ported fork_service * More ports * Port block_service * Minor fixes * VC compiles * Update TODOS * Borrow self where possible * Ignore aggregates that are already known. * Unify aggregator modulo logic * Fix typo in logs * Refactor validator subscription logic * Avoid reproducing selection proof * Skip HTTP call if no subscriptions * Rename DutyAndState -> DutyAndProof * Tidy logs * Print root as dbg * Fix compile errors in tests * Fix compile error in test * Re-Fix attestation and duties service * Minor fixes Co-authored-by: Paul Hauner <paul@paulhauner.com> * Network crate update to stable futures * Port account_manager to stable futures (#1121) * Port account_manager to stable futures * Run async fns in tokio environment * Port rest_api crate to stable futures (#1118) * Port rest_api lib to stable futures * Reduce tokio features * Update notifier to stable futures * Builder update * Further updates * Convert self referential async functions * stable futures fixes (#1124) * Fix eth1 update functions * Fix genesis and client * Fix beacon node lib * Return appropriate runtimes from environment * Fix test rig * Refactor eth1 service update * Upgrade simulator to stable futures * Lighthouse compiles on stable futures * Remove println debugging statement * Update libp2p service, start rpc test upgrade * Update network crate for new libp2p * Update tokio::codec to futures_codec (#1128) * Further work towards RPC corrections * Correct http timeout and network service select * Use tokio runtime for libp2p * Revert "Update tokio::codec to futures_codec (#1128)" This reverts commit e57aea924acf5cbabdcea18895ac07e38a425ed7. * Upgrade RPC libp2p tests * Upgrade secio fallback test * Upgrade gossipsub examples * Clean up RPC protocol * Test fixes (#1133) * Correct websocket timeout and run on os thread * Fix network test * Clean up PR * Correct tokio tcp move attestation service tests * Upgrade attestation service tests * Correct network test * Correct genesis test * Test corrections * Log info when block is received * Modify logs and update attester service events * Stable futures: fixes to vc, eth1 and account manager (#1142) * Add local testnet scripts * Remove whiteblock script * Rename local testnet script * Move spawns onto handle * Fix VC panic * Initial fix to block production issue * Tidy block producer fix * Tidy further * Add local testnet clean script * Run cargo fmt * Tidy duties service * Tidy fork service * Tidy ForkService * Tidy AttestationService * Tidy notifier * Ensure await is not suppressed in eth1 * Ensure await is not suppressed in account_manager * Use .ok() instead of .unwrap_or(()) * RPC decoding test for proto * Update discv5 and eth2-libp2p deps * Fix lcli double runtime issue (#1144) * Handle stream termination and dialing peer errors * Correct peer_info variant types * Remove unnecessary warnings * Handle subnet unsubscription removal and improve logigng * Add logs around ping * Upgrade discv5 and improve logging * Handle peer connection status for multiple connections * Improve network service logging * Improve logging around peer manager * Upgrade swarm poll centralise peer management * Identify clients on error * Fix `remove_peer` in sync (#1150) * remove_peer removes from all chains * Remove logs * Fix early return from loop * Improved logging, fix panic * Partially correct tests * Stable futures: Vc sync (#1149) * Improve syncing heuristic * Add comments * Use safer method for tolerance * Fix tests * Stable futures: Fix VC bug, update agg pool, add more metrics (#1151) * Expose epoch processing summary * Expose participation metrics to prometheus * Switch to f64 * Reduce precision * Change precision * Expose observed attesters metrics * Add metrics for agg/unagg attn counts * Add metrics for gossip rx * Add metrics for gossip tx * Adds ignored attns to prom * Add attestation timing * Add timer for aggregation pool sig agg * Add write lock timer for agg pool * Add more metrics to agg pool * Change map lock code * Add extra metric to agg pool * Change lock handling in agg pool * Change .write() to .read() * Add another agg pool timer * Fix for is_aggregator * Fix pruning bug Co-authored-by: pawan <pawandhananjay@gmail.com> Co-authored-by: Paul Hauner <paul@paulhauner.com>
2020-05-17 11:16:48 +00:00
impl<TSpec> ProtocolsHandler for RPCHandler<TSpec>
2019-07-09 05:44:23 +00:00
where
Initial work towards v0.2.0 (#924) * Remove ping protocol * Initial renaming of network services * Correct rebasing relative to latest master * Start updating types * Adds HashMapDelay struct to utils * Initial network restructure * Network restructure. Adds new types for v0.2.0 * Removes build artefacts * Shift validation to beacon chain * Temporarily remove gossip validation This is to be updated to match current optimisation efforts. * Adds AggregateAndProof * Begin rebuilding pubsub encoding/decoding * Signature hacking * Shift gossipsup decoding into eth2_libp2p * Existing EF tests passing with fake_crypto * Shifts block encoding/decoding into RPC * Delete outdated API spec * All release tests passing bar genesis state parsing * Update and test YamlConfig * Update to spec v0.10 compatible BLS * Updates to BLS EF tests * Add EF test for AggregateVerify And delete unused hash2curve tests for uncompressed points * Update EF tests to v0.10.1 * Use optional block root correctly in block proc * Use genesis fork in deposit domain. All tests pass * Fast aggregate verify test * Update REST API docs * Fix unused import * Bump spec tags to v0.10.1 * Add `seconds_per_eth1_block` to chainspec * Update to timestamp based eth1 voting scheme * Return None from `get_votes_to_consider` if block cache is empty * Handle overflows in `is_candidate_block` * Revert to failing tests * Fix eth1 data sets test * Choose default vote according to spec * Fix collect_valid_votes tests * Fix `get_votes_to_consider` to choose all eligible blocks * Uncomment winning_vote tests * Add comments; remove unused code * Reduce seconds_per_eth1_block for simulation * Addressed review comments * Add test for default vote case * Fix logs * Remove unused functions * Meter default eth1 votes * Fix comments * Progress on attestation service * Address review comments; remove unused dependency * Initial work on removing libp2p lock * Add LRU caches to store (rollup) * Update attestation validation for DB changes (WIP) * Initial version of should_forward_block * Scaffold * Progress on attestation validation Also, consolidate prod+testing slot clocks so that they share much of the same implementation and can both handle sub-slot time changes. * Removes lock from libp2p service * Completed network lock removal * Finish(?) attestation processing * Correct network termination future * Add slot check to block check * Correct fmt issues * Remove Drop implementation for network service * Add first attempt at attestation proc. re-write * Add version 2 of attestation processing * Minor fixes * Add validator pubkey cache * Make get_indexed_attestation take a committee * Link signature processing into new attn verification * First working version * Ensure pubkey cache is updated * Add more metrics, slight optimizations * Clone committee cache during attestation processing * Update shuffling cache during block processing * Remove old commented-out code * Fix shuffling cache insert bug * Used indexed attestation in fork choice * Restructure attn processing, add metrics * Add more detailed metrics * Tidy, fix failing tests * Fix failing tests, tidy * Address reviewers suggestions * Disable/delete two outdated tests * Modification of validator for subscriptions * Add slot signing to validator client * Further progress on validation subscription * Adds necessary validator subscription functionality * Add new Pubkeys struct to signature_sets * Refactor with functional approach * Update beacon chain * Clean up validator <-> beacon node http types * Add aggregator status to ValidatorDuty * Impl Clone for manual slot clock * Fix minor errors * Further progress validator client subscription * Initial subscription and aggregation handling * Remove decompressed member from pubkey bytes * Progress to modifying val client for attestation aggregation * First draft of validator client upgrade for aggregate attestations * Add hashmap for indices lookup * Add state cache, remove store cache * Only build the head committee cache * Removes lock on a network channel * Partially implement beacon node subscription http api * Correct compilation issues * Change `get_attesting_indices` to use Vec * Fix failing test * Partial implementation of timer * Adds timer, removes exit_future, http api to op pool * Partial multiple aggregate attestation handling * Permits bulk messages accross gossipsub network channel * Correct compile issues * Improve gosispsub messaging and correct rest api helpers * Added global gossipsub subscriptions * Update validator subscriptions data structs * Tidy * Re-structure validator subscriptions * Initial handling of subscriptions * Re-structure network service * Add pubkey cache persistence file * Add more comments * Integrate persistence file into builder * Add pubkey cache tests * Add HashSetDelay and introduce into attestation service * Handles validator subscriptions * Add data_dir to beacon chain builder * Remove Option in pubkey cache persistence file * Ensure consistency between datadir/data_dir * Fix failing network test * Peer subnet discovery gets queued for future subscriptions * Reorganise attestation service functions * Initial wiring of attestation service * First draft of attestation service timing logic * Correct minor typos * Tidy * Fix todos * Improve tests * Add PeerInfo to connected peers mapping * Fix compile error * Fix compile error from merge * Split up block processing metrics * Tidy * Refactor get_pubkey_from_state * Remove commented-out code * Rename state_cache -> checkpoint_cache * Rename Checkpoint -> Snapshot * Tidy, add comments * Tidy up find_head function * Change some checkpoint -> snapshot * Add tests * Expose max_len * Remove dead code * Tidy * Fix bug * Add sync-speed metric * Add first attempt at VerifiableBlock * Start integrating into beacon chain * Integrate VerifiableBlock * Rename VerifableBlock -> PartialBlockVerification * Add start of typed methods * Add progress * Add further progress * Rename structs * Add full block verification to block_processing.rs * Further beacon chain integration * Update checks for gossip * Add todo * Start adding segement verification * Add passing chain segement test * Initial integration with batch sync * Minor changes * Tidy, add more error checking * Start adding chain_segment tests * Finish invalid signature tests * Include single and gossip verified blocks in tests * Add gossip verification tests * Start adding docs * Finish adding comments to block_processing.rs * Rename block_processing.rs -> block_verification * Start removing old block processing code * Fixes beacon_chain compilation * Fix project-wide compile errors * Remove old code * Correct code to pass all tests * Fix bug with beacon proposer index * Fix shim for BlockProcessingError * Only process one epoch at a time * Fix loop in chain segment processing * Correct tests from master merge * Add caching for state.eth1_data_votes * Add BeaconChain::validator_pubkey * Revert "Add caching for state.eth1_data_votes" This reverts commit cd73dcd6434fb8d8e6bf30c5356355598ea7b78e. Co-authored-by: Grant Wuerker <gwuerker@gmail.com> Co-authored-by: Michael Sproul <michael@sigmaprime.io> Co-authored-by: Michael Sproul <micsproul@gmail.com> Co-authored-by: pawan <pawandhananjay@gmail.com> Co-authored-by: Paul Hauner <paul@paulhauner.com>
2020-03-17 06:24:44 +00:00
TSpec: EthSpec,
{
type InEvent = RPCSend<TSpec>;
More metrics + RPC tweaks (#2041) ## Issue Addressed NA ## Proposed Changes This was mostly done to find the reason why LH was dropping peers from Nimbus. It proved to be useful so I think it's worth it. But there is also some functional stuff here - Add metrics for rpc errors per client, error type and direction - Add metrics for downscoring events per source type, client and penalty type - Add metrics for gossip validation results per client for non-accepted messages - Make the RPC handler return errors and requests/responses in the order we see them - Allow a small burst for the Ping rate limit, from 1 every 5 seconds to 2 every 10 seconds - Send rate limiting errors with a particular code and use that same code to identify them. I picked something different to 128 since that is most likely what other clients are using for their own errors - Remove some unused code in the `PeerAction` and the rpc handler - Remove the unused variant `RateLimited`. tTis was never produced directly, since the only way to get the request's protocol is via de handler. The handler upon receiving from LH a response with an error (rate limited in this case) emits this event with the missing info (It was always like this, just pointing out that we do downscore rate limiting errors regardless of the change) Metrics for Nimbus looked like this: Downscoring events: `increase(libp2p_peer_actions_per_client{client="Nimbus"}[5m])` ![image](https://user-images.githubusercontent.com/26765164/101210880-862bf280-3676-11eb-94c0-399f0bf5aa2e.png) RPC Errors: `increase(libp2p_rpc_errors_per_client{client="Nimbus"}[5m])` ![image](https://user-images.githubusercontent.com/26765164/101210997-ba071800-3676-11eb-847a-f32405ede002.png) Unaccepted gossip message: `increase(gossipsub_unaccepted_messages_per_client{client="Nimbus"}[5m])` ![image](https://user-images.githubusercontent.com/26765164/101211124-f470b500-3676-11eb-9459-132ecff058ec.png)
2020-12-08 03:55:50 +00:00
type OutEvent = HandlerEvent<TSpec>;
Stable futures (#879) * Port eth1 lib to use stable futures * Port eth1_test_rig to stable futures * Port eth1 tests to stable futures * Port genesis service to stable futures * Port genesis tests to stable futures * Port beacon_chain to stable futures * Port lcli to stable futures * Fix eth1_test_rig (#1014) * Fix lcli * Port timer to stable futures * Fix timer * Port websocket_server to stable futures * Port notifier to stable futures * Add TODOS * Update hashmap hashset to stable futures * Adds panic test to hashset delay * Port remote_beacon_node to stable futures * Fix lcli merge conflicts * Non rpc stuff compiles * protocol.rs compiles * Port websockets, timer and notifier to stable futures (#1035) * Fix lcli * Port timer to stable futures * Fix timer * Port websocket_server to stable futures * Port notifier to stable futures * Add TODOS * Port remote_beacon_node to stable futures * Partial eth2-libp2p stable future upgrade * Finished first round of fighting RPC types * Further progress towards porting eth2-libp2p adds caching to discovery * Update behaviour * RPC handler to stable futures * Update RPC to master libp2p * Network service additions * Fix the fallback transport construction (#1102) * Correct warning * Remove hashmap delay * Compiling version of eth2-libp2p * Update all crates versions * Fix conversion function and add tests (#1113) * Port validator_client to stable futures (#1114) * Add PH & MS slot clock changes * Account for genesis time * Add progress on duties refactor * Add simple is_aggregator bool to val subscription * Start work on attestation_verification.rs * Add progress on ObservedAttestations * Progress with ObservedAttestations * Fix tests * Add observed attestations to the beacon chain * Add attestation observation to processing code * Add progress on attestation verification * Add first draft of ObservedAttesters * Add more tests * Add observed attesters to beacon chain * Add observers to attestation processing * Add more attestation verification * Create ObservedAggregators map * Remove commented-out code * Add observed aggregators into chain * Add progress * Finish adding features to attestation verification * Ensure beacon chain compiles * Link attn verification into chain * Integrate new attn verification in chain * Remove old attestation processing code * Start trying to fix beacon_chain tests * Split adding into pools into two functions * Add aggregation to harness * Get test harness working again * Adjust the number of aggregators for test harness * Fix edge-case in harness * Integrate new attn processing in network * Fix compile bug in validator_client * Update validator API endpoints * Fix aggreagation in test harness * Fix enum thing * Fix attestation observation bug: * Patch failing API tests * Start adding comments to attestation verification * Remove unused attestation field * Unify "is block known" logic * Update comments * Supress fork choice errors for network processing * Add todos * Tidy * Add gossip attn tests * Disallow test harness to produce old attns * Comment out in-progress tests * Partially address pruning tests * Fix failing store test * Add aggregate tests * Add comments about which spec conditions we check * Dont re-aggregate * Split apart test harness attn production * Fix compile error in network * Make progress on commented-out test * Fix skipping attestation test * Add fork choice verification tests * Tidy attn tests, remove dead code * Remove some accidentally added code * Fix clippy lint * Rename test file * Add block tests, add cheap block proposer check * Rename block testing file * Add observed_block_producers * Tidy * Switch around block signature verification * Finish block testing * Remove gossip from signature tests * First pass of self review * Fix deviation in spec * Update test spec tags * Start moving over to hashset * Finish moving observed attesters to hashmap * Move aggregation pool over to hashmap * Make fc attn borrow again * Fix rest_api compile error * Fix missing comments * Fix monster test * Uncomment increasing slots test * Address remaining comments * Remove unsafe, use cfg test * Remove cfg test flag * Fix dodgy comment * Revert "Update hashmap hashset to stable futures" This reverts commit d432378a3cc5cd67fc29c0b15b96b886c1323554. * Revert "Adds panic test to hashset delay" This reverts commit 281502396fc5b90d9c421a309c2c056982c9525b. * Ported attestation_service * Ported duties_service * Ported fork_service * More ports * Port block_service * Minor fixes * VC compiles * Update TODOS * Borrow self where possible * Ignore aggregates that are already known. * Unify aggregator modulo logic * Fix typo in logs * Refactor validator subscription logic * Avoid reproducing selection proof * Skip HTTP call if no subscriptions * Rename DutyAndState -> DutyAndProof * Tidy logs * Print root as dbg * Fix compile errors in tests * Fix compile error in test * Re-Fix attestation and duties service * Minor fixes Co-authored-by: Paul Hauner <paul@paulhauner.com> * Network crate update to stable futures * Port account_manager to stable futures (#1121) * Port account_manager to stable futures * Run async fns in tokio environment * Port rest_api crate to stable futures (#1118) * Port rest_api lib to stable futures * Reduce tokio features * Update notifier to stable futures * Builder update * Further updates * Convert self referential async functions * stable futures fixes (#1124) * Fix eth1 update functions * Fix genesis and client * Fix beacon node lib * Return appropriate runtimes from environment * Fix test rig * Refactor eth1 service update * Upgrade simulator to stable futures * Lighthouse compiles on stable futures * Remove println debugging statement * Update libp2p service, start rpc test upgrade * Update network crate for new libp2p * Update tokio::codec to futures_codec (#1128) * Further work towards RPC corrections * Correct http timeout and network service select * Use tokio runtime for libp2p * Revert "Update tokio::codec to futures_codec (#1128)" This reverts commit e57aea924acf5cbabdcea18895ac07e38a425ed7. * Upgrade RPC libp2p tests * Upgrade secio fallback test * Upgrade gossipsub examples * Clean up RPC protocol * Test fixes (#1133) * Correct websocket timeout and run on os thread * Fix network test * Clean up PR * Correct tokio tcp move attestation service tests * Upgrade attestation service tests * Correct network test * Correct genesis test * Test corrections * Log info when block is received * Modify logs and update attester service events * Stable futures: fixes to vc, eth1 and account manager (#1142) * Add local testnet scripts * Remove whiteblock script * Rename local testnet script * Move spawns onto handle * Fix VC panic * Initial fix to block production issue * Tidy block producer fix * Tidy further * Add local testnet clean script * Run cargo fmt * Tidy duties service * Tidy fork service * Tidy ForkService * Tidy AttestationService * Tidy notifier * Ensure await is not suppressed in eth1 * Ensure await is not suppressed in account_manager * Use .ok() instead of .unwrap_or(()) * RPC decoding test for proto * Update discv5 and eth2-libp2p deps * Fix lcli double runtime issue (#1144) * Handle stream termination and dialing peer errors * Correct peer_info variant types * Remove unnecessary warnings * Handle subnet unsubscription removal and improve logigng * Add logs around ping * Upgrade discv5 and improve logging * Handle peer connection status for multiple connections * Improve network service logging * Improve logging around peer manager * Upgrade swarm poll centralise peer management * Identify clients on error * Fix `remove_peer` in sync (#1150) * remove_peer removes from all chains * Remove logs * Fix early return from loop * Improved logging, fix panic * Partially correct tests * Stable futures: Vc sync (#1149) * Improve syncing heuristic * Add comments * Use safer method for tolerance * Fix tests * Stable futures: Fix VC bug, update agg pool, add more metrics (#1151) * Expose epoch processing summary * Expose participation metrics to prometheus * Switch to f64 * Reduce precision * Change precision * Expose observed attesters metrics * Add metrics for agg/unagg attn counts * Add metrics for gossip rx * Add metrics for gossip tx * Adds ignored attns to prom * Add attestation timing * Add timer for aggregation pool sig agg * Add write lock timer for agg pool * Add more metrics to agg pool * Change map lock code * Add extra metric to agg pool * Change lock handling in agg pool * Change .write() to .read() * Add another agg pool timer * Fix for is_aggregator * Fix pruning bug Co-authored-by: pawan <pawandhananjay@gmail.com> Co-authored-by: Paul Hauner <paul@paulhauner.com>
2020-05-17 11:16:48 +00:00
type Error = RPCError;
Initial work towards v0.2.0 (#924) * Remove ping protocol * Initial renaming of network services * Correct rebasing relative to latest master * Start updating types * Adds HashMapDelay struct to utils * Initial network restructure * Network restructure. Adds new types for v0.2.0 * Removes build artefacts * Shift validation to beacon chain * Temporarily remove gossip validation This is to be updated to match current optimisation efforts. * Adds AggregateAndProof * Begin rebuilding pubsub encoding/decoding * Signature hacking * Shift gossipsup decoding into eth2_libp2p * Existing EF tests passing with fake_crypto * Shifts block encoding/decoding into RPC * Delete outdated API spec * All release tests passing bar genesis state parsing * Update and test YamlConfig * Update to spec v0.10 compatible BLS * Updates to BLS EF tests * Add EF test for AggregateVerify And delete unused hash2curve tests for uncompressed points * Update EF tests to v0.10.1 * Use optional block root correctly in block proc * Use genesis fork in deposit domain. All tests pass * Fast aggregate verify test * Update REST API docs * Fix unused import * Bump spec tags to v0.10.1 * Add `seconds_per_eth1_block` to chainspec * Update to timestamp based eth1 voting scheme * Return None from `get_votes_to_consider` if block cache is empty * Handle overflows in `is_candidate_block` * Revert to failing tests * Fix eth1 data sets test * Choose default vote according to spec * Fix collect_valid_votes tests * Fix `get_votes_to_consider` to choose all eligible blocks * Uncomment winning_vote tests * Add comments; remove unused code * Reduce seconds_per_eth1_block for simulation * Addressed review comments * Add test for default vote case * Fix logs * Remove unused functions * Meter default eth1 votes * Fix comments * Progress on attestation service * Address review comments; remove unused dependency * Initial work on removing libp2p lock * Add LRU caches to store (rollup) * Update attestation validation for DB changes (WIP) * Initial version of should_forward_block * Scaffold * Progress on attestation validation Also, consolidate prod+testing slot clocks so that they share much of the same implementation and can both handle sub-slot time changes. * Removes lock from libp2p service * Completed network lock removal * Finish(?) attestation processing * Correct network termination future * Add slot check to block check * Correct fmt issues * Remove Drop implementation for network service * Add first attempt at attestation proc. re-write * Add version 2 of attestation processing * Minor fixes * Add validator pubkey cache * Make get_indexed_attestation take a committee * Link signature processing into new attn verification * First working version * Ensure pubkey cache is updated * Add more metrics, slight optimizations * Clone committee cache during attestation processing * Update shuffling cache during block processing * Remove old commented-out code * Fix shuffling cache insert bug * Used indexed attestation in fork choice * Restructure attn processing, add metrics * Add more detailed metrics * Tidy, fix failing tests * Fix failing tests, tidy * Address reviewers suggestions * Disable/delete two outdated tests * Modification of validator for subscriptions * Add slot signing to validator client * Further progress on validation subscription * Adds necessary validator subscription functionality * Add new Pubkeys struct to signature_sets * Refactor with functional approach * Update beacon chain * Clean up validator <-> beacon node http types * Add aggregator status to ValidatorDuty * Impl Clone for manual slot clock * Fix minor errors * Further progress validator client subscription * Initial subscription and aggregation handling * Remove decompressed member from pubkey bytes * Progress to modifying val client for attestation aggregation * First draft of validator client upgrade for aggregate attestations * Add hashmap for indices lookup * Add state cache, remove store cache * Only build the head committee cache * Removes lock on a network channel * Partially implement beacon node subscription http api * Correct compilation issues * Change `get_attesting_indices` to use Vec * Fix failing test * Partial implementation of timer * Adds timer, removes exit_future, http api to op pool * Partial multiple aggregate attestation handling * Permits bulk messages accross gossipsub network channel * Correct compile issues * Improve gosispsub messaging and correct rest api helpers * Added global gossipsub subscriptions * Update validator subscriptions data structs * Tidy * Re-structure validator subscriptions * Initial handling of subscriptions * Re-structure network service * Add pubkey cache persistence file * Add more comments * Integrate persistence file into builder * Add pubkey cache tests * Add HashSetDelay and introduce into attestation service * Handles validator subscriptions * Add data_dir to beacon chain builder * Remove Option in pubkey cache persistence file * Ensure consistency between datadir/data_dir * Fix failing network test * Peer subnet discovery gets queued for future subscriptions * Reorganise attestation service functions * Initial wiring of attestation service * First draft of attestation service timing logic * Correct minor typos * Tidy * Fix todos * Improve tests * Add PeerInfo to connected peers mapping * Fix compile error * Fix compile error from merge * Split up block processing metrics * Tidy * Refactor get_pubkey_from_state * Remove commented-out code * Rename state_cache -> checkpoint_cache * Rename Checkpoint -> Snapshot * Tidy, add comments * Tidy up find_head function * Change some checkpoint -> snapshot * Add tests * Expose max_len * Remove dead code * Tidy * Fix bug * Add sync-speed metric * Add first attempt at VerifiableBlock * Start integrating into beacon chain * Integrate VerifiableBlock * Rename VerifableBlock -> PartialBlockVerification * Add start of typed methods * Add progress * Add further progress * Rename structs * Add full block verification to block_processing.rs * Further beacon chain integration * Update checks for gossip * Add todo * Start adding segement verification * Add passing chain segement test * Initial integration with batch sync * Minor changes * Tidy, add more error checking * Start adding chain_segment tests * Finish invalid signature tests * Include single and gossip verified blocks in tests * Add gossip verification tests * Start adding docs * Finish adding comments to block_processing.rs * Rename block_processing.rs -> block_verification * Start removing old block processing code * Fixes beacon_chain compilation * Fix project-wide compile errors * Remove old code * Correct code to pass all tests * Fix bug with beacon proposer index * Fix shim for BlockProcessingError * Only process one epoch at a time * Fix loop in chain segment processing * Correct tests from master merge * Add caching for state.eth1_data_votes * Add BeaconChain::validator_pubkey * Revert "Add caching for state.eth1_data_votes" This reverts commit cd73dcd6434fb8d8e6bf30c5356355598ea7b78e. Co-authored-by: Grant Wuerker <gwuerker@gmail.com> Co-authored-by: Michael Sproul <michael@sigmaprime.io> Co-authored-by: Michael Sproul <micsproul@gmail.com> Co-authored-by: pawan <pawandhananjay@gmail.com> Co-authored-by: Paul Hauner <paul@paulhauner.com>
2020-03-17 06:24:44 +00:00
type InboundProtocol = RPCProtocol<TSpec>;
type OutboundProtocol = OutboundRequestContainer<TSpec>;
type OutboundOpenInfo = (RequestId, OutboundRequest<TSpec>); // Keep track of the id and the request
type InboundOpenInfo = ();
fn listen_protocol(&self) -> SubstreamProtocol<Self::InboundProtocol, ()> {
self.listen_protocol.clone()
}
fn inject_fully_negotiated_inbound(
&mut self,
Stable futures (#879) * Port eth1 lib to use stable futures * Port eth1_test_rig to stable futures * Port eth1 tests to stable futures * Port genesis service to stable futures * Port genesis tests to stable futures * Port beacon_chain to stable futures * Port lcli to stable futures * Fix eth1_test_rig (#1014) * Fix lcli * Port timer to stable futures * Fix timer * Port websocket_server to stable futures * Port notifier to stable futures * Add TODOS * Update hashmap hashset to stable futures * Adds panic test to hashset delay * Port remote_beacon_node to stable futures * Fix lcli merge conflicts * Non rpc stuff compiles * protocol.rs compiles * Port websockets, timer and notifier to stable futures (#1035) * Fix lcli * Port timer to stable futures * Fix timer * Port websocket_server to stable futures * Port notifier to stable futures * Add TODOS * Port remote_beacon_node to stable futures * Partial eth2-libp2p stable future upgrade * Finished first round of fighting RPC types * Further progress towards porting eth2-libp2p adds caching to discovery * Update behaviour * RPC handler to stable futures * Update RPC to master libp2p * Network service additions * Fix the fallback transport construction (#1102) * Correct warning * Remove hashmap delay * Compiling version of eth2-libp2p * Update all crates versions * Fix conversion function and add tests (#1113) * Port validator_client to stable futures (#1114) * Add PH & MS slot clock changes * Account for genesis time * Add progress on duties refactor * Add simple is_aggregator bool to val subscription * Start work on attestation_verification.rs * Add progress on ObservedAttestations * Progress with ObservedAttestations * Fix tests * Add observed attestations to the beacon chain * Add attestation observation to processing code * Add progress on attestation verification * Add first draft of ObservedAttesters * Add more tests * Add observed attesters to beacon chain * Add observers to attestation processing * Add more attestation verification * Create ObservedAggregators map * Remove commented-out code * Add observed aggregators into chain * Add progress * Finish adding features to attestation verification * Ensure beacon chain compiles * Link attn verification into chain * Integrate new attn verification in chain * Remove old attestation processing code * Start trying to fix beacon_chain tests * Split adding into pools into two functions * Add aggregation to harness * Get test harness working again * Adjust the number of aggregators for test harness * Fix edge-case in harness * Integrate new attn processing in network * Fix compile bug in validator_client * Update validator API endpoints * Fix aggreagation in test harness * Fix enum thing * Fix attestation observation bug: * Patch failing API tests * Start adding comments to attestation verification * Remove unused attestation field * Unify "is block known" logic * Update comments * Supress fork choice errors for network processing * Add todos * Tidy * Add gossip attn tests * Disallow test harness to produce old attns * Comment out in-progress tests * Partially address pruning tests * Fix failing store test * Add aggregate tests * Add comments about which spec conditions we check * Dont re-aggregate * Split apart test harness attn production * Fix compile error in network * Make progress on commented-out test * Fix skipping attestation test * Add fork choice verification tests * Tidy attn tests, remove dead code * Remove some accidentally added code * Fix clippy lint * Rename test file * Add block tests, add cheap block proposer check * Rename block testing file * Add observed_block_producers * Tidy * Switch around block signature verification * Finish block testing * Remove gossip from signature tests * First pass of self review * Fix deviation in spec * Update test spec tags * Start moving over to hashset * Finish moving observed attesters to hashmap * Move aggregation pool over to hashmap * Make fc attn borrow again * Fix rest_api compile error * Fix missing comments * Fix monster test * Uncomment increasing slots test * Address remaining comments * Remove unsafe, use cfg test * Remove cfg test flag * Fix dodgy comment * Revert "Update hashmap hashset to stable futures" This reverts commit d432378a3cc5cd67fc29c0b15b96b886c1323554. * Revert "Adds panic test to hashset delay" This reverts commit 281502396fc5b90d9c421a309c2c056982c9525b. * Ported attestation_service * Ported duties_service * Ported fork_service * More ports * Port block_service * Minor fixes * VC compiles * Update TODOS * Borrow self where possible * Ignore aggregates that are already known. * Unify aggregator modulo logic * Fix typo in logs * Refactor validator subscription logic * Avoid reproducing selection proof * Skip HTTP call if no subscriptions * Rename DutyAndState -> DutyAndProof * Tidy logs * Print root as dbg * Fix compile errors in tests * Fix compile error in test * Re-Fix attestation and duties service * Minor fixes Co-authored-by: Paul Hauner <paul@paulhauner.com> * Network crate update to stable futures * Port account_manager to stable futures (#1121) * Port account_manager to stable futures * Run async fns in tokio environment * Port rest_api crate to stable futures (#1118) * Port rest_api lib to stable futures * Reduce tokio features * Update notifier to stable futures * Builder update * Further updates * Convert self referential async functions * stable futures fixes (#1124) * Fix eth1 update functions * Fix genesis and client * Fix beacon node lib * Return appropriate runtimes from environment * Fix test rig * Refactor eth1 service update * Upgrade simulator to stable futures * Lighthouse compiles on stable futures * Remove println debugging statement * Update libp2p service, start rpc test upgrade * Update network crate for new libp2p * Update tokio::codec to futures_codec (#1128) * Further work towards RPC corrections * Correct http timeout and network service select * Use tokio runtime for libp2p * Revert "Update tokio::codec to futures_codec (#1128)" This reverts commit e57aea924acf5cbabdcea18895ac07e38a425ed7. * Upgrade RPC libp2p tests * Upgrade secio fallback test * Upgrade gossipsub examples * Clean up RPC protocol * Test fixes (#1133) * Correct websocket timeout and run on os thread * Fix network test * Clean up PR * Correct tokio tcp move attestation service tests * Upgrade attestation service tests * Correct network test * Correct genesis test * Test corrections * Log info when block is received * Modify logs and update attester service events * Stable futures: fixes to vc, eth1 and account manager (#1142) * Add local testnet scripts * Remove whiteblock script * Rename local testnet script * Move spawns onto handle * Fix VC panic * Initial fix to block production issue * Tidy block producer fix * Tidy further * Add local testnet clean script * Run cargo fmt * Tidy duties service * Tidy fork service * Tidy ForkService * Tidy AttestationService * Tidy notifier * Ensure await is not suppressed in eth1 * Ensure await is not suppressed in account_manager * Use .ok() instead of .unwrap_or(()) * RPC decoding test for proto * Update discv5 and eth2-libp2p deps * Fix lcli double runtime issue (#1144) * Handle stream termination and dialing peer errors * Correct peer_info variant types * Remove unnecessary warnings * Handle subnet unsubscription removal and improve logigng * Add logs around ping * Upgrade discv5 and improve logging * Handle peer connection status for multiple connections * Improve network service logging * Improve logging around peer manager * Upgrade swarm poll centralise peer management * Identify clients on error * Fix `remove_peer` in sync (#1150) * remove_peer removes from all chains * Remove logs * Fix early return from loop * Improved logging, fix panic * Partially correct tests * Stable futures: Vc sync (#1149) * Improve syncing heuristic * Add comments * Use safer method for tolerance * Fix tests * Stable futures: Fix VC bug, update agg pool, add more metrics (#1151) * Expose epoch processing summary * Expose participation metrics to prometheus * Switch to f64 * Reduce precision * Change precision * Expose observed attesters metrics * Add metrics for agg/unagg attn counts * Add metrics for gossip rx * Add metrics for gossip tx * Adds ignored attns to prom * Add attestation timing * Add timer for aggregation pool sig agg * Add write lock timer for agg pool * Add more metrics to agg pool * Change map lock code * Add extra metric to agg pool * Change lock handling in agg pool * Change .write() to .read() * Add another agg pool timer * Fix for is_aggregator * Fix pruning bug Co-authored-by: pawan <pawandhananjay@gmail.com> Co-authored-by: Paul Hauner <paul@paulhauner.com>
2020-05-17 11:16:48 +00:00
substream: <Self::InboundProtocol as InboundUpgrade<NegotiatedSubstream>>::Output,
_info: Self::InboundOpenInfo,
) {
// only accept new peer requests when active
if !matches!(self.state, HandlerState::Active) {
2019-07-09 05:44:23 +00:00
return;
}
let (req, substream) = substream;
let expected_responses = req.expected_responses();
// store requests that expect responses
if expected_responses > 0 {
// Store the stream and tag the output.
let delay_key = self.inbound_substreams_delay.insert(
self.current_inbound_substream_id,
Duration::from_secs(RESPONSE_TIMEOUT),
);
let awaiting_stream = InboundState::Idle(substream);
self.inbound_substreams.insert(
self.current_inbound_substream_id,
InboundInfo {
state: awaiting_stream,
Investigate and correct RPC Response Timeouts (#2804) RPC Responses are for some reason not removing their timeout when they are completing. As an example: ``` Nov 09 01:18:20.256 DEBG Received BlocksByRange Request step: 1, start_slot: 728465, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:20.263 DEBG Received BlocksByRange Request step: 1, start_slot: 728593, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:20.483 DEBG BlocksByRange Response sent returned: 63, requested: 64, current_slot: 2466389, start_slot: 728465, msg: Failed to return all requested blocks, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:20.500 DEBG BlocksByRange Response sent returned: 64, requested: 64, current_slot: 2466389, start_slot: 728593, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:21.068 DEBG Received BlocksByRange Request step: 1, start_slot: 728529, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:21.272 DEBG BlocksByRange Response sent returned: 63, requested: 64, current_slot: 2466389, start_slot: 728529, msg: Failed to return all requested blocks, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:23.434 DEBG Received BlocksByRange Request step: 1, start_slot: 728657, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:23.665 DEBG BlocksByRange Response sent returned: 64, requested: 64, current_slot: 2466390, start_slot: 728657, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:25.851 DEBG Received BlocksByRange Request step: 1, start_slot: 728337, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:25.851 DEBG Received BlocksByRange Request step: 1, start_slot: 728401, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:26.094 DEBG BlocksByRange Response sent returned: 62, requested: 64, current_slot: 2466390, start_slot: 728401, msg: Failed to return all requested blocks, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:26.100 DEBG BlocksByRange Response sent returned: 63, requested: 64, current_slot: 2466390, start_slot: 728337, msg: Failed to return all requested blocks, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:31.070 DEBG RPC Error direction: Incoming, score: 0, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, client: Prysm: version: a80b1c252a9b4773493b41999769bf3134ac373f, os_version: unknown, err: Stream Timeout, protocol: beacon_blocks_by_range, service: libp2p Nov 09 01:18:31.070 WARN Timed out to a peer's request. Likely insufficient resources, reduce peer count, service: libp2p Nov 09 01:18:31.085 DEBG RPC Error direction: Incoming, score: 0, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, client: Prysm: version: a80b1c252a9b4773493b41999769bf3134ac373f, os_version: unknown, err: Stream Timeout, protocol: beacon_blocks_by_range, service: libp2p Nov 09 01:18:31.085 WARN Timed out to a peer's request. Likely insufficient resources, reduce peer count, service: libp2p Nov 09 01:18:31.459 DEBG RPC Error direction: Incoming, score: 0, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, client: Prysm: version: a80b1c252a9b4773493b41999769bf3134ac373f, os_version: unknown, err: Stream Timeout, protocol: beacon_blocks_by_range, service: libp2p Nov 09 01:18:31.459 WARN Timed out to a peer's request. Likely insufficient resources, reduce peer count, service: libp2p Nov 09 01:18:34.129 DEBG RPC Error direction: Incoming, score: 0, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, client: Prysm: version: a80b1c252a9b4773493b41999769bf3134ac373f, os_version: unknown, err: Stream Timeout, protocol: beacon_blocks_by_range, service: libp2p Nov 09 01:18:34.130 WARN Timed out to a peer's request. Likely insufficient resources, reduce peer count, service: libp2p Nov 09 01:18:35.686 DEBG Peer Manager disconnecting peer reason: Too many peers, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, service: libp2p ``` This PR is to investigate and correct the issue. ~~My current thoughts are that for some reason we are not closing the streams correctly, or fast enough, or the executor is not registering the closes and waking up.~~ - Pretty sure this is not the case, see message below for a more accurate reason. ~~I've currently added a timeout to stream closures in an attempt to force streams to close and the future to always complete.~~ I removed this
2021-11-16 03:42:25 +00:00
pending_items: VecDeque::with_capacity(expected_responses as usize),
delay_key: Some(delay_key),
protocol: req.protocol(),
Investigate and correct RPC Response Timeouts (#2804) RPC Responses are for some reason not removing their timeout when they are completing. As an example: ``` Nov 09 01:18:20.256 DEBG Received BlocksByRange Request step: 1, start_slot: 728465, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:20.263 DEBG Received BlocksByRange Request step: 1, start_slot: 728593, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:20.483 DEBG BlocksByRange Response sent returned: 63, requested: 64, current_slot: 2466389, start_slot: 728465, msg: Failed to return all requested blocks, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:20.500 DEBG BlocksByRange Response sent returned: 64, requested: 64, current_slot: 2466389, start_slot: 728593, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:21.068 DEBG Received BlocksByRange Request step: 1, start_slot: 728529, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:21.272 DEBG BlocksByRange Response sent returned: 63, requested: 64, current_slot: 2466389, start_slot: 728529, msg: Failed to return all requested blocks, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:23.434 DEBG Received BlocksByRange Request step: 1, start_slot: 728657, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:23.665 DEBG BlocksByRange Response sent returned: 64, requested: 64, current_slot: 2466390, start_slot: 728657, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:25.851 DEBG Received BlocksByRange Request step: 1, start_slot: 728337, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:25.851 DEBG Received BlocksByRange Request step: 1, start_slot: 728401, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:26.094 DEBG BlocksByRange Response sent returned: 62, requested: 64, current_slot: 2466390, start_slot: 728401, msg: Failed to return all requested blocks, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:26.100 DEBG BlocksByRange Response sent returned: 63, requested: 64, current_slot: 2466390, start_slot: 728337, msg: Failed to return all requested blocks, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:31.070 DEBG RPC Error direction: Incoming, score: 0, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, client: Prysm: version: a80b1c252a9b4773493b41999769bf3134ac373f, os_version: unknown, err: Stream Timeout, protocol: beacon_blocks_by_range, service: libp2p Nov 09 01:18:31.070 WARN Timed out to a peer's request. Likely insufficient resources, reduce peer count, service: libp2p Nov 09 01:18:31.085 DEBG RPC Error direction: Incoming, score: 0, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, client: Prysm: version: a80b1c252a9b4773493b41999769bf3134ac373f, os_version: unknown, err: Stream Timeout, protocol: beacon_blocks_by_range, service: libp2p Nov 09 01:18:31.085 WARN Timed out to a peer's request. Likely insufficient resources, reduce peer count, service: libp2p Nov 09 01:18:31.459 DEBG RPC Error direction: Incoming, score: 0, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, client: Prysm: version: a80b1c252a9b4773493b41999769bf3134ac373f, os_version: unknown, err: Stream Timeout, protocol: beacon_blocks_by_range, service: libp2p Nov 09 01:18:31.459 WARN Timed out to a peer's request. Likely insufficient resources, reduce peer count, service: libp2p Nov 09 01:18:34.129 DEBG RPC Error direction: Incoming, score: 0, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, client: Prysm: version: a80b1c252a9b4773493b41999769bf3134ac373f, os_version: unknown, err: Stream Timeout, protocol: beacon_blocks_by_range, service: libp2p Nov 09 01:18:34.130 WARN Timed out to a peer's request. Likely insufficient resources, reduce peer count, service: libp2p Nov 09 01:18:35.686 DEBG Peer Manager disconnecting peer reason: Too many peers, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, service: libp2p ``` This PR is to investigate and correct the issue. ~~My current thoughts are that for some reason we are not closing the streams correctly, or fast enough, or the executor is not registering the closes and waking up.~~ - Pretty sure this is not the case, see message below for a more accurate reason. ~~I've currently added a timeout to stream closures in an attempt to force streams to close and the future to always complete.~~ I removed this
2021-11-16 03:42:25 +00:00
request_start_time: Instant::now(),
remaining_chunks: expected_responses,
},
);
}
2019-07-09 05:44:23 +00:00
// If we received a goodbye, shutdown the connection.
if let InboundRequest::Goodbye(_) = req {
self.shutdown(None);
}
More metrics + RPC tweaks (#2041) ## Issue Addressed NA ## Proposed Changes This was mostly done to find the reason why LH was dropping peers from Nimbus. It proved to be useful so I think it's worth it. But there is also some functional stuff here - Add metrics for rpc errors per client, error type and direction - Add metrics for downscoring events per source type, client and penalty type - Add metrics for gossip validation results per client for non-accepted messages - Make the RPC handler return errors and requests/responses in the order we see them - Allow a small burst for the Ping rate limit, from 1 every 5 seconds to 2 every 10 seconds - Send rate limiting errors with a particular code and use that same code to identify them. I picked something different to 128 since that is most likely what other clients are using for their own errors - Remove some unused code in the `PeerAction` and the rpc handler - Remove the unused variant `RateLimited`. tTis was never produced directly, since the only way to get the request's protocol is via de handler. The handler upon receiving from LH a response with an error (rate limited in this case) emits this event with the missing info (It was always like this, just pointing out that we do downscore rate limiting errors regardless of the change) Metrics for Nimbus looked like this: Downscoring events: `increase(libp2p_peer_actions_per_client{client="Nimbus"}[5m])` ![image](https://user-images.githubusercontent.com/26765164/101210880-862bf280-3676-11eb-94c0-399f0bf5aa2e.png) RPC Errors: `increase(libp2p_rpc_errors_per_client{client="Nimbus"}[5m])` ![image](https://user-images.githubusercontent.com/26765164/101210997-ba071800-3676-11eb-847a-f32405ede002.png) Unaccepted gossip message: `increase(gossipsub_unaccepted_messages_per_client{client="Nimbus"}[5m])` ![image](https://user-images.githubusercontent.com/26765164/101211124-f470b500-3676-11eb-9459-132ecff058ec.png)
2020-12-08 03:55:50 +00:00
self.events_out.push(Ok(RPCReceived::Request(
self.current_inbound_substream_id,
req,
)));
self.current_inbound_substream_id.0 += 1;
}
fn inject_fully_negotiated_outbound(
&mut self,
Stable futures (#879) * Port eth1 lib to use stable futures * Port eth1_test_rig to stable futures * Port eth1 tests to stable futures * Port genesis service to stable futures * Port genesis tests to stable futures * Port beacon_chain to stable futures * Port lcli to stable futures * Fix eth1_test_rig (#1014) * Fix lcli * Port timer to stable futures * Fix timer * Port websocket_server to stable futures * Port notifier to stable futures * Add TODOS * Update hashmap hashset to stable futures * Adds panic test to hashset delay * Port remote_beacon_node to stable futures * Fix lcli merge conflicts * Non rpc stuff compiles * protocol.rs compiles * Port websockets, timer and notifier to stable futures (#1035) * Fix lcli * Port timer to stable futures * Fix timer * Port websocket_server to stable futures * Port notifier to stable futures * Add TODOS * Port remote_beacon_node to stable futures * Partial eth2-libp2p stable future upgrade * Finished first round of fighting RPC types * Further progress towards porting eth2-libp2p adds caching to discovery * Update behaviour * RPC handler to stable futures * Update RPC to master libp2p * Network service additions * Fix the fallback transport construction (#1102) * Correct warning * Remove hashmap delay * Compiling version of eth2-libp2p * Update all crates versions * Fix conversion function and add tests (#1113) * Port validator_client to stable futures (#1114) * Add PH & MS slot clock changes * Account for genesis time * Add progress on duties refactor * Add simple is_aggregator bool to val subscription * Start work on attestation_verification.rs * Add progress on ObservedAttestations * Progress with ObservedAttestations * Fix tests * Add observed attestations to the beacon chain * Add attestation observation to processing code * Add progress on attestation verification * Add first draft of ObservedAttesters * Add more tests * Add observed attesters to beacon chain * Add observers to attestation processing * Add more attestation verification * Create ObservedAggregators map * Remove commented-out code * Add observed aggregators into chain * Add progress * Finish adding features to attestation verification * Ensure beacon chain compiles * Link attn verification into chain * Integrate new attn verification in chain * Remove old attestation processing code * Start trying to fix beacon_chain tests * Split adding into pools into two functions * Add aggregation to harness * Get test harness working again * Adjust the number of aggregators for test harness * Fix edge-case in harness * Integrate new attn processing in network * Fix compile bug in validator_client * Update validator API endpoints * Fix aggreagation in test harness * Fix enum thing * Fix attestation observation bug: * Patch failing API tests * Start adding comments to attestation verification * Remove unused attestation field * Unify "is block known" logic * Update comments * Supress fork choice errors for network processing * Add todos * Tidy * Add gossip attn tests * Disallow test harness to produce old attns * Comment out in-progress tests * Partially address pruning tests * Fix failing store test * Add aggregate tests * Add comments about which spec conditions we check * Dont re-aggregate * Split apart test harness attn production * Fix compile error in network * Make progress on commented-out test * Fix skipping attestation test * Add fork choice verification tests * Tidy attn tests, remove dead code * Remove some accidentally added code * Fix clippy lint * Rename test file * Add block tests, add cheap block proposer check * Rename block testing file * Add observed_block_producers * Tidy * Switch around block signature verification * Finish block testing * Remove gossip from signature tests * First pass of self review * Fix deviation in spec * Update test spec tags * Start moving over to hashset * Finish moving observed attesters to hashmap * Move aggregation pool over to hashmap * Make fc attn borrow again * Fix rest_api compile error * Fix missing comments * Fix monster test * Uncomment increasing slots test * Address remaining comments * Remove unsafe, use cfg test * Remove cfg test flag * Fix dodgy comment * Revert "Update hashmap hashset to stable futures" This reverts commit d432378a3cc5cd67fc29c0b15b96b886c1323554. * Revert "Adds panic test to hashset delay" This reverts commit 281502396fc5b90d9c421a309c2c056982c9525b. * Ported attestation_service * Ported duties_service * Ported fork_service * More ports * Port block_service * Minor fixes * VC compiles * Update TODOS * Borrow self where possible * Ignore aggregates that are already known. * Unify aggregator modulo logic * Fix typo in logs * Refactor validator subscription logic * Avoid reproducing selection proof * Skip HTTP call if no subscriptions * Rename DutyAndState -> DutyAndProof * Tidy logs * Print root as dbg * Fix compile errors in tests * Fix compile error in test * Re-Fix attestation and duties service * Minor fixes Co-authored-by: Paul Hauner <paul@paulhauner.com> * Network crate update to stable futures * Port account_manager to stable futures (#1121) * Port account_manager to stable futures * Run async fns in tokio environment * Port rest_api crate to stable futures (#1118) * Port rest_api lib to stable futures * Reduce tokio features * Update notifier to stable futures * Builder update * Further updates * Convert self referential async functions * stable futures fixes (#1124) * Fix eth1 update functions * Fix genesis and client * Fix beacon node lib * Return appropriate runtimes from environment * Fix test rig * Refactor eth1 service update * Upgrade simulator to stable futures * Lighthouse compiles on stable futures * Remove println debugging statement * Update libp2p service, start rpc test upgrade * Update network crate for new libp2p * Update tokio::codec to futures_codec (#1128) * Further work towards RPC corrections * Correct http timeout and network service select * Use tokio runtime for libp2p * Revert "Update tokio::codec to futures_codec (#1128)" This reverts commit e57aea924acf5cbabdcea18895ac07e38a425ed7. * Upgrade RPC libp2p tests * Upgrade secio fallback test * Upgrade gossipsub examples * Clean up RPC protocol * Test fixes (#1133) * Correct websocket timeout and run on os thread * Fix network test * Clean up PR * Correct tokio tcp move attestation service tests * Upgrade attestation service tests * Correct network test * Correct genesis test * Test corrections * Log info when block is received * Modify logs and update attester service events * Stable futures: fixes to vc, eth1 and account manager (#1142) * Add local testnet scripts * Remove whiteblock script * Rename local testnet script * Move spawns onto handle * Fix VC panic * Initial fix to block production issue * Tidy block producer fix * Tidy further * Add local testnet clean script * Run cargo fmt * Tidy duties service * Tidy fork service * Tidy ForkService * Tidy AttestationService * Tidy notifier * Ensure await is not suppressed in eth1 * Ensure await is not suppressed in account_manager * Use .ok() instead of .unwrap_or(()) * RPC decoding test for proto * Update discv5 and eth2-libp2p deps * Fix lcli double runtime issue (#1144) * Handle stream termination and dialing peer errors * Correct peer_info variant types * Remove unnecessary warnings * Handle subnet unsubscription removal and improve logigng * Add logs around ping * Upgrade discv5 and improve logging * Handle peer connection status for multiple connections * Improve network service logging * Improve logging around peer manager * Upgrade swarm poll centralise peer management * Identify clients on error * Fix `remove_peer` in sync (#1150) * remove_peer removes from all chains * Remove logs * Fix early return from loop * Improved logging, fix panic * Partially correct tests * Stable futures: Vc sync (#1149) * Improve syncing heuristic * Add comments * Use safer method for tolerance * Fix tests * Stable futures: Fix VC bug, update agg pool, add more metrics (#1151) * Expose epoch processing summary * Expose participation metrics to prometheus * Switch to f64 * Reduce precision * Change precision * Expose observed attesters metrics * Add metrics for agg/unagg attn counts * Add metrics for gossip rx * Add metrics for gossip tx * Adds ignored attns to prom * Add attestation timing * Add timer for aggregation pool sig agg * Add write lock timer for agg pool * Add more metrics to agg pool * Change map lock code * Add extra metric to agg pool * Change lock handling in agg pool * Change .write() to .read() * Add another agg pool timer * Fix for is_aggregator * Fix pruning bug Co-authored-by: pawan <pawandhananjay@gmail.com> Co-authored-by: Paul Hauner <paul@paulhauner.com>
2020-05-17 11:16:48 +00:00
out: <Self::OutboundProtocol as OutboundUpgrade<NegotiatedSubstream>>::Output,
request_info: Self::OutboundOpenInfo,
) {
self.dial_negotiated -= 1;
let (id, request) = request_info;
let proto = request.protocol();
// accept outbound connections only if the handler is not deactivated
if matches!(self.state, HandlerState::Deactivated) {
More metrics + RPC tweaks (#2041) ## Issue Addressed NA ## Proposed Changes This was mostly done to find the reason why LH was dropping peers from Nimbus. It proved to be useful so I think it's worth it. But there is also some functional stuff here - Add metrics for rpc errors per client, error type and direction - Add metrics for downscoring events per source type, client and penalty type - Add metrics for gossip validation results per client for non-accepted messages - Make the RPC handler return errors and requests/responses in the order we see them - Allow a small burst for the Ping rate limit, from 1 every 5 seconds to 2 every 10 seconds - Send rate limiting errors with a particular code and use that same code to identify them. I picked something different to 128 since that is most likely what other clients are using for their own errors - Remove some unused code in the `PeerAction` and the rpc handler - Remove the unused variant `RateLimited`. tTis was never produced directly, since the only way to get the request's protocol is via de handler. The handler upon receiving from LH a response with an error (rate limited in this case) emits this event with the missing info (It was always like this, just pointing out that we do downscore rate limiting errors regardless of the change) Metrics for Nimbus looked like this: Downscoring events: `increase(libp2p_peer_actions_per_client{client="Nimbus"}[5m])` ![image](https://user-images.githubusercontent.com/26765164/101210880-862bf280-3676-11eb-94c0-399f0bf5aa2e.png) RPC Errors: `increase(libp2p_rpc_errors_per_client{client="Nimbus"}[5m])` ![image](https://user-images.githubusercontent.com/26765164/101210997-ba071800-3676-11eb-847a-f32405ede002.png) Unaccepted gossip message: `increase(gossipsub_unaccepted_messages_per_client{client="Nimbus"}[5m])` ![image](https://user-images.githubusercontent.com/26765164/101211124-f470b500-3676-11eb-9459-132ecff058ec.png)
2020-12-08 03:55:50 +00:00
self.events_out.push(Err(HandlerErr::Outbound {
error: RPCError::HandlerRejected,
More metrics + RPC tweaks (#2041) ## Issue Addressed NA ## Proposed Changes This was mostly done to find the reason why LH was dropping peers from Nimbus. It proved to be useful so I think it's worth it. But there is also some functional stuff here - Add metrics for rpc errors per client, error type and direction - Add metrics for downscoring events per source type, client and penalty type - Add metrics for gossip validation results per client for non-accepted messages - Make the RPC handler return errors and requests/responses in the order we see them - Allow a small burst for the Ping rate limit, from 1 every 5 seconds to 2 every 10 seconds - Send rate limiting errors with a particular code and use that same code to identify them. I picked something different to 128 since that is most likely what other clients are using for their own errors - Remove some unused code in the `PeerAction` and the rpc handler - Remove the unused variant `RateLimited`. tTis was never produced directly, since the only way to get the request's protocol is via de handler. The handler upon receiving from LH a response with an error (rate limited in this case) emits this event with the missing info (It was always like this, just pointing out that we do downscore rate limiting errors regardless of the change) Metrics for Nimbus looked like this: Downscoring events: `increase(libp2p_peer_actions_per_client{client="Nimbus"}[5m])` ![image](https://user-images.githubusercontent.com/26765164/101210880-862bf280-3676-11eb-94c0-399f0bf5aa2e.png) RPC Errors: `increase(libp2p_rpc_errors_per_client{client="Nimbus"}[5m])` ![image](https://user-images.githubusercontent.com/26765164/101210997-ba071800-3676-11eb-847a-f32405ede002.png) Unaccepted gossip message: `increase(gossipsub_unaccepted_messages_per_client{client="Nimbus"}[5m])` ![image](https://user-images.githubusercontent.com/26765164/101211124-f470b500-3676-11eb-9459-132ecff058ec.png)
2020-12-08 03:55:50 +00:00
proto,
id,
}));
}
// add the stream to substreams if we expect a response, otherwise drop the stream.
let expected_responses = request.expected_responses();
if expected_responses > 0 {
// new outbound request. Store the stream and tag the output.
let delay_key = self.outbound_substreams_delay.insert(
self.current_outbound_substream_id,
Duration::from_secs(RESPONSE_TIMEOUT),
);
let awaiting_stream = OutboundSubstreamState::RequestPendingResponse {
substream: Box::new(out),
request,
};
let expected_responses = if expected_responses > 1 {
// Currently enforced only for multiple responses
Some(expected_responses)
} else {
None
};
if self
.outbound_substreams
.insert(
self.current_outbound_substream_id,
OutboundInfo {
state: awaiting_stream,
delay_key,
proto,
remaining_chunks: expected_responses,
req_id: id,
},
)
.is_some()
{
crit!(self.log, "Duplicate outbound substream id"; "id" => self.current_outbound_substream_id);
}
self.current_outbound_substream_id.0 += 1;
}
}
2019-07-06 13:43:44 +00:00
fn inject_event(&mut self, rpc_event: Self::InEvent) {
match rpc_event {
RPCSend::Request(id, req) => self.send_request(id, req),
RPCSend::Response(inbound_id, response) => self.send_response(inbound_id, response),
RPCSend::Shutdown(reason) => self.shutdown(Some(reason)),
2019-07-06 13:43:44 +00:00
}
// In any case, we need the handler to process the event.
if let Some(waker) = &self.waker {
waker.wake_by_ref();
}
}
fn inject_dial_upgrade_error(
&mut self,
request_info: Self::OutboundOpenInfo,
error: ProtocolsHandlerUpgrErr<
Stable futures (#879) * Port eth1 lib to use stable futures * Port eth1_test_rig to stable futures * Port eth1 tests to stable futures * Port genesis service to stable futures * Port genesis tests to stable futures * Port beacon_chain to stable futures * Port lcli to stable futures * Fix eth1_test_rig (#1014) * Fix lcli * Port timer to stable futures * Fix timer * Port websocket_server to stable futures * Port notifier to stable futures * Add TODOS * Update hashmap hashset to stable futures * Adds panic test to hashset delay * Port remote_beacon_node to stable futures * Fix lcli merge conflicts * Non rpc stuff compiles * protocol.rs compiles * Port websockets, timer and notifier to stable futures (#1035) * Fix lcli * Port timer to stable futures * Fix timer * Port websocket_server to stable futures * Port notifier to stable futures * Add TODOS * Port remote_beacon_node to stable futures * Partial eth2-libp2p stable future upgrade * Finished first round of fighting RPC types * Further progress towards porting eth2-libp2p adds caching to discovery * Update behaviour * RPC handler to stable futures * Update RPC to master libp2p * Network service additions * Fix the fallback transport construction (#1102) * Correct warning * Remove hashmap delay * Compiling version of eth2-libp2p * Update all crates versions * Fix conversion function and add tests (#1113) * Port validator_client to stable futures (#1114) * Add PH & MS slot clock changes * Account for genesis time * Add progress on duties refactor * Add simple is_aggregator bool to val subscription * Start work on attestation_verification.rs * Add progress on ObservedAttestations * Progress with ObservedAttestations * Fix tests * Add observed attestations to the beacon chain * Add attestation observation to processing code * Add progress on attestation verification * Add first draft of ObservedAttesters * Add more tests * Add observed attesters to beacon chain * Add observers to attestation processing * Add more attestation verification * Create ObservedAggregators map * Remove commented-out code * Add observed aggregators into chain * Add progress * Finish adding features to attestation verification * Ensure beacon chain compiles * Link attn verification into chain * Integrate new attn verification in chain * Remove old attestation processing code * Start trying to fix beacon_chain tests * Split adding into pools into two functions * Add aggregation to harness * Get test harness working again * Adjust the number of aggregators for test harness * Fix edge-case in harness * Integrate new attn processing in network * Fix compile bug in validator_client * Update validator API endpoints * Fix aggreagation in test harness * Fix enum thing * Fix attestation observation bug: * Patch failing API tests * Start adding comments to attestation verification * Remove unused attestation field * Unify "is block known" logic * Update comments * Supress fork choice errors for network processing * Add todos * Tidy * Add gossip attn tests * Disallow test harness to produce old attns * Comment out in-progress tests * Partially address pruning tests * Fix failing store test * Add aggregate tests * Add comments about which spec conditions we check * Dont re-aggregate * Split apart test harness attn production * Fix compile error in network * Make progress on commented-out test * Fix skipping attestation test * Add fork choice verification tests * Tidy attn tests, remove dead code * Remove some accidentally added code * Fix clippy lint * Rename test file * Add block tests, add cheap block proposer check * Rename block testing file * Add observed_block_producers * Tidy * Switch around block signature verification * Finish block testing * Remove gossip from signature tests * First pass of self review * Fix deviation in spec * Update test spec tags * Start moving over to hashset * Finish moving observed attesters to hashmap * Move aggregation pool over to hashmap * Make fc attn borrow again * Fix rest_api compile error * Fix missing comments * Fix monster test * Uncomment increasing slots test * Address remaining comments * Remove unsafe, use cfg test * Remove cfg test flag * Fix dodgy comment * Revert "Update hashmap hashset to stable futures" This reverts commit d432378a3cc5cd67fc29c0b15b96b886c1323554. * Revert "Adds panic test to hashset delay" This reverts commit 281502396fc5b90d9c421a309c2c056982c9525b. * Ported attestation_service * Ported duties_service * Ported fork_service * More ports * Port block_service * Minor fixes * VC compiles * Update TODOS * Borrow self where possible * Ignore aggregates that are already known. * Unify aggregator modulo logic * Fix typo in logs * Refactor validator subscription logic * Avoid reproducing selection proof * Skip HTTP call if no subscriptions * Rename DutyAndState -> DutyAndProof * Tidy logs * Print root as dbg * Fix compile errors in tests * Fix compile error in test * Re-Fix attestation and duties service * Minor fixes Co-authored-by: Paul Hauner <paul@paulhauner.com> * Network crate update to stable futures * Port account_manager to stable futures (#1121) * Port account_manager to stable futures * Run async fns in tokio environment * Port rest_api crate to stable futures (#1118) * Port rest_api lib to stable futures * Reduce tokio features * Update notifier to stable futures * Builder update * Further updates * Convert self referential async functions * stable futures fixes (#1124) * Fix eth1 update functions * Fix genesis and client * Fix beacon node lib * Return appropriate runtimes from environment * Fix test rig * Refactor eth1 service update * Upgrade simulator to stable futures * Lighthouse compiles on stable futures * Remove println debugging statement * Update libp2p service, start rpc test upgrade * Update network crate for new libp2p * Update tokio::codec to futures_codec (#1128) * Further work towards RPC corrections * Correct http timeout and network service select * Use tokio runtime for libp2p * Revert "Update tokio::codec to futures_codec (#1128)" This reverts commit e57aea924acf5cbabdcea18895ac07e38a425ed7. * Upgrade RPC libp2p tests * Upgrade secio fallback test * Upgrade gossipsub examples * Clean up RPC protocol * Test fixes (#1133) * Correct websocket timeout and run on os thread * Fix network test * Clean up PR * Correct tokio tcp move attestation service tests * Upgrade attestation service tests * Correct network test * Correct genesis test * Test corrections * Log info when block is received * Modify logs and update attester service events * Stable futures: fixes to vc, eth1 and account manager (#1142) * Add local testnet scripts * Remove whiteblock script * Rename local testnet script * Move spawns onto handle * Fix VC panic * Initial fix to block production issue * Tidy block producer fix * Tidy further * Add local testnet clean script * Run cargo fmt * Tidy duties service * Tidy fork service * Tidy ForkService * Tidy AttestationService * Tidy notifier * Ensure await is not suppressed in eth1 * Ensure await is not suppressed in account_manager * Use .ok() instead of .unwrap_or(()) * RPC decoding test for proto * Update discv5 and eth2-libp2p deps * Fix lcli double runtime issue (#1144) * Handle stream termination and dialing peer errors * Correct peer_info variant types * Remove unnecessary warnings * Handle subnet unsubscription removal and improve logigng * Add logs around ping * Upgrade discv5 and improve logging * Handle peer connection status for multiple connections * Improve network service logging * Improve logging around peer manager * Upgrade swarm poll centralise peer management * Identify clients on error * Fix `remove_peer` in sync (#1150) * remove_peer removes from all chains * Remove logs * Fix early return from loop * Improved logging, fix panic * Partially correct tests * Stable futures: Vc sync (#1149) * Improve syncing heuristic * Add comments * Use safer method for tolerance * Fix tests * Stable futures: Fix VC bug, update agg pool, add more metrics (#1151) * Expose epoch processing summary * Expose participation metrics to prometheus * Switch to f64 * Reduce precision * Change precision * Expose observed attesters metrics * Add metrics for agg/unagg attn counts * Add metrics for gossip rx * Add metrics for gossip tx * Adds ignored attns to prom * Add attestation timing * Add timer for aggregation pool sig agg * Add write lock timer for agg pool * Add more metrics to agg pool * Change map lock code * Add extra metric to agg pool * Change lock handling in agg pool * Change .write() to .read() * Add another agg pool timer * Fix for is_aggregator * Fix pruning bug Co-authored-by: pawan <pawandhananjay@gmail.com> Co-authored-by: Paul Hauner <paul@paulhauner.com>
2020-05-17 11:16:48 +00:00
<Self::OutboundProtocol as OutboundUpgrade<NegotiatedSubstream>>::Error,
>,
) {
let (id, req) = request_info;
if let ProtocolsHandlerUpgrErr::Upgrade(UpgradeError::Apply(RPCError::IoError(_))) = error {
self.outbound_io_error_retries += 1;
if self.outbound_io_error_retries < IO_ERROR_RETRIES {
self.send_request(id, req);
return;
}
}
// This dialing is now considered failed
self.dial_negotiated -= 1;
self.outbound_io_error_retries = 0;
// map the error
let error = match error {
ProtocolsHandlerUpgrErr::Timer => RPCError::InternalError("Timer failed"),
ProtocolsHandlerUpgrErr::Timeout => RPCError::NegotiationTimeout,
ProtocolsHandlerUpgrErr::Upgrade(UpgradeError::Apply(e)) => e,
ProtocolsHandlerUpgrErr::Upgrade(UpgradeError::Select(NegotiationError::Failed)) => {
RPCError::UnsupportedProtocol
}
ProtocolsHandlerUpgrErr::Upgrade(UpgradeError::Select(
NegotiationError::ProtocolError(e),
)) => match e {
Stable futures (#879) * Port eth1 lib to use stable futures * Port eth1_test_rig to stable futures * Port eth1 tests to stable futures * Port genesis service to stable futures * Port genesis tests to stable futures * Port beacon_chain to stable futures * Port lcli to stable futures * Fix eth1_test_rig (#1014) * Fix lcli * Port timer to stable futures * Fix timer * Port websocket_server to stable futures * Port notifier to stable futures * Add TODOS * Update hashmap hashset to stable futures * Adds panic test to hashset delay * Port remote_beacon_node to stable futures * Fix lcli merge conflicts * Non rpc stuff compiles * protocol.rs compiles * Port websockets, timer and notifier to stable futures (#1035) * Fix lcli * Port timer to stable futures * Fix timer * Port websocket_server to stable futures * Port notifier to stable futures * Add TODOS * Port remote_beacon_node to stable futures * Partial eth2-libp2p stable future upgrade * Finished first round of fighting RPC types * Further progress towards porting eth2-libp2p adds caching to discovery * Update behaviour * RPC handler to stable futures * Update RPC to master libp2p * Network service additions * Fix the fallback transport construction (#1102) * Correct warning * Remove hashmap delay * Compiling version of eth2-libp2p * Update all crates versions * Fix conversion function and add tests (#1113) * Port validator_client to stable futures (#1114) * Add PH & MS slot clock changes * Account for genesis time * Add progress on duties refactor * Add simple is_aggregator bool to val subscription * Start work on attestation_verification.rs * Add progress on ObservedAttestations * Progress with ObservedAttestations * Fix tests * Add observed attestations to the beacon chain * Add attestation observation to processing code * Add progress on attestation verification * Add first draft of ObservedAttesters * Add more tests * Add observed attesters to beacon chain * Add observers to attestation processing * Add more attestation verification * Create ObservedAggregators map * Remove commented-out code * Add observed aggregators into chain * Add progress * Finish adding features to attestation verification * Ensure beacon chain compiles * Link attn verification into chain * Integrate new attn verification in chain * Remove old attestation processing code * Start trying to fix beacon_chain tests * Split adding into pools into two functions * Add aggregation to harness * Get test harness working again * Adjust the number of aggregators for test harness * Fix edge-case in harness * Integrate new attn processing in network * Fix compile bug in validator_client * Update validator API endpoints * Fix aggreagation in test harness * Fix enum thing * Fix attestation observation bug: * Patch failing API tests * Start adding comments to attestation verification * Remove unused attestation field * Unify "is block known" logic * Update comments * Supress fork choice errors for network processing * Add todos * Tidy * Add gossip attn tests * Disallow test harness to produce old attns * Comment out in-progress tests * Partially address pruning tests * Fix failing store test * Add aggregate tests * Add comments about which spec conditions we check * Dont re-aggregate * Split apart test harness attn production * Fix compile error in network * Make progress on commented-out test * Fix skipping attestation test * Add fork choice verification tests * Tidy attn tests, remove dead code * Remove some accidentally added code * Fix clippy lint * Rename test file * Add block tests, add cheap block proposer check * Rename block testing file * Add observed_block_producers * Tidy * Switch around block signature verification * Finish block testing * Remove gossip from signature tests * First pass of self review * Fix deviation in spec * Update test spec tags * Start moving over to hashset * Finish moving observed attesters to hashmap * Move aggregation pool over to hashmap * Make fc attn borrow again * Fix rest_api compile error * Fix missing comments * Fix monster test * Uncomment increasing slots test * Address remaining comments * Remove unsafe, use cfg test * Remove cfg test flag * Fix dodgy comment * Revert "Update hashmap hashset to stable futures" This reverts commit d432378a3cc5cd67fc29c0b15b96b886c1323554. * Revert "Adds panic test to hashset delay" This reverts commit 281502396fc5b90d9c421a309c2c056982c9525b. * Ported attestation_service * Ported duties_service * Ported fork_service * More ports * Port block_service * Minor fixes * VC compiles * Update TODOS * Borrow self where possible * Ignore aggregates that are already known. * Unify aggregator modulo logic * Fix typo in logs * Refactor validator subscription logic * Avoid reproducing selection proof * Skip HTTP call if no subscriptions * Rename DutyAndState -> DutyAndProof * Tidy logs * Print root as dbg * Fix compile errors in tests * Fix compile error in test * Re-Fix attestation and duties service * Minor fixes Co-authored-by: Paul Hauner <paul@paulhauner.com> * Network crate update to stable futures * Port account_manager to stable futures (#1121) * Port account_manager to stable futures * Run async fns in tokio environment * Port rest_api crate to stable futures (#1118) * Port rest_api lib to stable futures * Reduce tokio features * Update notifier to stable futures * Builder update * Further updates * Convert self referential async functions * stable futures fixes (#1124) * Fix eth1 update functions * Fix genesis and client * Fix beacon node lib * Return appropriate runtimes from environment * Fix test rig * Refactor eth1 service update * Upgrade simulator to stable futures * Lighthouse compiles on stable futures * Remove println debugging statement * Update libp2p service, start rpc test upgrade * Update network crate for new libp2p * Update tokio::codec to futures_codec (#1128) * Further work towards RPC corrections * Correct http timeout and network service select * Use tokio runtime for libp2p * Revert "Update tokio::codec to futures_codec (#1128)" This reverts commit e57aea924acf5cbabdcea18895ac07e38a425ed7. * Upgrade RPC libp2p tests * Upgrade secio fallback test * Upgrade gossipsub examples * Clean up RPC protocol * Test fixes (#1133) * Correct websocket timeout and run on os thread * Fix network test * Clean up PR * Correct tokio tcp move attestation service tests * Upgrade attestation service tests * Correct network test * Correct genesis test * Test corrections * Log info when block is received * Modify logs and update attester service events * Stable futures: fixes to vc, eth1 and account manager (#1142) * Add local testnet scripts * Remove whiteblock script * Rename local testnet script * Move spawns onto handle * Fix VC panic * Initial fix to block production issue * Tidy block producer fix * Tidy further * Add local testnet clean script * Run cargo fmt * Tidy duties service * Tidy fork service * Tidy ForkService * Tidy AttestationService * Tidy notifier * Ensure await is not suppressed in eth1 * Ensure await is not suppressed in account_manager * Use .ok() instead of .unwrap_or(()) * RPC decoding test for proto * Update discv5 and eth2-libp2p deps * Fix lcli double runtime issue (#1144) * Handle stream termination and dialing peer errors * Correct peer_info variant types * Remove unnecessary warnings * Handle subnet unsubscription removal and improve logigng * Add logs around ping * Upgrade discv5 and improve logging * Handle peer connection status for multiple connections * Improve network service logging * Improve logging around peer manager * Upgrade swarm poll centralise peer management * Identify clients on error * Fix `remove_peer` in sync (#1150) * remove_peer removes from all chains * Remove logs * Fix early return from loop * Improved logging, fix panic * Partially correct tests * Stable futures: Vc sync (#1149) * Improve syncing heuristic * Add comments * Use safer method for tolerance * Fix tests * Stable futures: Fix VC bug, update agg pool, add more metrics (#1151) * Expose epoch processing summary * Expose participation metrics to prometheus * Switch to f64 * Reduce precision * Change precision * Expose observed attesters metrics * Add metrics for agg/unagg attn counts * Add metrics for gossip rx * Add metrics for gossip tx * Adds ignored attns to prom * Add attestation timing * Add timer for aggregation pool sig agg * Add write lock timer for agg pool * Add more metrics to agg pool * Change map lock code * Add extra metric to agg pool * Change lock handling in agg pool * Change .write() to .read() * Add another agg pool timer * Fix for is_aggregator * Fix pruning bug Co-authored-by: pawan <pawandhananjay@gmail.com> Co-authored-by: Paul Hauner <paul@paulhauner.com>
2020-05-17 11:16:48 +00:00
ProtocolError::IoError(io_err) => RPCError::IoError(io_err.to_string()),
ProtocolError::InvalidProtocol => {
RPCError::InternalError("Protocol was deemed invalid")
}
ProtocolError::InvalidMessage | ProtocolError::TooManyProtocols => {
// Peer is sending invalid data during the negotiation phase, not
// participating in the protocol
RPCError::InvalidData
}
},
};
More metrics + RPC tweaks (#2041) ## Issue Addressed NA ## Proposed Changes This was mostly done to find the reason why LH was dropping peers from Nimbus. It proved to be useful so I think it's worth it. But there is also some functional stuff here - Add metrics for rpc errors per client, error type and direction - Add metrics for downscoring events per source type, client and penalty type - Add metrics for gossip validation results per client for non-accepted messages - Make the RPC handler return errors and requests/responses in the order we see them - Allow a small burst for the Ping rate limit, from 1 every 5 seconds to 2 every 10 seconds - Send rate limiting errors with a particular code and use that same code to identify them. I picked something different to 128 since that is most likely what other clients are using for their own errors - Remove some unused code in the `PeerAction` and the rpc handler - Remove the unused variant `RateLimited`. tTis was never produced directly, since the only way to get the request's protocol is via de handler. The handler upon receiving from LH a response with an error (rate limited in this case) emits this event with the missing info (It was always like this, just pointing out that we do downscore rate limiting errors regardless of the change) Metrics for Nimbus looked like this: Downscoring events: `increase(libp2p_peer_actions_per_client{client="Nimbus"}[5m])` ![image](https://user-images.githubusercontent.com/26765164/101210880-862bf280-3676-11eb-94c0-399f0bf5aa2e.png) RPC Errors: `increase(libp2p_rpc_errors_per_client{client="Nimbus"}[5m])` ![image](https://user-images.githubusercontent.com/26765164/101210997-ba071800-3676-11eb-847a-f32405ede002.png) Unaccepted gossip message: `increase(gossipsub_unaccepted_messages_per_client{client="Nimbus"}[5m])` ![image](https://user-images.githubusercontent.com/26765164/101211124-f470b500-3676-11eb-9459-132ecff058ec.png)
2020-12-08 03:55:50 +00:00
self.events_out.push(Err(HandlerErr::Outbound {
error,
More metrics + RPC tweaks (#2041) ## Issue Addressed NA ## Proposed Changes This was mostly done to find the reason why LH was dropping peers from Nimbus. It proved to be useful so I think it's worth it. But there is also some functional stuff here - Add metrics for rpc errors per client, error type and direction - Add metrics for downscoring events per source type, client and penalty type - Add metrics for gossip validation results per client for non-accepted messages - Make the RPC handler return errors and requests/responses in the order we see them - Allow a small burst for the Ping rate limit, from 1 every 5 seconds to 2 every 10 seconds - Send rate limiting errors with a particular code and use that same code to identify them. I picked something different to 128 since that is most likely what other clients are using for their own errors - Remove some unused code in the `PeerAction` and the rpc handler - Remove the unused variant `RateLimited`. tTis was never produced directly, since the only way to get the request's protocol is via de handler. The handler upon receiving from LH a response with an error (rate limited in this case) emits this event with the missing info (It was always like this, just pointing out that we do downscore rate limiting errors regardless of the change) Metrics for Nimbus looked like this: Downscoring events: `increase(libp2p_peer_actions_per_client{client="Nimbus"}[5m])` ![image](https://user-images.githubusercontent.com/26765164/101210880-862bf280-3676-11eb-94c0-399f0bf5aa2e.png) RPC Errors: `increase(libp2p_rpc_errors_per_client{client="Nimbus"}[5m])` ![image](https://user-images.githubusercontent.com/26765164/101210997-ba071800-3676-11eb-847a-f32405ede002.png) Unaccepted gossip message: `increase(gossipsub_unaccepted_messages_per_client{client="Nimbus"}[5m])` ![image](https://user-images.githubusercontent.com/26765164/101211124-f470b500-3676-11eb-9459-132ecff058ec.png)
2020-12-08 03:55:50 +00:00
proto: req.protocol(),
id,
}));
}
fn connection_keep_alive(&self) -> KeepAlive {
// Check that we don't have outbound items pending for dialing, nor dialing, nor
// established. Also check that there are no established inbound substreams.
// Errors and events need to be reported back, so check those too.
let should_shutdown = match self.state {
HandlerState::ShuttingDown(_) => {
self.dial_queue.is_empty()
&& self.outbound_substreams.is_empty()
&& self.inbound_substreams.is_empty()
&& self.events_out.is_empty()
&& self.dial_negotiated == 0
}
HandlerState::Deactivated => {
// Regardless of events, the timeout has expired. Force the disconnect.
true
}
_ => false,
};
if should_shutdown {
KeepAlive::No
} else {
KeepAlive::Yes
}
}
fn poll(
&mut self,
Stable futures (#879) * Port eth1 lib to use stable futures * Port eth1_test_rig to stable futures * Port eth1 tests to stable futures * Port genesis service to stable futures * Port genesis tests to stable futures * Port beacon_chain to stable futures * Port lcli to stable futures * Fix eth1_test_rig (#1014) * Fix lcli * Port timer to stable futures * Fix timer * Port websocket_server to stable futures * Port notifier to stable futures * Add TODOS * Update hashmap hashset to stable futures * Adds panic test to hashset delay * Port remote_beacon_node to stable futures * Fix lcli merge conflicts * Non rpc stuff compiles * protocol.rs compiles * Port websockets, timer and notifier to stable futures (#1035) * Fix lcli * Port timer to stable futures * Fix timer * Port websocket_server to stable futures * Port notifier to stable futures * Add TODOS * Port remote_beacon_node to stable futures * Partial eth2-libp2p stable future upgrade * Finished first round of fighting RPC types * Further progress towards porting eth2-libp2p adds caching to discovery * Update behaviour * RPC handler to stable futures * Update RPC to master libp2p * Network service additions * Fix the fallback transport construction (#1102) * Correct warning * Remove hashmap delay * Compiling version of eth2-libp2p * Update all crates versions * Fix conversion function and add tests (#1113) * Port validator_client to stable futures (#1114) * Add PH & MS slot clock changes * Account for genesis time * Add progress on duties refactor * Add simple is_aggregator bool to val subscription * Start work on attestation_verification.rs * Add progress on ObservedAttestations * Progress with ObservedAttestations * Fix tests * Add observed attestations to the beacon chain * Add attestation observation to processing code * Add progress on attestation verification * Add first draft of ObservedAttesters * Add more tests * Add observed attesters to beacon chain * Add observers to attestation processing * Add more attestation verification * Create ObservedAggregators map * Remove commented-out code * Add observed aggregators into chain * Add progress * Finish adding features to attestation verification * Ensure beacon chain compiles * Link attn verification into chain * Integrate new attn verification in chain * Remove old attestation processing code * Start trying to fix beacon_chain tests * Split adding into pools into two functions * Add aggregation to harness * Get test harness working again * Adjust the number of aggregators for test harness * Fix edge-case in harness * Integrate new attn processing in network * Fix compile bug in validator_client * Update validator API endpoints * Fix aggreagation in test harness * Fix enum thing * Fix attestation observation bug: * Patch failing API tests * Start adding comments to attestation verification * Remove unused attestation field * Unify "is block known" logic * Update comments * Supress fork choice errors for network processing * Add todos * Tidy * Add gossip attn tests * Disallow test harness to produce old attns * Comment out in-progress tests * Partially address pruning tests * Fix failing store test * Add aggregate tests * Add comments about which spec conditions we check * Dont re-aggregate * Split apart test harness attn production * Fix compile error in network * Make progress on commented-out test * Fix skipping attestation test * Add fork choice verification tests * Tidy attn tests, remove dead code * Remove some accidentally added code * Fix clippy lint * Rename test file * Add block tests, add cheap block proposer check * Rename block testing file * Add observed_block_producers * Tidy * Switch around block signature verification * Finish block testing * Remove gossip from signature tests * First pass of self review * Fix deviation in spec * Update test spec tags * Start moving over to hashset * Finish moving observed attesters to hashmap * Move aggregation pool over to hashmap * Make fc attn borrow again * Fix rest_api compile error * Fix missing comments * Fix monster test * Uncomment increasing slots test * Address remaining comments * Remove unsafe, use cfg test * Remove cfg test flag * Fix dodgy comment * Revert "Update hashmap hashset to stable futures" This reverts commit d432378a3cc5cd67fc29c0b15b96b886c1323554. * Revert "Adds panic test to hashset delay" This reverts commit 281502396fc5b90d9c421a309c2c056982c9525b. * Ported attestation_service * Ported duties_service * Ported fork_service * More ports * Port block_service * Minor fixes * VC compiles * Update TODOS * Borrow self where possible * Ignore aggregates that are already known. * Unify aggregator modulo logic * Fix typo in logs * Refactor validator subscription logic * Avoid reproducing selection proof * Skip HTTP call if no subscriptions * Rename DutyAndState -> DutyAndProof * Tidy logs * Print root as dbg * Fix compile errors in tests * Fix compile error in test * Re-Fix attestation and duties service * Minor fixes Co-authored-by: Paul Hauner <paul@paulhauner.com> * Network crate update to stable futures * Port account_manager to stable futures (#1121) * Port account_manager to stable futures * Run async fns in tokio environment * Port rest_api crate to stable futures (#1118) * Port rest_api lib to stable futures * Reduce tokio features * Update notifier to stable futures * Builder update * Further updates * Convert self referential async functions * stable futures fixes (#1124) * Fix eth1 update functions * Fix genesis and client * Fix beacon node lib * Return appropriate runtimes from environment * Fix test rig * Refactor eth1 service update * Upgrade simulator to stable futures * Lighthouse compiles on stable futures * Remove println debugging statement * Update libp2p service, start rpc test upgrade * Update network crate for new libp2p * Update tokio::codec to futures_codec (#1128) * Further work towards RPC corrections * Correct http timeout and network service select * Use tokio runtime for libp2p * Revert "Update tokio::codec to futures_codec (#1128)" This reverts commit e57aea924acf5cbabdcea18895ac07e38a425ed7. * Upgrade RPC libp2p tests * Upgrade secio fallback test * Upgrade gossipsub examples * Clean up RPC protocol * Test fixes (#1133) * Correct websocket timeout and run on os thread * Fix network test * Clean up PR * Correct tokio tcp move attestation service tests * Upgrade attestation service tests * Correct network test * Correct genesis test * Test corrections * Log info when block is received * Modify logs and update attester service events * Stable futures: fixes to vc, eth1 and account manager (#1142) * Add local testnet scripts * Remove whiteblock script * Rename local testnet script * Move spawns onto handle * Fix VC panic * Initial fix to block production issue * Tidy block producer fix * Tidy further * Add local testnet clean script * Run cargo fmt * Tidy duties service * Tidy fork service * Tidy ForkService * Tidy AttestationService * Tidy notifier * Ensure await is not suppressed in eth1 * Ensure await is not suppressed in account_manager * Use .ok() instead of .unwrap_or(()) * RPC decoding test for proto * Update discv5 and eth2-libp2p deps * Fix lcli double runtime issue (#1144) * Handle stream termination and dialing peer errors * Correct peer_info variant types * Remove unnecessary warnings * Handle subnet unsubscription removal and improve logigng * Add logs around ping * Upgrade discv5 and improve logging * Handle peer connection status for multiple connections * Improve network service logging * Improve logging around peer manager * Upgrade swarm poll centralise peer management * Identify clients on error * Fix `remove_peer` in sync (#1150) * remove_peer removes from all chains * Remove logs * Fix early return from loop * Improved logging, fix panic * Partially correct tests * Stable futures: Vc sync (#1149) * Improve syncing heuristic * Add comments * Use safer method for tolerance * Fix tests * Stable futures: Fix VC bug, update agg pool, add more metrics (#1151) * Expose epoch processing summary * Expose participation metrics to prometheus * Switch to f64 * Reduce precision * Change precision * Expose observed attesters metrics * Add metrics for agg/unagg attn counts * Add metrics for gossip rx * Add metrics for gossip tx * Adds ignored attns to prom * Add attestation timing * Add timer for aggregation pool sig agg * Add write lock timer for agg pool * Add more metrics to agg pool * Change map lock code * Add extra metric to agg pool * Change lock handling in agg pool * Change .write() to .read() * Add another agg pool timer * Fix for is_aggregator * Fix pruning bug Co-authored-by: pawan <pawandhananjay@gmail.com> Co-authored-by: Paul Hauner <paul@paulhauner.com>
2020-05-17 11:16:48 +00:00
cx: &mut Context<'_>,
) -> Poll<
Stable futures (#879) * Port eth1 lib to use stable futures * Port eth1_test_rig to stable futures * Port eth1 tests to stable futures * Port genesis service to stable futures * Port genesis tests to stable futures * Port beacon_chain to stable futures * Port lcli to stable futures * Fix eth1_test_rig (#1014) * Fix lcli * Port timer to stable futures * Fix timer * Port websocket_server to stable futures * Port notifier to stable futures * Add TODOS * Update hashmap hashset to stable futures * Adds panic test to hashset delay * Port remote_beacon_node to stable futures * Fix lcli merge conflicts * Non rpc stuff compiles * protocol.rs compiles * Port websockets, timer and notifier to stable futures (#1035) * Fix lcli * Port timer to stable futures * Fix timer * Port websocket_server to stable futures * Port notifier to stable futures * Add TODOS * Port remote_beacon_node to stable futures * Partial eth2-libp2p stable future upgrade * Finished first round of fighting RPC types * Further progress towards porting eth2-libp2p adds caching to discovery * Update behaviour * RPC handler to stable futures * Update RPC to master libp2p * Network service additions * Fix the fallback transport construction (#1102) * Correct warning * Remove hashmap delay * Compiling version of eth2-libp2p * Update all crates versions * Fix conversion function and add tests (#1113) * Port validator_client to stable futures (#1114) * Add PH & MS slot clock changes * Account for genesis time * Add progress on duties refactor * Add simple is_aggregator bool to val subscription * Start work on attestation_verification.rs * Add progress on ObservedAttestations * Progress with ObservedAttestations * Fix tests * Add observed attestations to the beacon chain * Add attestation observation to processing code * Add progress on attestation verification * Add first draft of ObservedAttesters * Add more tests * Add observed attesters to beacon chain * Add observers to attestation processing * Add more attestation verification * Create ObservedAggregators map * Remove commented-out code * Add observed aggregators into chain * Add progress * Finish adding features to attestation verification * Ensure beacon chain compiles * Link attn verification into chain * Integrate new attn verification in chain * Remove old attestation processing code * Start trying to fix beacon_chain tests * Split adding into pools into two functions * Add aggregation to harness * Get test harness working again * Adjust the number of aggregators for test harness * Fix edge-case in harness * Integrate new attn processing in network * Fix compile bug in validator_client * Update validator API endpoints * Fix aggreagation in test harness * Fix enum thing * Fix attestation observation bug: * Patch failing API tests * Start adding comments to attestation verification * Remove unused attestation field * Unify "is block known" logic * Update comments * Supress fork choice errors for network processing * Add todos * Tidy * Add gossip attn tests * Disallow test harness to produce old attns * Comment out in-progress tests * Partially address pruning tests * Fix failing store test * Add aggregate tests * Add comments about which spec conditions we check * Dont re-aggregate * Split apart test harness attn production * Fix compile error in network * Make progress on commented-out test * Fix skipping attestation test * Add fork choice verification tests * Tidy attn tests, remove dead code * Remove some accidentally added code * Fix clippy lint * Rename test file * Add block tests, add cheap block proposer check * Rename block testing file * Add observed_block_producers * Tidy * Switch around block signature verification * Finish block testing * Remove gossip from signature tests * First pass of self review * Fix deviation in spec * Update test spec tags * Start moving over to hashset * Finish moving observed attesters to hashmap * Move aggregation pool over to hashmap * Make fc attn borrow again * Fix rest_api compile error * Fix missing comments * Fix monster test * Uncomment increasing slots test * Address remaining comments * Remove unsafe, use cfg test * Remove cfg test flag * Fix dodgy comment * Revert "Update hashmap hashset to stable futures" This reverts commit d432378a3cc5cd67fc29c0b15b96b886c1323554. * Revert "Adds panic test to hashset delay" This reverts commit 281502396fc5b90d9c421a309c2c056982c9525b. * Ported attestation_service * Ported duties_service * Ported fork_service * More ports * Port block_service * Minor fixes * VC compiles * Update TODOS * Borrow self where possible * Ignore aggregates that are already known. * Unify aggregator modulo logic * Fix typo in logs * Refactor validator subscription logic * Avoid reproducing selection proof * Skip HTTP call if no subscriptions * Rename DutyAndState -> DutyAndProof * Tidy logs * Print root as dbg * Fix compile errors in tests * Fix compile error in test * Re-Fix attestation and duties service * Minor fixes Co-authored-by: Paul Hauner <paul@paulhauner.com> * Network crate update to stable futures * Port account_manager to stable futures (#1121) * Port account_manager to stable futures * Run async fns in tokio environment * Port rest_api crate to stable futures (#1118) * Port rest_api lib to stable futures * Reduce tokio features * Update notifier to stable futures * Builder update * Further updates * Convert self referential async functions * stable futures fixes (#1124) * Fix eth1 update functions * Fix genesis and client * Fix beacon node lib * Return appropriate runtimes from environment * Fix test rig * Refactor eth1 service update * Upgrade simulator to stable futures * Lighthouse compiles on stable futures * Remove println debugging statement * Update libp2p service, start rpc test upgrade * Update network crate for new libp2p * Update tokio::codec to futures_codec (#1128) * Further work towards RPC corrections * Correct http timeout and network service select * Use tokio runtime for libp2p * Revert "Update tokio::codec to futures_codec (#1128)" This reverts commit e57aea924acf5cbabdcea18895ac07e38a425ed7. * Upgrade RPC libp2p tests * Upgrade secio fallback test * Upgrade gossipsub examples * Clean up RPC protocol * Test fixes (#1133) * Correct websocket timeout and run on os thread * Fix network test * Clean up PR * Correct tokio tcp move attestation service tests * Upgrade attestation service tests * Correct network test * Correct genesis test * Test corrections * Log info when block is received * Modify logs and update attester service events * Stable futures: fixes to vc, eth1 and account manager (#1142) * Add local testnet scripts * Remove whiteblock script * Rename local testnet script * Move spawns onto handle * Fix VC panic * Initial fix to block production issue * Tidy block producer fix * Tidy further * Add local testnet clean script * Run cargo fmt * Tidy duties service * Tidy fork service * Tidy ForkService * Tidy AttestationService * Tidy notifier * Ensure await is not suppressed in eth1 * Ensure await is not suppressed in account_manager * Use .ok() instead of .unwrap_or(()) * RPC decoding test for proto * Update discv5 and eth2-libp2p deps * Fix lcli double runtime issue (#1144) * Handle stream termination and dialing peer errors * Correct peer_info variant types * Remove unnecessary warnings * Handle subnet unsubscription removal and improve logigng * Add logs around ping * Upgrade discv5 and improve logging * Handle peer connection status for multiple connections * Improve network service logging * Improve logging around peer manager * Upgrade swarm poll centralise peer management * Identify clients on error * Fix `remove_peer` in sync (#1150) * remove_peer removes from all chains * Remove logs * Fix early return from loop * Improved logging, fix panic * Partially correct tests * Stable futures: Vc sync (#1149) * Improve syncing heuristic * Add comments * Use safer method for tolerance * Fix tests * Stable futures: Fix VC bug, update agg pool, add more metrics (#1151) * Expose epoch processing summary * Expose participation metrics to prometheus * Switch to f64 * Reduce precision * Change precision * Expose observed attesters metrics * Add metrics for agg/unagg attn counts * Add metrics for gossip rx * Add metrics for gossip tx * Adds ignored attns to prom * Add attestation timing * Add timer for aggregation pool sig agg * Add write lock timer for agg pool * Add more metrics to agg pool * Change map lock code * Add extra metric to agg pool * Change lock handling in agg pool * Change .write() to .read() * Add another agg pool timer * Fix for is_aggregator * Fix pruning bug Co-authored-by: pawan <pawandhananjay@gmail.com> Co-authored-by: Paul Hauner <paul@paulhauner.com>
2020-05-17 11:16:48 +00:00
ProtocolsHandlerEvent<
Self::OutboundProtocol,
Self::OutboundOpenInfo,
Self::OutEvent,
Self::Error,
>,
> {
if let Some(waker) = &self.waker {
if waker.will_wake(cx.waker()) {
self.waker = Some(cx.waker().clone());
}
} else {
self.waker = Some(cx.waker().clone());
}
// return any events that need to be reported
if !self.events_out.is_empty() {
More metrics + RPC tweaks (#2041) ## Issue Addressed NA ## Proposed Changes This was mostly done to find the reason why LH was dropping peers from Nimbus. It proved to be useful so I think it's worth it. But there is also some functional stuff here - Add metrics for rpc errors per client, error type and direction - Add metrics for downscoring events per source type, client and penalty type - Add metrics for gossip validation results per client for non-accepted messages - Make the RPC handler return errors and requests/responses in the order we see them - Allow a small burst for the Ping rate limit, from 1 every 5 seconds to 2 every 10 seconds - Send rate limiting errors with a particular code and use that same code to identify them. I picked something different to 128 since that is most likely what other clients are using for their own errors - Remove some unused code in the `PeerAction` and the rpc handler - Remove the unused variant `RateLimited`. tTis was never produced directly, since the only way to get the request's protocol is via de handler. The handler upon receiving from LH a response with an error (rate limited in this case) emits this event with the missing info (It was always like this, just pointing out that we do downscore rate limiting errors regardless of the change) Metrics for Nimbus looked like this: Downscoring events: `increase(libp2p_peer_actions_per_client{client="Nimbus"}[5m])` ![image](https://user-images.githubusercontent.com/26765164/101210880-862bf280-3676-11eb-94c0-399f0bf5aa2e.png) RPC Errors: `increase(libp2p_rpc_errors_per_client{client="Nimbus"}[5m])` ![image](https://user-images.githubusercontent.com/26765164/101210997-ba071800-3676-11eb-847a-f32405ede002.png) Unaccepted gossip message: `increase(gossipsub_unaccepted_messages_per_client{client="Nimbus"}[5m])` ![image](https://user-images.githubusercontent.com/26765164/101211124-f470b500-3676-11eb-9459-132ecff058ec.png)
2020-12-08 03:55:50 +00:00
return Poll::Ready(ProtocolsHandlerEvent::Custom(self.events_out.remove(0)));
} else {
self.events_out.shrink_to_fit();
}
// Check if we are shutting down, and if the timer ran out
if let HandlerState::ShuttingDown(delay) = &self.state {
if delay.is_elapsed() {
self.state = HandlerState::Deactivated;
debug!(self.log, "Handler deactivated");
return Poll::Ready(ProtocolsHandlerEvent::Close(RPCError::InternalError(
"Shutdown timeout",
)));
}
}
// purge expired inbound substreams and send an error
Stable futures (#879) * Port eth1 lib to use stable futures * Port eth1_test_rig to stable futures * Port eth1 tests to stable futures * Port genesis service to stable futures * Port genesis tests to stable futures * Port beacon_chain to stable futures * Port lcli to stable futures * Fix eth1_test_rig (#1014) * Fix lcli * Port timer to stable futures * Fix timer * Port websocket_server to stable futures * Port notifier to stable futures * Add TODOS * Update hashmap hashset to stable futures * Adds panic test to hashset delay * Port remote_beacon_node to stable futures * Fix lcli merge conflicts * Non rpc stuff compiles * protocol.rs compiles * Port websockets, timer and notifier to stable futures (#1035) * Fix lcli * Port timer to stable futures * Fix timer * Port websocket_server to stable futures * Port notifier to stable futures * Add TODOS * Port remote_beacon_node to stable futures * Partial eth2-libp2p stable future upgrade * Finished first round of fighting RPC types * Further progress towards porting eth2-libp2p adds caching to discovery * Update behaviour * RPC handler to stable futures * Update RPC to master libp2p * Network service additions * Fix the fallback transport construction (#1102) * Correct warning * Remove hashmap delay * Compiling version of eth2-libp2p * Update all crates versions * Fix conversion function and add tests (#1113) * Port validator_client to stable futures (#1114) * Add PH & MS slot clock changes * Account for genesis time * Add progress on duties refactor * Add simple is_aggregator bool to val subscription * Start work on attestation_verification.rs * Add progress on ObservedAttestations * Progress with ObservedAttestations * Fix tests * Add observed attestations to the beacon chain * Add attestation observation to processing code * Add progress on attestation verification * Add first draft of ObservedAttesters * Add more tests * Add observed attesters to beacon chain * Add observers to attestation processing * Add more attestation verification * Create ObservedAggregators map * Remove commented-out code * Add observed aggregators into chain * Add progress * Finish adding features to attestation verification * Ensure beacon chain compiles * Link attn verification into chain * Integrate new attn verification in chain * Remove old attestation processing code * Start trying to fix beacon_chain tests * Split adding into pools into two functions * Add aggregation to harness * Get test harness working again * Adjust the number of aggregators for test harness * Fix edge-case in harness * Integrate new attn processing in network * Fix compile bug in validator_client * Update validator API endpoints * Fix aggreagation in test harness * Fix enum thing * Fix attestation observation bug: * Patch failing API tests * Start adding comments to attestation verification * Remove unused attestation field * Unify "is block known" logic * Update comments * Supress fork choice errors for network processing * Add todos * Tidy * Add gossip attn tests * Disallow test harness to produce old attns * Comment out in-progress tests * Partially address pruning tests * Fix failing store test * Add aggregate tests * Add comments about which spec conditions we check * Dont re-aggregate * Split apart test harness attn production * Fix compile error in network * Make progress on commented-out test * Fix skipping attestation test * Add fork choice verification tests * Tidy attn tests, remove dead code * Remove some accidentally added code * Fix clippy lint * Rename test file * Add block tests, add cheap block proposer check * Rename block testing file * Add observed_block_producers * Tidy * Switch around block signature verification * Finish block testing * Remove gossip from signature tests * First pass of self review * Fix deviation in spec * Update test spec tags * Start moving over to hashset * Finish moving observed attesters to hashmap * Move aggregation pool over to hashmap * Make fc attn borrow again * Fix rest_api compile error * Fix missing comments * Fix monster test * Uncomment increasing slots test * Address remaining comments * Remove unsafe, use cfg test * Remove cfg test flag * Fix dodgy comment * Revert "Update hashmap hashset to stable futures" This reverts commit d432378a3cc5cd67fc29c0b15b96b886c1323554. * Revert "Adds panic test to hashset delay" This reverts commit 281502396fc5b90d9c421a309c2c056982c9525b. * Ported attestation_service * Ported duties_service * Ported fork_service * More ports * Port block_service * Minor fixes * VC compiles * Update TODOS * Borrow self where possible * Ignore aggregates that are already known. * Unify aggregator modulo logic * Fix typo in logs * Refactor validator subscription logic * Avoid reproducing selection proof * Skip HTTP call if no subscriptions * Rename DutyAndState -> DutyAndProof * Tidy logs * Print root as dbg * Fix compile errors in tests * Fix compile error in test * Re-Fix attestation and duties service * Minor fixes Co-authored-by: Paul Hauner <paul@paulhauner.com> * Network crate update to stable futures * Port account_manager to stable futures (#1121) * Port account_manager to stable futures * Run async fns in tokio environment * Port rest_api crate to stable futures (#1118) * Port rest_api lib to stable futures * Reduce tokio features * Update notifier to stable futures * Builder update * Further updates * Convert self referential async functions * stable futures fixes (#1124) * Fix eth1 update functions * Fix genesis and client * Fix beacon node lib * Return appropriate runtimes from environment * Fix test rig * Refactor eth1 service update * Upgrade simulator to stable futures * Lighthouse compiles on stable futures * Remove println debugging statement * Update libp2p service, start rpc test upgrade * Update network crate for new libp2p * Update tokio::codec to futures_codec (#1128) * Further work towards RPC corrections * Correct http timeout and network service select * Use tokio runtime for libp2p * Revert "Update tokio::codec to futures_codec (#1128)" This reverts commit e57aea924acf5cbabdcea18895ac07e38a425ed7. * Upgrade RPC libp2p tests * Upgrade secio fallback test * Upgrade gossipsub examples * Clean up RPC protocol * Test fixes (#1133) * Correct websocket timeout and run on os thread * Fix network test * Clean up PR * Correct tokio tcp move attestation service tests * Upgrade attestation service tests * Correct network test * Correct genesis test * Test corrections * Log info when block is received * Modify logs and update attester service events * Stable futures: fixes to vc, eth1 and account manager (#1142) * Add local testnet scripts * Remove whiteblock script * Rename local testnet script * Move spawns onto handle * Fix VC panic * Initial fix to block production issue * Tidy block producer fix * Tidy further * Add local testnet clean script * Run cargo fmt * Tidy duties service * Tidy fork service * Tidy ForkService * Tidy AttestationService * Tidy notifier * Ensure await is not suppressed in eth1 * Ensure await is not suppressed in account_manager * Use .ok() instead of .unwrap_or(()) * RPC decoding test for proto * Update discv5 and eth2-libp2p deps * Fix lcli double runtime issue (#1144) * Handle stream termination and dialing peer errors * Correct peer_info variant types * Remove unnecessary warnings * Handle subnet unsubscription removal and improve logigng * Add logs around ping * Upgrade discv5 and improve logging * Handle peer connection status for multiple connections * Improve network service logging * Improve logging around peer manager * Upgrade swarm poll centralise peer management * Identify clients on error * Fix `remove_peer` in sync (#1150) * remove_peer removes from all chains * Remove logs * Fix early return from loop * Improved logging, fix panic * Partially correct tests * Stable futures: Vc sync (#1149) * Improve syncing heuristic * Add comments * Use safer method for tolerance * Fix tests * Stable futures: Fix VC bug, update agg pool, add more metrics (#1151) * Expose epoch processing summary * Expose participation metrics to prometheus * Switch to f64 * Reduce precision * Change precision * Expose observed attesters metrics * Add metrics for agg/unagg attn counts * Add metrics for gossip rx * Add metrics for gossip tx * Adds ignored attns to prom * Add attestation timing * Add timer for aggregation pool sig agg * Add write lock timer for agg pool * Add more metrics to agg pool * Change map lock code * Add extra metric to agg pool * Change lock handling in agg pool * Change .write() to .read() * Add another agg pool timer * Fix for is_aggregator * Fix pruning bug Co-authored-by: pawan <pawandhananjay@gmail.com> Co-authored-by: Paul Hauner <paul@paulhauner.com>
2020-05-17 11:16:48 +00:00
loop {
match self.inbound_substreams_delay.poll_expired(cx) {
Poll::Ready(Some(Ok(inbound_id))) => {
Stable futures (#879) * Port eth1 lib to use stable futures * Port eth1_test_rig to stable futures * Port eth1 tests to stable futures * Port genesis service to stable futures * Port genesis tests to stable futures * Port beacon_chain to stable futures * Port lcli to stable futures * Fix eth1_test_rig (#1014) * Fix lcli * Port timer to stable futures * Fix timer * Port websocket_server to stable futures * Port notifier to stable futures * Add TODOS * Update hashmap hashset to stable futures * Adds panic test to hashset delay * Port remote_beacon_node to stable futures * Fix lcli merge conflicts * Non rpc stuff compiles * protocol.rs compiles * Port websockets, timer and notifier to stable futures (#1035) * Fix lcli * Port timer to stable futures * Fix timer * Port websocket_server to stable futures * Port notifier to stable futures * Add TODOS * Port remote_beacon_node to stable futures * Partial eth2-libp2p stable future upgrade * Finished first round of fighting RPC types * Further progress towards porting eth2-libp2p adds caching to discovery * Update behaviour * RPC handler to stable futures * Update RPC to master libp2p * Network service additions * Fix the fallback transport construction (#1102) * Correct warning * Remove hashmap delay * Compiling version of eth2-libp2p * Update all crates versions * Fix conversion function and add tests (#1113) * Port validator_client to stable futures (#1114) * Add PH & MS slot clock changes * Account for genesis time * Add progress on duties refactor * Add simple is_aggregator bool to val subscription * Start work on attestation_verification.rs * Add progress on ObservedAttestations * Progress with ObservedAttestations * Fix tests * Add observed attestations to the beacon chain * Add attestation observation to processing code * Add progress on attestation verification * Add first draft of ObservedAttesters * Add more tests * Add observed attesters to beacon chain * Add observers to attestation processing * Add more attestation verification * Create ObservedAggregators map * Remove commented-out code * Add observed aggregators into chain * Add progress * Finish adding features to attestation verification * Ensure beacon chain compiles * Link attn verification into chain * Integrate new attn verification in chain * Remove old attestation processing code * Start trying to fix beacon_chain tests * Split adding into pools into two functions * Add aggregation to harness * Get test harness working again * Adjust the number of aggregators for test harness * Fix edge-case in harness * Integrate new attn processing in network * Fix compile bug in validator_client * Update validator API endpoints * Fix aggreagation in test harness * Fix enum thing * Fix attestation observation bug: * Patch failing API tests * Start adding comments to attestation verification * Remove unused attestation field * Unify "is block known" logic * Update comments * Supress fork choice errors for network processing * Add todos * Tidy * Add gossip attn tests * Disallow test harness to produce old attns * Comment out in-progress tests * Partially address pruning tests * Fix failing store test * Add aggregate tests * Add comments about which spec conditions we check * Dont re-aggregate * Split apart test harness attn production * Fix compile error in network * Make progress on commented-out test * Fix skipping attestation test * Add fork choice verification tests * Tidy attn tests, remove dead code * Remove some accidentally added code * Fix clippy lint * Rename test file * Add block tests, add cheap block proposer check * Rename block testing file * Add observed_block_producers * Tidy * Switch around block signature verification * Finish block testing * Remove gossip from signature tests * First pass of self review * Fix deviation in spec * Update test spec tags * Start moving over to hashset * Finish moving observed attesters to hashmap * Move aggregation pool over to hashmap * Make fc attn borrow again * Fix rest_api compile error * Fix missing comments * Fix monster test * Uncomment increasing slots test * Address remaining comments * Remove unsafe, use cfg test * Remove cfg test flag * Fix dodgy comment * Revert "Update hashmap hashset to stable futures" This reverts commit d432378a3cc5cd67fc29c0b15b96b886c1323554. * Revert "Adds panic test to hashset delay" This reverts commit 281502396fc5b90d9c421a309c2c056982c9525b. * Ported attestation_service * Ported duties_service * Ported fork_service * More ports * Port block_service * Minor fixes * VC compiles * Update TODOS * Borrow self where possible * Ignore aggregates that are already known. * Unify aggregator modulo logic * Fix typo in logs * Refactor validator subscription logic * Avoid reproducing selection proof * Skip HTTP call if no subscriptions * Rename DutyAndState -> DutyAndProof * Tidy logs * Print root as dbg * Fix compile errors in tests * Fix compile error in test * Re-Fix attestation and duties service * Minor fixes Co-authored-by: Paul Hauner <paul@paulhauner.com> * Network crate update to stable futures * Port account_manager to stable futures (#1121) * Port account_manager to stable futures * Run async fns in tokio environment * Port rest_api crate to stable futures (#1118) * Port rest_api lib to stable futures * Reduce tokio features * Update notifier to stable futures * Builder update * Further updates * Convert self referential async functions * stable futures fixes (#1124) * Fix eth1 update functions * Fix genesis and client * Fix beacon node lib * Return appropriate runtimes from environment * Fix test rig * Refactor eth1 service update * Upgrade simulator to stable futures * Lighthouse compiles on stable futures * Remove println debugging statement * Update libp2p service, start rpc test upgrade * Update network crate for new libp2p * Update tokio::codec to futures_codec (#1128) * Further work towards RPC corrections * Correct http timeout and network service select * Use tokio runtime for libp2p * Revert "Update tokio::codec to futures_codec (#1128)" This reverts commit e57aea924acf5cbabdcea18895ac07e38a425ed7. * Upgrade RPC libp2p tests * Upgrade secio fallback test * Upgrade gossipsub examples * Clean up RPC protocol * Test fixes (#1133) * Correct websocket timeout and run on os thread * Fix network test * Clean up PR * Correct tokio tcp move attestation service tests * Upgrade attestation service tests * Correct network test * Correct genesis test * Test corrections * Log info when block is received * Modify logs and update attester service events * Stable futures: fixes to vc, eth1 and account manager (#1142) * Add local testnet scripts * Remove whiteblock script * Rename local testnet script * Move spawns onto handle * Fix VC panic * Initial fix to block production issue * Tidy block producer fix * Tidy further * Add local testnet clean script * Run cargo fmt * Tidy duties service * Tidy fork service * Tidy ForkService * Tidy AttestationService * Tidy notifier * Ensure await is not suppressed in eth1 * Ensure await is not suppressed in account_manager * Use .ok() instead of .unwrap_or(()) * RPC decoding test for proto * Update discv5 and eth2-libp2p deps * Fix lcli double runtime issue (#1144) * Handle stream termination and dialing peer errors * Correct peer_info variant types * Remove unnecessary warnings * Handle subnet unsubscription removal and improve logigng * Add logs around ping * Upgrade discv5 and improve logging * Handle peer connection status for multiple connections * Improve network service logging * Improve logging around peer manager * Upgrade swarm poll centralise peer management * Identify clients on error * Fix `remove_peer` in sync (#1150) * remove_peer removes from all chains * Remove logs * Fix early return from loop * Improved logging, fix panic * Partially correct tests * Stable futures: Vc sync (#1149) * Improve syncing heuristic * Add comments * Use safer method for tolerance * Fix tests * Stable futures: Fix VC bug, update agg pool, add more metrics (#1151) * Expose epoch processing summary * Expose participation metrics to prometheus * Switch to f64 * Reduce precision * Change precision * Expose observed attesters metrics * Add metrics for agg/unagg attn counts * Add metrics for gossip rx * Add metrics for gossip tx * Adds ignored attns to prom * Add attestation timing * Add timer for aggregation pool sig agg * Add write lock timer for agg pool * Add more metrics to agg pool * Change map lock code * Add extra metric to agg pool * Change lock handling in agg pool * Change .write() to .read() * Add another agg pool timer * Fix for is_aggregator * Fix pruning bug Co-authored-by: pawan <pawandhananjay@gmail.com> Co-authored-by: Paul Hauner <paul@paulhauner.com>
2020-05-17 11:16:48 +00:00
// handle a stream timeout for various states
if let Some(info) = self.inbound_substreams.get_mut(inbound_id.get_ref()) {
Stable futures (#879) * Port eth1 lib to use stable futures * Port eth1_test_rig to stable futures * Port eth1 tests to stable futures * Port genesis service to stable futures * Port genesis tests to stable futures * Port beacon_chain to stable futures * Port lcli to stable futures * Fix eth1_test_rig (#1014) * Fix lcli * Port timer to stable futures * Fix timer * Port websocket_server to stable futures * Port notifier to stable futures * Add TODOS * Update hashmap hashset to stable futures * Adds panic test to hashset delay * Port remote_beacon_node to stable futures * Fix lcli merge conflicts * Non rpc stuff compiles * protocol.rs compiles * Port websockets, timer and notifier to stable futures (#1035) * Fix lcli * Port timer to stable futures * Fix timer * Port websocket_server to stable futures * Port notifier to stable futures * Add TODOS * Port remote_beacon_node to stable futures * Partial eth2-libp2p stable future upgrade * Finished first round of fighting RPC types * Further progress towards porting eth2-libp2p adds caching to discovery * Update behaviour * RPC handler to stable futures * Update RPC to master libp2p * Network service additions * Fix the fallback transport construction (#1102) * Correct warning * Remove hashmap delay * Compiling version of eth2-libp2p * Update all crates versions * Fix conversion function and add tests (#1113) * Port validator_client to stable futures (#1114) * Add PH & MS slot clock changes * Account for genesis time * Add progress on duties refactor * Add simple is_aggregator bool to val subscription * Start work on attestation_verification.rs * Add progress on ObservedAttestations * Progress with ObservedAttestations * Fix tests * Add observed attestations to the beacon chain * Add attestation observation to processing code * Add progress on attestation verification * Add first draft of ObservedAttesters * Add more tests * Add observed attesters to beacon chain * Add observers to attestation processing * Add more attestation verification * Create ObservedAggregators map * Remove commented-out code * Add observed aggregators into chain * Add progress * Finish adding features to attestation verification * Ensure beacon chain compiles * Link attn verification into chain * Integrate new attn verification in chain * Remove old attestation processing code * Start trying to fix beacon_chain tests * Split adding into pools into two functions * Add aggregation to harness * Get test harness working again * Adjust the number of aggregators for test harness * Fix edge-case in harness * Integrate new attn processing in network * Fix compile bug in validator_client * Update validator API endpoints * Fix aggreagation in test harness * Fix enum thing * Fix attestation observation bug: * Patch failing API tests * Start adding comments to attestation verification * Remove unused attestation field * Unify "is block known" logic * Update comments * Supress fork choice errors for network processing * Add todos * Tidy * Add gossip attn tests * Disallow test harness to produce old attns * Comment out in-progress tests * Partially address pruning tests * Fix failing store test * Add aggregate tests * Add comments about which spec conditions we check * Dont re-aggregate * Split apart test harness attn production * Fix compile error in network * Make progress on commented-out test * Fix skipping attestation test * Add fork choice verification tests * Tidy attn tests, remove dead code * Remove some accidentally added code * Fix clippy lint * Rename test file * Add block tests, add cheap block proposer check * Rename block testing file * Add observed_block_producers * Tidy * Switch around block signature verification * Finish block testing * Remove gossip from signature tests * First pass of self review * Fix deviation in spec * Update test spec tags * Start moving over to hashset * Finish moving observed attesters to hashmap * Move aggregation pool over to hashmap * Make fc attn borrow again * Fix rest_api compile error * Fix missing comments * Fix monster test * Uncomment increasing slots test * Address remaining comments * Remove unsafe, use cfg test * Remove cfg test flag * Fix dodgy comment * Revert "Update hashmap hashset to stable futures" This reverts commit d432378a3cc5cd67fc29c0b15b96b886c1323554. * Revert "Adds panic test to hashset delay" This reverts commit 281502396fc5b90d9c421a309c2c056982c9525b. * Ported attestation_service * Ported duties_service * Ported fork_service * More ports * Port block_service * Minor fixes * VC compiles * Update TODOS * Borrow self where possible * Ignore aggregates that are already known. * Unify aggregator modulo logic * Fix typo in logs * Refactor validator subscription logic * Avoid reproducing selection proof * Skip HTTP call if no subscriptions * Rename DutyAndState -> DutyAndProof * Tidy logs * Print root as dbg * Fix compile errors in tests * Fix compile error in test * Re-Fix attestation and duties service * Minor fixes Co-authored-by: Paul Hauner <paul@paulhauner.com> * Network crate update to stable futures * Port account_manager to stable futures (#1121) * Port account_manager to stable futures * Run async fns in tokio environment * Port rest_api crate to stable futures (#1118) * Port rest_api lib to stable futures * Reduce tokio features * Update notifier to stable futures * Builder update * Further updates * Convert self referential async functions * stable futures fixes (#1124) * Fix eth1 update functions * Fix genesis and client * Fix beacon node lib * Return appropriate runtimes from environment * Fix test rig * Refactor eth1 service update * Upgrade simulator to stable futures * Lighthouse compiles on stable futures * Remove println debugging statement * Update libp2p service, start rpc test upgrade * Update network crate for new libp2p * Update tokio::codec to futures_codec (#1128) * Further work towards RPC corrections * Correct http timeout and network service select * Use tokio runtime for libp2p * Revert "Update tokio::codec to futures_codec (#1128)" This reverts commit e57aea924acf5cbabdcea18895ac07e38a425ed7. * Upgrade RPC libp2p tests * Upgrade secio fallback test * Upgrade gossipsub examples * Clean up RPC protocol * Test fixes (#1133) * Correct websocket timeout and run on os thread * Fix network test * Clean up PR * Correct tokio tcp move attestation service tests * Upgrade attestation service tests * Correct network test * Correct genesis test * Test corrections * Log info when block is received * Modify logs and update attester service events * Stable futures: fixes to vc, eth1 and account manager (#1142) * Add local testnet scripts * Remove whiteblock script * Rename local testnet script * Move spawns onto handle * Fix VC panic * Initial fix to block production issue * Tidy block producer fix * Tidy further * Add local testnet clean script * Run cargo fmt * Tidy duties service * Tidy fork service * Tidy ForkService * Tidy AttestationService * Tidy notifier * Ensure await is not suppressed in eth1 * Ensure await is not suppressed in account_manager * Use .ok() instead of .unwrap_or(()) * RPC decoding test for proto * Update discv5 and eth2-libp2p deps * Fix lcli double runtime issue (#1144) * Handle stream termination and dialing peer errors * Correct peer_info variant types * Remove unnecessary warnings * Handle subnet unsubscription removal and improve logigng * Add logs around ping * Upgrade discv5 and improve logging * Handle peer connection status for multiple connections * Improve network service logging * Improve logging around peer manager * Upgrade swarm poll centralise peer management * Identify clients on error * Fix `remove_peer` in sync (#1150) * remove_peer removes from all chains * Remove logs * Fix early return from loop * Improved logging, fix panic * Partially correct tests * Stable futures: Vc sync (#1149) * Improve syncing heuristic * Add comments * Use safer method for tolerance * Fix tests * Stable futures: Fix VC bug, update agg pool, add more metrics (#1151) * Expose epoch processing summary * Expose participation metrics to prometheus * Switch to f64 * Reduce precision * Change precision * Expose observed attesters metrics * Add metrics for agg/unagg attn counts * Add metrics for gossip rx * Add metrics for gossip tx * Adds ignored attns to prom * Add attestation timing * Add timer for aggregation pool sig agg * Add write lock timer for agg pool * Add more metrics to agg pool * Change map lock code * Add extra metric to agg pool * Change lock handling in agg pool * Change .write() to .read() * Add another agg pool timer * Fix for is_aggregator * Fix pruning bug Co-authored-by: pawan <pawandhananjay@gmail.com> Co-authored-by: Paul Hauner <paul@paulhauner.com>
2020-05-17 11:16:48 +00:00
// the delay has been removed
info.delay_key = None;
More metrics + RPC tweaks (#2041) ## Issue Addressed NA ## Proposed Changes This was mostly done to find the reason why LH was dropping peers from Nimbus. It proved to be useful so I think it's worth it. But there is also some functional stuff here - Add metrics for rpc errors per client, error type and direction - Add metrics for downscoring events per source type, client and penalty type - Add metrics for gossip validation results per client for non-accepted messages - Make the RPC handler return errors and requests/responses in the order we see them - Allow a small burst for the Ping rate limit, from 1 every 5 seconds to 2 every 10 seconds - Send rate limiting errors with a particular code and use that same code to identify them. I picked something different to 128 since that is most likely what other clients are using for their own errors - Remove some unused code in the `PeerAction` and the rpc handler - Remove the unused variant `RateLimited`. tTis was never produced directly, since the only way to get the request's protocol is via de handler. The handler upon receiving from LH a response with an error (rate limited in this case) emits this event with the missing info (It was always like this, just pointing out that we do downscore rate limiting errors regardless of the change) Metrics for Nimbus looked like this: Downscoring events: `increase(libp2p_peer_actions_per_client{client="Nimbus"}[5m])` ![image](https://user-images.githubusercontent.com/26765164/101210880-862bf280-3676-11eb-94c0-399f0bf5aa2e.png) RPC Errors: `increase(libp2p_rpc_errors_per_client{client="Nimbus"}[5m])` ![image](https://user-images.githubusercontent.com/26765164/101210997-ba071800-3676-11eb-847a-f32405ede002.png) Unaccepted gossip message: `increase(gossipsub_unaccepted_messages_per_client{client="Nimbus"}[5m])` ![image](https://user-images.githubusercontent.com/26765164/101211124-f470b500-3676-11eb-9459-132ecff058ec.png)
2020-12-08 03:55:50 +00:00
self.events_out.push(Err(HandlerErr::Inbound {
error: RPCError::StreamTimeout,
More metrics + RPC tweaks (#2041) ## Issue Addressed NA ## Proposed Changes This was mostly done to find the reason why LH was dropping peers from Nimbus. It proved to be useful so I think it's worth it. But there is also some functional stuff here - Add metrics for rpc errors per client, error type and direction - Add metrics for downscoring events per source type, client and penalty type - Add metrics for gossip validation results per client for non-accepted messages - Make the RPC handler return errors and requests/responses in the order we see them - Allow a small burst for the Ping rate limit, from 1 every 5 seconds to 2 every 10 seconds - Send rate limiting errors with a particular code and use that same code to identify them. I picked something different to 128 since that is most likely what other clients are using for their own errors - Remove some unused code in the `PeerAction` and the rpc handler - Remove the unused variant `RateLimited`. tTis was never produced directly, since the only way to get the request's protocol is via de handler. The handler upon receiving from LH a response with an error (rate limited in this case) emits this event with the missing info (It was always like this, just pointing out that we do downscore rate limiting errors regardless of the change) Metrics for Nimbus looked like this: Downscoring events: `increase(libp2p_peer_actions_per_client{client="Nimbus"}[5m])` ![image](https://user-images.githubusercontent.com/26765164/101210880-862bf280-3676-11eb-94c0-399f0bf5aa2e.png) RPC Errors: `increase(libp2p_rpc_errors_per_client{client="Nimbus"}[5m])` ![image](https://user-images.githubusercontent.com/26765164/101210997-ba071800-3676-11eb-847a-f32405ede002.png) Unaccepted gossip message: `increase(gossipsub_unaccepted_messages_per_client{client="Nimbus"}[5m])` ![image](https://user-images.githubusercontent.com/26765164/101211124-f470b500-3676-11eb-9459-132ecff058ec.png)
2020-12-08 03:55:50 +00:00
proto: info.protocol,
id: *inbound_id.get_ref(),
}));
Investigate and correct RPC Response Timeouts (#2804) RPC Responses are for some reason not removing their timeout when they are completing. As an example: ``` Nov 09 01:18:20.256 DEBG Received BlocksByRange Request step: 1, start_slot: 728465, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:20.263 DEBG Received BlocksByRange Request step: 1, start_slot: 728593, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:20.483 DEBG BlocksByRange Response sent returned: 63, requested: 64, current_slot: 2466389, start_slot: 728465, msg: Failed to return all requested blocks, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:20.500 DEBG BlocksByRange Response sent returned: 64, requested: 64, current_slot: 2466389, start_slot: 728593, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:21.068 DEBG Received BlocksByRange Request step: 1, start_slot: 728529, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:21.272 DEBG BlocksByRange Response sent returned: 63, requested: 64, current_slot: 2466389, start_slot: 728529, msg: Failed to return all requested blocks, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:23.434 DEBG Received BlocksByRange Request step: 1, start_slot: 728657, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:23.665 DEBG BlocksByRange Response sent returned: 64, requested: 64, current_slot: 2466390, start_slot: 728657, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:25.851 DEBG Received BlocksByRange Request step: 1, start_slot: 728337, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:25.851 DEBG Received BlocksByRange Request step: 1, start_slot: 728401, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:26.094 DEBG BlocksByRange Response sent returned: 62, requested: 64, current_slot: 2466390, start_slot: 728401, msg: Failed to return all requested blocks, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:26.100 DEBG BlocksByRange Response sent returned: 63, requested: 64, current_slot: 2466390, start_slot: 728337, msg: Failed to return all requested blocks, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:31.070 DEBG RPC Error direction: Incoming, score: 0, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, client: Prysm: version: a80b1c252a9b4773493b41999769bf3134ac373f, os_version: unknown, err: Stream Timeout, protocol: beacon_blocks_by_range, service: libp2p Nov 09 01:18:31.070 WARN Timed out to a peer's request. Likely insufficient resources, reduce peer count, service: libp2p Nov 09 01:18:31.085 DEBG RPC Error direction: Incoming, score: 0, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, client: Prysm: version: a80b1c252a9b4773493b41999769bf3134ac373f, os_version: unknown, err: Stream Timeout, protocol: beacon_blocks_by_range, service: libp2p Nov 09 01:18:31.085 WARN Timed out to a peer's request. Likely insufficient resources, reduce peer count, service: libp2p Nov 09 01:18:31.459 DEBG RPC Error direction: Incoming, score: 0, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, client: Prysm: version: a80b1c252a9b4773493b41999769bf3134ac373f, os_version: unknown, err: Stream Timeout, protocol: beacon_blocks_by_range, service: libp2p Nov 09 01:18:31.459 WARN Timed out to a peer's request. Likely insufficient resources, reduce peer count, service: libp2p Nov 09 01:18:34.129 DEBG RPC Error direction: Incoming, score: 0, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, client: Prysm: version: a80b1c252a9b4773493b41999769bf3134ac373f, os_version: unknown, err: Stream Timeout, protocol: beacon_blocks_by_range, service: libp2p Nov 09 01:18:34.130 WARN Timed out to a peer's request. Likely insufficient resources, reduce peer count, service: libp2p Nov 09 01:18:35.686 DEBG Peer Manager disconnecting peer reason: Too many peers, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, service: libp2p ``` This PR is to investigate and correct the issue. ~~My current thoughts are that for some reason we are not closing the streams correctly, or fast enough, or the executor is not registering the closes and waking up.~~ - Pretty sure this is not the case, see message below for a more accurate reason. ~~I've currently added a timeout to stream closures in an attempt to force streams to close and the future to always complete.~~ I removed this
2021-11-16 03:42:25 +00:00
if info.pending_items.back().map(|l| l.close_after()) == Some(false) {
// if the last chunk does not close the stream, append an error
Investigate and correct RPC Response Timeouts (#2804) RPC Responses are for some reason not removing their timeout when they are completing. As an example: ``` Nov 09 01:18:20.256 DEBG Received BlocksByRange Request step: 1, start_slot: 728465, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:20.263 DEBG Received BlocksByRange Request step: 1, start_slot: 728593, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:20.483 DEBG BlocksByRange Response sent returned: 63, requested: 64, current_slot: 2466389, start_slot: 728465, msg: Failed to return all requested blocks, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:20.500 DEBG BlocksByRange Response sent returned: 64, requested: 64, current_slot: 2466389, start_slot: 728593, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:21.068 DEBG Received BlocksByRange Request step: 1, start_slot: 728529, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:21.272 DEBG BlocksByRange Response sent returned: 63, requested: 64, current_slot: 2466389, start_slot: 728529, msg: Failed to return all requested blocks, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:23.434 DEBG Received BlocksByRange Request step: 1, start_slot: 728657, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:23.665 DEBG BlocksByRange Response sent returned: 64, requested: 64, current_slot: 2466390, start_slot: 728657, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:25.851 DEBG Received BlocksByRange Request step: 1, start_slot: 728337, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:25.851 DEBG Received BlocksByRange Request step: 1, start_slot: 728401, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:26.094 DEBG BlocksByRange Response sent returned: 62, requested: 64, current_slot: 2466390, start_slot: 728401, msg: Failed to return all requested blocks, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:26.100 DEBG BlocksByRange Response sent returned: 63, requested: 64, current_slot: 2466390, start_slot: 728337, msg: Failed to return all requested blocks, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:31.070 DEBG RPC Error direction: Incoming, score: 0, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, client: Prysm: version: a80b1c252a9b4773493b41999769bf3134ac373f, os_version: unknown, err: Stream Timeout, protocol: beacon_blocks_by_range, service: libp2p Nov 09 01:18:31.070 WARN Timed out to a peer's request. Likely insufficient resources, reduce peer count, service: libp2p Nov 09 01:18:31.085 DEBG RPC Error direction: Incoming, score: 0, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, client: Prysm: version: a80b1c252a9b4773493b41999769bf3134ac373f, os_version: unknown, err: Stream Timeout, protocol: beacon_blocks_by_range, service: libp2p Nov 09 01:18:31.085 WARN Timed out to a peer's request. Likely insufficient resources, reduce peer count, service: libp2p Nov 09 01:18:31.459 DEBG RPC Error direction: Incoming, score: 0, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, client: Prysm: version: a80b1c252a9b4773493b41999769bf3134ac373f, os_version: unknown, err: Stream Timeout, protocol: beacon_blocks_by_range, service: libp2p Nov 09 01:18:31.459 WARN Timed out to a peer's request. Likely insufficient resources, reduce peer count, service: libp2p Nov 09 01:18:34.129 DEBG RPC Error direction: Incoming, score: 0, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, client: Prysm: version: a80b1c252a9b4773493b41999769bf3134ac373f, os_version: unknown, err: Stream Timeout, protocol: beacon_blocks_by_range, service: libp2p Nov 09 01:18:34.130 WARN Timed out to a peer's request. Likely insufficient resources, reduce peer count, service: libp2p Nov 09 01:18:35.686 DEBG Peer Manager disconnecting peer reason: Too many peers, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, service: libp2p ``` This PR is to investigate and correct the issue. ~~My current thoughts are that for some reason we are not closing the streams correctly, or fast enough, or the executor is not registering the closes and waking up.~~ - Pretty sure this is not the case, see message below for a more accurate reason. ~~I've currently added a timeout to stream closures in an attempt to force streams to close and the future to always complete.~~ I removed this
2021-11-16 03:42:25 +00:00
info.pending_items.push_back(RPCCodedResponse::Error(
RPCResponseErrorCode::ServerError,
"Request timed out".into(),
));
}
Stable futures (#879) * Port eth1 lib to use stable futures * Port eth1_test_rig to stable futures * Port eth1 tests to stable futures * Port genesis service to stable futures * Port genesis tests to stable futures * Port beacon_chain to stable futures * Port lcli to stable futures * Fix eth1_test_rig (#1014) * Fix lcli * Port timer to stable futures * Fix timer * Port websocket_server to stable futures * Port notifier to stable futures * Add TODOS * Update hashmap hashset to stable futures * Adds panic test to hashset delay * Port remote_beacon_node to stable futures * Fix lcli merge conflicts * Non rpc stuff compiles * protocol.rs compiles * Port websockets, timer and notifier to stable futures (#1035) * Fix lcli * Port timer to stable futures * Fix timer * Port websocket_server to stable futures * Port notifier to stable futures * Add TODOS * Port remote_beacon_node to stable futures * Partial eth2-libp2p stable future upgrade * Finished first round of fighting RPC types * Further progress towards porting eth2-libp2p adds caching to discovery * Update behaviour * RPC handler to stable futures * Update RPC to master libp2p * Network service additions * Fix the fallback transport construction (#1102) * Correct warning * Remove hashmap delay * Compiling version of eth2-libp2p * Update all crates versions * Fix conversion function and add tests (#1113) * Port validator_client to stable futures (#1114) * Add PH & MS slot clock changes * Account for genesis time * Add progress on duties refactor * Add simple is_aggregator bool to val subscription * Start work on attestation_verification.rs * Add progress on ObservedAttestations * Progress with ObservedAttestations * Fix tests * Add observed attestations to the beacon chain * Add attestation observation to processing code * Add progress on attestation verification * Add first draft of ObservedAttesters * Add more tests * Add observed attesters to beacon chain * Add observers to attestation processing * Add more attestation verification * Create ObservedAggregators map * Remove commented-out code * Add observed aggregators into chain * Add progress * Finish adding features to attestation verification * Ensure beacon chain compiles * Link attn verification into chain * Integrate new attn verification in chain * Remove old attestation processing code * Start trying to fix beacon_chain tests * Split adding into pools into two functions * Add aggregation to harness * Get test harness working again * Adjust the number of aggregators for test harness * Fix edge-case in harness * Integrate new attn processing in network * Fix compile bug in validator_client * Update validator API endpoints * Fix aggreagation in test harness * Fix enum thing * Fix attestation observation bug: * Patch failing API tests * Start adding comments to attestation verification * Remove unused attestation field * Unify "is block known" logic * Update comments * Supress fork choice errors for network processing * Add todos * Tidy * Add gossip attn tests * Disallow test harness to produce old attns * Comment out in-progress tests * Partially address pruning tests * Fix failing store test * Add aggregate tests * Add comments about which spec conditions we check * Dont re-aggregate * Split apart test harness attn production * Fix compile error in network * Make progress on commented-out test * Fix skipping attestation test * Add fork choice verification tests * Tidy attn tests, remove dead code * Remove some accidentally added code * Fix clippy lint * Rename test file * Add block tests, add cheap block proposer check * Rename block testing file * Add observed_block_producers * Tidy * Switch around block signature verification * Finish block testing * Remove gossip from signature tests * First pass of self review * Fix deviation in spec * Update test spec tags * Start moving over to hashset * Finish moving observed attesters to hashmap * Move aggregation pool over to hashmap * Make fc attn borrow again * Fix rest_api compile error * Fix missing comments * Fix monster test * Uncomment increasing slots test * Address remaining comments * Remove unsafe, use cfg test * Remove cfg test flag * Fix dodgy comment * Revert "Update hashmap hashset to stable futures" This reverts commit d432378a3cc5cd67fc29c0b15b96b886c1323554. * Revert "Adds panic test to hashset delay" This reverts commit 281502396fc5b90d9c421a309c2c056982c9525b. * Ported attestation_service * Ported duties_service * Ported fork_service * More ports * Port block_service * Minor fixes * VC compiles * Update TODOS * Borrow self where possible * Ignore aggregates that are already known. * Unify aggregator modulo logic * Fix typo in logs * Refactor validator subscription logic * Avoid reproducing selection proof * Skip HTTP call if no subscriptions * Rename DutyAndState -> DutyAndProof * Tidy logs * Print root as dbg * Fix compile errors in tests * Fix compile error in test * Re-Fix attestation and duties service * Minor fixes Co-authored-by: Paul Hauner <paul@paulhauner.com> * Network crate update to stable futures * Port account_manager to stable futures (#1121) * Port account_manager to stable futures * Run async fns in tokio environment * Port rest_api crate to stable futures (#1118) * Port rest_api lib to stable futures * Reduce tokio features * Update notifier to stable futures * Builder update * Further updates * Convert self referential async functions * stable futures fixes (#1124) * Fix eth1 update functions * Fix genesis and client * Fix beacon node lib * Return appropriate runtimes from environment * Fix test rig * Refactor eth1 service update * Upgrade simulator to stable futures * Lighthouse compiles on stable futures * Remove println debugging statement * Update libp2p service, start rpc test upgrade * Update network crate for new libp2p * Update tokio::codec to futures_codec (#1128) * Further work towards RPC corrections * Correct http timeout and network service select * Use tokio runtime for libp2p * Revert "Update tokio::codec to futures_codec (#1128)" This reverts commit e57aea924acf5cbabdcea18895ac07e38a425ed7. * Upgrade RPC libp2p tests * Upgrade secio fallback test * Upgrade gossipsub examples * Clean up RPC protocol * Test fixes (#1133) * Correct websocket timeout and run on os thread * Fix network test * Clean up PR * Correct tokio tcp move attestation service tests * Upgrade attestation service tests * Correct network test * Correct genesis test * Test corrections * Log info when block is received * Modify logs and update attester service events * Stable futures: fixes to vc, eth1 and account manager (#1142) * Add local testnet scripts * Remove whiteblock script * Rename local testnet script * Move spawns onto handle * Fix VC panic * Initial fix to block production issue * Tidy block producer fix * Tidy further * Add local testnet clean script * Run cargo fmt * Tidy duties service * Tidy fork service * Tidy ForkService * Tidy AttestationService * Tidy notifier * Ensure await is not suppressed in eth1 * Ensure await is not suppressed in account_manager * Use .ok() instead of .unwrap_or(()) * RPC decoding test for proto * Update discv5 and eth2-libp2p deps * Fix lcli double runtime issue (#1144) * Handle stream termination and dialing peer errors * Correct peer_info variant types * Remove unnecessary warnings * Handle subnet unsubscription removal and improve logigng * Add logs around ping * Upgrade discv5 and improve logging * Handle peer connection status for multiple connections * Improve network service logging * Improve logging around peer manager * Upgrade swarm poll centralise peer management * Identify clients on error * Fix `remove_peer` in sync (#1150) * remove_peer removes from all chains * Remove logs * Fix early return from loop * Improved logging, fix panic * Partially correct tests * Stable futures: Vc sync (#1149) * Improve syncing heuristic * Add comments * Use safer method for tolerance * Fix tests * Stable futures: Fix VC bug, update agg pool, add more metrics (#1151) * Expose epoch processing summary * Expose participation metrics to prometheus * Switch to f64 * Reduce precision * Change precision * Expose observed attesters metrics * Add metrics for agg/unagg attn counts * Add metrics for gossip rx * Add metrics for gossip tx * Adds ignored attns to prom * Add attestation timing * Add timer for aggregation pool sig agg * Add write lock timer for agg pool * Add more metrics to agg pool * Change map lock code * Add extra metric to agg pool * Change lock handling in agg pool * Change .write() to .read() * Add another agg pool timer * Fix for is_aggregator * Fix pruning bug Co-authored-by: pawan <pawandhananjay@gmail.com> Co-authored-by: Paul Hauner <paul@paulhauner.com>
2020-05-17 11:16:48 +00:00
}
}
Poll::Ready(Some(Err(e))) => {
warn!(self.log, "Inbound substream poll failed"; "error" => ?e);
Stable futures (#879) * Port eth1 lib to use stable futures * Port eth1_test_rig to stable futures * Port eth1 tests to stable futures * Port genesis service to stable futures * Port genesis tests to stable futures * Port beacon_chain to stable futures * Port lcli to stable futures * Fix eth1_test_rig (#1014) * Fix lcli * Port timer to stable futures * Fix timer * Port websocket_server to stable futures * Port notifier to stable futures * Add TODOS * Update hashmap hashset to stable futures * Adds panic test to hashset delay * Port remote_beacon_node to stable futures * Fix lcli merge conflicts * Non rpc stuff compiles * protocol.rs compiles * Port websockets, timer and notifier to stable futures (#1035) * Fix lcli * Port timer to stable futures * Fix timer * Port websocket_server to stable futures * Port notifier to stable futures * Add TODOS * Port remote_beacon_node to stable futures * Partial eth2-libp2p stable future upgrade * Finished first round of fighting RPC types * Further progress towards porting eth2-libp2p adds caching to discovery * Update behaviour * RPC handler to stable futures * Update RPC to master libp2p * Network service additions * Fix the fallback transport construction (#1102) * Correct warning * Remove hashmap delay * Compiling version of eth2-libp2p * Update all crates versions * Fix conversion function and add tests (#1113) * Port validator_client to stable futures (#1114) * Add PH & MS slot clock changes * Account for genesis time * Add progress on duties refactor * Add simple is_aggregator bool to val subscription * Start work on attestation_verification.rs * Add progress on ObservedAttestations * Progress with ObservedAttestations * Fix tests * Add observed attestations to the beacon chain * Add attestation observation to processing code * Add progress on attestation verification * Add first draft of ObservedAttesters * Add more tests * Add observed attesters to beacon chain * Add observers to attestation processing * Add more attestation verification * Create ObservedAggregators map * Remove commented-out code * Add observed aggregators into chain * Add progress * Finish adding features to attestation verification * Ensure beacon chain compiles * Link attn verification into chain * Integrate new attn verification in chain * Remove old attestation processing code * Start trying to fix beacon_chain tests * Split adding into pools into two functions * Add aggregation to harness * Get test harness working again * Adjust the number of aggregators for test harness * Fix edge-case in harness * Integrate new attn processing in network * Fix compile bug in validator_client * Update validator API endpoints * Fix aggreagation in test harness * Fix enum thing * Fix attestation observation bug: * Patch failing API tests * Start adding comments to attestation verification * Remove unused attestation field * Unify "is block known" logic * Update comments * Supress fork choice errors for network processing * Add todos * Tidy * Add gossip attn tests * Disallow test harness to produce old attns * Comment out in-progress tests * Partially address pruning tests * Fix failing store test * Add aggregate tests * Add comments about which spec conditions we check * Dont re-aggregate * Split apart test harness attn production * Fix compile error in network * Make progress on commented-out test * Fix skipping attestation test * Add fork choice verification tests * Tidy attn tests, remove dead code * Remove some accidentally added code * Fix clippy lint * Rename test file * Add block tests, add cheap block proposer check * Rename block testing file * Add observed_block_producers * Tidy * Switch around block signature verification * Finish block testing * Remove gossip from signature tests * First pass of self review * Fix deviation in spec * Update test spec tags * Start moving over to hashset * Finish moving observed attesters to hashmap * Move aggregation pool over to hashmap * Make fc attn borrow again * Fix rest_api compile error * Fix missing comments * Fix monster test * Uncomment increasing slots test * Address remaining comments * Remove unsafe, use cfg test * Remove cfg test flag * Fix dodgy comment * Revert "Update hashmap hashset to stable futures" This reverts commit d432378a3cc5cd67fc29c0b15b96b886c1323554. * Revert "Adds panic test to hashset delay" This reverts commit 281502396fc5b90d9c421a309c2c056982c9525b. * Ported attestation_service * Ported duties_service * Ported fork_service * More ports * Port block_service * Minor fixes * VC compiles * Update TODOS * Borrow self where possible * Ignore aggregates that are already known. * Unify aggregator modulo logic * Fix typo in logs * Refactor validator subscription logic * Avoid reproducing selection proof * Skip HTTP call if no subscriptions * Rename DutyAndState -> DutyAndProof * Tidy logs * Print root as dbg * Fix compile errors in tests * Fix compile error in test * Re-Fix attestation and duties service * Minor fixes Co-authored-by: Paul Hauner <paul@paulhauner.com> * Network crate update to stable futures * Port account_manager to stable futures (#1121) * Port account_manager to stable futures * Run async fns in tokio environment * Port rest_api crate to stable futures (#1118) * Port rest_api lib to stable futures * Reduce tokio features * Update notifier to stable futures * Builder update * Further updates * Convert self referential async functions * stable futures fixes (#1124) * Fix eth1 update functions * Fix genesis and client * Fix beacon node lib * Return appropriate runtimes from environment * Fix test rig * Refactor eth1 service update * Upgrade simulator to stable futures * Lighthouse compiles on stable futures * Remove println debugging statement * Update libp2p service, start rpc test upgrade * Update network crate for new libp2p * Update tokio::codec to futures_codec (#1128) * Further work towards RPC corrections * Correct http timeout and network service select * Use tokio runtime for libp2p * Revert "Update tokio::codec to futures_codec (#1128)" This reverts commit e57aea924acf5cbabdcea18895ac07e38a425ed7. * Upgrade RPC libp2p tests * Upgrade secio fallback test * Upgrade gossipsub examples * Clean up RPC protocol * Test fixes (#1133) * Correct websocket timeout and run on os thread * Fix network test * Clean up PR * Correct tokio tcp move attestation service tests * Upgrade attestation service tests * Correct network test * Correct genesis test * Test corrections * Log info when block is received * Modify logs and update attester service events * Stable futures: fixes to vc, eth1 and account manager (#1142) * Add local testnet scripts * Remove whiteblock script * Rename local testnet script * Move spawns onto handle * Fix VC panic * Initial fix to block production issue * Tidy block producer fix * Tidy further * Add local testnet clean script * Run cargo fmt * Tidy duties service * Tidy fork service * Tidy ForkService * Tidy AttestationService * Tidy notifier * Ensure await is not suppressed in eth1 * Ensure await is not suppressed in account_manager * Use .ok() instead of .unwrap_or(()) * RPC decoding test for proto * Update discv5 and eth2-libp2p deps * Fix lcli double runtime issue (#1144) * Handle stream termination and dialing peer errors * Correct peer_info variant types * Remove unnecessary warnings * Handle subnet unsubscription removal and improve logigng * Add logs around ping * Upgrade discv5 and improve logging * Handle peer connection status for multiple connections * Improve network service logging * Improve logging around peer manager * Upgrade swarm poll centralise peer management * Identify clients on error * Fix `remove_peer` in sync (#1150) * remove_peer removes from all chains * Remove logs * Fix early return from loop * Improved logging, fix panic * Partially correct tests * Stable futures: Vc sync (#1149) * Improve syncing heuristic * Add comments * Use safer method for tolerance * Fix tests * Stable futures: Fix VC bug, update agg pool, add more metrics (#1151) * Expose epoch processing summary * Expose participation metrics to prometheus * Switch to f64 * Reduce precision * Change precision * Expose observed attesters metrics * Add metrics for agg/unagg attn counts * Add metrics for gossip rx * Add metrics for gossip tx * Adds ignored attns to prom * Add attestation timing * Add timer for aggregation pool sig agg * Add write lock timer for agg pool * Add more metrics to agg pool * Change map lock code * Add extra metric to agg pool * Change lock handling in agg pool * Change .write() to .read() * Add another agg pool timer * Fix for is_aggregator * Fix pruning bug Co-authored-by: pawan <pawandhananjay@gmail.com> Co-authored-by: Paul Hauner <paul@paulhauner.com>
2020-05-17 11:16:48 +00:00
// drops the peer if we cannot read the delay queue
return Poll::Ready(ProtocolsHandlerEvent::Close(RPCError::InternalError(
"Could not poll inbound stream timer",
)));
}
Poll::Pending | Poll::Ready(None) => break,
}
Testnet compatible network upgrade (#587) * Create libp2p instance * Change logger to stdlog * test_connection initial commit * Add gossipsub test * Delete tests in network crate * Add test module * Clean tests * Remove dependency on discovery * Working publish between 2 nodes TODO: Publish should be called just once * Working 2 peer gossipsub test with additional events * Cleanup test * Add rpc test * Star topology discovery WIP * build_nodes builds and connects n nodes. Increase nodes in gossipsub test * Add unsubscribe method and expose reference to gossipsub object for gossipsub tests * Add gossipsub message forwarding test * Fix gossipsub forward test * Test improvements * Remove discovery tests * Simplify gossipsub forward test topology * Add helper functions for topology building * Clean up tests * Update naming to new network spec * Correct ssz encoding of protocol names * Further additions to network upgrade * Initial network spec update WIP * Temp commit * Builds one side of the streamed RPC responses * Temporary commit * Propagates streaming changes up into message handler * Intermediate network update * Partial update in upgrading to the new network spec * Update dependencies, remove redundant deps * Correct sync manager for block stream handling * Re-write of RPC handler, improves efficiency and corrects bugs * Stream termination update * Completed refactor of rpc handler * Remove crates * Correct compile issues associated with test merge * Build basic tests and testing structure for eth2-libp2p * Enhance RPC tests and add logging * Complete RPC testing framework and STATUS test * Decoding bug fixes, log improvements, stream test * Clean up RPC handler logging * Decoder bug fix, empty block stream test * Add BlocksByRoot RPC test * Add Goodbye RPC test * Syncing and stream handling bug fixes and performance improvements * Applies discv5 bug fixes * Adds DHT IP filtering for lighthouse - currently disabled * Adds randomized network propagation as a CLI arg * Add disconnect functionality * Adds attestation handling and parent lookup * Adds RPC error handling for the sync manager * Allow parent's blocks to be already processed * Update github workflow * Adds reviewer suggestions
2019-11-27 01:47:46 +00:00
}
// purge expired outbound substreams
Stable futures (#879) * Port eth1 lib to use stable futures * Port eth1_test_rig to stable futures * Port eth1 tests to stable futures * Port genesis service to stable futures * Port genesis tests to stable futures * Port beacon_chain to stable futures * Port lcli to stable futures * Fix eth1_test_rig (#1014) * Fix lcli * Port timer to stable futures * Fix timer * Port websocket_server to stable futures * Port notifier to stable futures * Add TODOS * Update hashmap hashset to stable futures * Adds panic test to hashset delay * Port remote_beacon_node to stable futures * Fix lcli merge conflicts * Non rpc stuff compiles * protocol.rs compiles * Port websockets, timer and notifier to stable futures (#1035) * Fix lcli * Port timer to stable futures * Fix timer * Port websocket_server to stable futures * Port notifier to stable futures * Add TODOS * Port remote_beacon_node to stable futures * Partial eth2-libp2p stable future upgrade * Finished first round of fighting RPC types * Further progress towards porting eth2-libp2p adds caching to discovery * Update behaviour * RPC handler to stable futures * Update RPC to master libp2p * Network service additions * Fix the fallback transport construction (#1102) * Correct warning * Remove hashmap delay * Compiling version of eth2-libp2p * Update all crates versions * Fix conversion function and add tests (#1113) * Port validator_client to stable futures (#1114) * Add PH & MS slot clock changes * Account for genesis time * Add progress on duties refactor * Add simple is_aggregator bool to val subscription * Start work on attestation_verification.rs * Add progress on ObservedAttestations * Progress with ObservedAttestations * Fix tests * Add observed attestations to the beacon chain * Add attestation observation to processing code * Add progress on attestation verification * Add first draft of ObservedAttesters * Add more tests * Add observed attesters to beacon chain * Add observers to attestation processing * Add more attestation verification * Create ObservedAggregators map * Remove commented-out code * Add observed aggregators into chain * Add progress * Finish adding features to attestation verification * Ensure beacon chain compiles * Link attn verification into chain * Integrate new attn verification in chain * Remove old attestation processing code * Start trying to fix beacon_chain tests * Split adding into pools into two functions * Add aggregation to harness * Get test harness working again * Adjust the number of aggregators for test harness * Fix edge-case in harness * Integrate new attn processing in network * Fix compile bug in validator_client * Update validator API endpoints * Fix aggreagation in test harness * Fix enum thing * Fix attestation observation bug: * Patch failing API tests * Start adding comments to attestation verification * Remove unused attestation field * Unify "is block known" logic * Update comments * Supress fork choice errors for network processing * Add todos * Tidy * Add gossip attn tests * Disallow test harness to produce old attns * Comment out in-progress tests * Partially address pruning tests * Fix failing store test * Add aggregate tests * Add comments about which spec conditions we check * Dont re-aggregate * Split apart test harness attn production * Fix compile error in network * Make progress on commented-out test * Fix skipping attestation test * Add fork choice verification tests * Tidy attn tests, remove dead code * Remove some accidentally added code * Fix clippy lint * Rename test file * Add block tests, add cheap block proposer check * Rename block testing file * Add observed_block_producers * Tidy * Switch around block signature verification * Finish block testing * Remove gossip from signature tests * First pass of self review * Fix deviation in spec * Update test spec tags * Start moving over to hashset * Finish moving observed attesters to hashmap * Move aggregation pool over to hashmap * Make fc attn borrow again * Fix rest_api compile error * Fix missing comments * Fix monster test * Uncomment increasing slots test * Address remaining comments * Remove unsafe, use cfg test * Remove cfg test flag * Fix dodgy comment * Revert "Update hashmap hashset to stable futures" This reverts commit d432378a3cc5cd67fc29c0b15b96b886c1323554. * Revert "Adds panic test to hashset delay" This reverts commit 281502396fc5b90d9c421a309c2c056982c9525b. * Ported attestation_service * Ported duties_service * Ported fork_service * More ports * Port block_service * Minor fixes * VC compiles * Update TODOS * Borrow self where possible * Ignore aggregates that are already known. * Unify aggregator modulo logic * Fix typo in logs * Refactor validator subscription logic * Avoid reproducing selection proof * Skip HTTP call if no subscriptions * Rename DutyAndState -> DutyAndProof * Tidy logs * Print root as dbg * Fix compile errors in tests * Fix compile error in test * Re-Fix attestation and duties service * Minor fixes Co-authored-by: Paul Hauner <paul@paulhauner.com> * Network crate update to stable futures * Port account_manager to stable futures (#1121) * Port account_manager to stable futures * Run async fns in tokio environment * Port rest_api crate to stable futures (#1118) * Port rest_api lib to stable futures * Reduce tokio features * Update notifier to stable futures * Builder update * Further updates * Convert self referential async functions * stable futures fixes (#1124) * Fix eth1 update functions * Fix genesis and client * Fix beacon node lib * Return appropriate runtimes from environment * Fix test rig * Refactor eth1 service update * Upgrade simulator to stable futures * Lighthouse compiles on stable futures * Remove println debugging statement * Update libp2p service, start rpc test upgrade * Update network crate for new libp2p * Update tokio::codec to futures_codec (#1128) * Further work towards RPC corrections * Correct http timeout and network service select * Use tokio runtime for libp2p * Revert "Update tokio::codec to futures_codec (#1128)" This reverts commit e57aea924acf5cbabdcea18895ac07e38a425ed7. * Upgrade RPC libp2p tests * Upgrade secio fallback test * Upgrade gossipsub examples * Clean up RPC protocol * Test fixes (#1133) * Correct websocket timeout and run on os thread * Fix network test * Clean up PR * Correct tokio tcp move attestation service tests * Upgrade attestation service tests * Correct network test * Correct genesis test * Test corrections * Log info when block is received * Modify logs and update attester service events * Stable futures: fixes to vc, eth1 and account manager (#1142) * Add local testnet scripts * Remove whiteblock script * Rename local testnet script * Move spawns onto handle * Fix VC panic * Initial fix to block production issue * Tidy block producer fix * Tidy further * Add local testnet clean script * Run cargo fmt * Tidy duties service * Tidy fork service * Tidy ForkService * Tidy AttestationService * Tidy notifier * Ensure await is not suppressed in eth1 * Ensure await is not suppressed in account_manager * Use .ok() instead of .unwrap_or(()) * RPC decoding test for proto * Update discv5 and eth2-libp2p deps * Fix lcli double runtime issue (#1144) * Handle stream termination and dialing peer errors * Correct peer_info variant types * Remove unnecessary warnings * Handle subnet unsubscription removal and improve logigng * Add logs around ping * Upgrade discv5 and improve logging * Handle peer connection status for multiple connections * Improve network service logging * Improve logging around peer manager * Upgrade swarm poll centralise peer management * Identify clients on error * Fix `remove_peer` in sync (#1150) * remove_peer removes from all chains * Remove logs * Fix early return from loop * Improved logging, fix panic * Partially correct tests * Stable futures: Vc sync (#1149) * Improve syncing heuristic * Add comments * Use safer method for tolerance * Fix tests * Stable futures: Fix VC bug, update agg pool, add more metrics (#1151) * Expose epoch processing summary * Expose participation metrics to prometheus * Switch to f64 * Reduce precision * Change precision * Expose observed attesters metrics * Add metrics for agg/unagg attn counts * Add metrics for gossip rx * Add metrics for gossip tx * Adds ignored attns to prom * Add attestation timing * Add timer for aggregation pool sig agg * Add write lock timer for agg pool * Add more metrics to agg pool * Change map lock code * Add extra metric to agg pool * Change lock handling in agg pool * Change .write() to .read() * Add another agg pool timer * Fix for is_aggregator * Fix pruning bug Co-authored-by: pawan <pawandhananjay@gmail.com> Co-authored-by: Paul Hauner <paul@paulhauner.com>
2020-05-17 11:16:48 +00:00
loop {
match self.outbound_substreams_delay.poll_expired(cx) {
Poll::Ready(Some(Ok(outbound_id))) => {
if let Some(OutboundInfo { proto, req_id, .. }) =
self.outbound_substreams.remove(outbound_id.get_ref())
Stable futures (#879) * Port eth1 lib to use stable futures * Port eth1_test_rig to stable futures * Port eth1 tests to stable futures * Port genesis service to stable futures * Port genesis tests to stable futures * Port beacon_chain to stable futures * Port lcli to stable futures * Fix eth1_test_rig (#1014) * Fix lcli * Port timer to stable futures * Fix timer * Port websocket_server to stable futures * Port notifier to stable futures * Add TODOS * Update hashmap hashset to stable futures * Adds panic test to hashset delay * Port remote_beacon_node to stable futures * Fix lcli merge conflicts * Non rpc stuff compiles * protocol.rs compiles * Port websockets, timer and notifier to stable futures (#1035) * Fix lcli * Port timer to stable futures * Fix timer * Port websocket_server to stable futures * Port notifier to stable futures * Add TODOS * Port remote_beacon_node to stable futures * Partial eth2-libp2p stable future upgrade * Finished first round of fighting RPC types * Further progress towards porting eth2-libp2p adds caching to discovery * Update behaviour * RPC handler to stable futures * Update RPC to master libp2p * Network service additions * Fix the fallback transport construction (#1102) * Correct warning * Remove hashmap delay * Compiling version of eth2-libp2p * Update all crates versions * Fix conversion function and add tests (#1113) * Port validator_client to stable futures (#1114) * Add PH & MS slot clock changes * Account for genesis time * Add progress on duties refactor * Add simple is_aggregator bool to val subscription * Start work on attestation_verification.rs * Add progress on ObservedAttestations * Progress with ObservedAttestations * Fix tests * Add observed attestations to the beacon chain * Add attestation observation to processing code * Add progress on attestation verification * Add first draft of ObservedAttesters * Add more tests * Add observed attesters to beacon chain * Add observers to attestation processing * Add more attestation verification * Create ObservedAggregators map * Remove commented-out code * Add observed aggregators into chain * Add progress * Finish adding features to attestation verification * Ensure beacon chain compiles * Link attn verification into chain * Integrate new attn verification in chain * Remove old attestation processing code * Start trying to fix beacon_chain tests * Split adding into pools into two functions * Add aggregation to harness * Get test harness working again * Adjust the number of aggregators for test harness * Fix edge-case in harness * Integrate new attn processing in network * Fix compile bug in validator_client * Update validator API endpoints * Fix aggreagation in test harness * Fix enum thing * Fix attestation observation bug: * Patch failing API tests * Start adding comments to attestation verification * Remove unused attestation field * Unify "is block known" logic * Update comments * Supress fork choice errors for network processing * Add todos * Tidy * Add gossip attn tests * Disallow test harness to produce old attns * Comment out in-progress tests * Partially address pruning tests * Fix failing store test * Add aggregate tests * Add comments about which spec conditions we check * Dont re-aggregate * Split apart test harness attn production * Fix compile error in network * Make progress on commented-out test * Fix skipping attestation test * Add fork choice verification tests * Tidy attn tests, remove dead code * Remove some accidentally added code * Fix clippy lint * Rename test file * Add block tests, add cheap block proposer check * Rename block testing file * Add observed_block_producers * Tidy * Switch around block signature verification * Finish block testing * Remove gossip from signature tests * First pass of self review * Fix deviation in spec * Update test spec tags * Start moving over to hashset * Finish moving observed attesters to hashmap * Move aggregation pool over to hashmap * Make fc attn borrow again * Fix rest_api compile error * Fix missing comments * Fix monster test * Uncomment increasing slots test * Address remaining comments * Remove unsafe, use cfg test * Remove cfg test flag * Fix dodgy comment * Revert "Update hashmap hashset to stable futures" This reverts commit d432378a3cc5cd67fc29c0b15b96b886c1323554. * Revert "Adds panic test to hashset delay" This reverts commit 281502396fc5b90d9c421a309c2c056982c9525b. * Ported attestation_service * Ported duties_service * Ported fork_service * More ports * Port block_service * Minor fixes * VC compiles * Update TODOS * Borrow self where possible * Ignore aggregates that are already known. * Unify aggregator modulo logic * Fix typo in logs * Refactor validator subscription logic * Avoid reproducing selection proof * Skip HTTP call if no subscriptions * Rename DutyAndState -> DutyAndProof * Tidy logs * Print root as dbg * Fix compile errors in tests * Fix compile error in test * Re-Fix attestation and duties service * Minor fixes Co-authored-by: Paul Hauner <paul@paulhauner.com> * Network crate update to stable futures * Port account_manager to stable futures (#1121) * Port account_manager to stable futures * Run async fns in tokio environment * Port rest_api crate to stable futures (#1118) * Port rest_api lib to stable futures * Reduce tokio features * Update notifier to stable futures * Builder update * Further updates * Convert self referential async functions * stable futures fixes (#1124) * Fix eth1 update functions * Fix genesis and client * Fix beacon node lib * Return appropriate runtimes from environment * Fix test rig * Refactor eth1 service update * Upgrade simulator to stable futures * Lighthouse compiles on stable futures * Remove println debugging statement * Update libp2p service, start rpc test upgrade * Update network crate for new libp2p * Update tokio::codec to futures_codec (#1128) * Further work towards RPC corrections * Correct http timeout and network service select * Use tokio runtime for libp2p * Revert "Update tokio::codec to futures_codec (#1128)" This reverts commit e57aea924acf5cbabdcea18895ac07e38a425ed7. * Upgrade RPC libp2p tests * Upgrade secio fallback test * Upgrade gossipsub examples * Clean up RPC protocol * Test fixes (#1133) * Correct websocket timeout and run on os thread * Fix network test * Clean up PR * Correct tokio tcp move attestation service tests * Upgrade attestation service tests * Correct network test * Correct genesis test * Test corrections * Log info when block is received * Modify logs and update attester service events * Stable futures: fixes to vc, eth1 and account manager (#1142) * Add local testnet scripts * Remove whiteblock script * Rename local testnet script * Move spawns onto handle * Fix VC panic * Initial fix to block production issue * Tidy block producer fix * Tidy further * Add local testnet clean script * Run cargo fmt * Tidy duties service * Tidy fork service * Tidy ForkService * Tidy AttestationService * Tidy notifier * Ensure await is not suppressed in eth1 * Ensure await is not suppressed in account_manager * Use .ok() instead of .unwrap_or(()) * RPC decoding test for proto * Update discv5 and eth2-libp2p deps * Fix lcli double runtime issue (#1144) * Handle stream termination and dialing peer errors * Correct peer_info variant types * Remove unnecessary warnings * Handle subnet unsubscription removal and improve logigng * Add logs around ping * Upgrade discv5 and improve logging * Handle peer connection status for multiple connections * Improve network service logging * Improve logging around peer manager * Upgrade swarm poll centralise peer management * Identify clients on error * Fix `remove_peer` in sync (#1150) * remove_peer removes from all chains * Remove logs * Fix early return from loop * Improved logging, fix panic * Partially correct tests * Stable futures: Vc sync (#1149) * Improve syncing heuristic * Add comments * Use safer method for tolerance * Fix tests * Stable futures: Fix VC bug, update agg pool, add more metrics (#1151) * Expose epoch processing summary * Expose participation metrics to prometheus * Switch to f64 * Reduce precision * Change precision * Expose observed attesters metrics * Add metrics for agg/unagg attn counts * Add metrics for gossip rx * Add metrics for gossip tx * Adds ignored attns to prom * Add attestation timing * Add timer for aggregation pool sig agg * Add write lock timer for agg pool * Add more metrics to agg pool * Change map lock code * Add extra metric to agg pool * Change lock handling in agg pool * Change .write() to .read() * Add another agg pool timer * Fix for is_aggregator * Fix pruning bug Co-authored-by: pawan <pawandhananjay@gmail.com> Co-authored-by: Paul Hauner <paul@paulhauner.com>
2020-05-17 11:16:48 +00:00
{
let outbound_err = HandlerErr::Outbound {
id: req_id,
proto,
error: RPCError::StreamTimeout,
};
Stable futures (#879) * Port eth1 lib to use stable futures * Port eth1_test_rig to stable futures * Port eth1 tests to stable futures * Port genesis service to stable futures * Port genesis tests to stable futures * Port beacon_chain to stable futures * Port lcli to stable futures * Fix eth1_test_rig (#1014) * Fix lcli * Port timer to stable futures * Fix timer * Port websocket_server to stable futures * Port notifier to stable futures * Add TODOS * Update hashmap hashset to stable futures * Adds panic test to hashset delay * Port remote_beacon_node to stable futures * Fix lcli merge conflicts * Non rpc stuff compiles * protocol.rs compiles * Port websockets, timer and notifier to stable futures (#1035) * Fix lcli * Port timer to stable futures * Fix timer * Port websocket_server to stable futures * Port notifier to stable futures * Add TODOS * Port remote_beacon_node to stable futures * Partial eth2-libp2p stable future upgrade * Finished first round of fighting RPC types * Further progress towards porting eth2-libp2p adds caching to discovery * Update behaviour * RPC handler to stable futures * Update RPC to master libp2p * Network service additions * Fix the fallback transport construction (#1102) * Correct warning * Remove hashmap delay * Compiling version of eth2-libp2p * Update all crates versions * Fix conversion function and add tests (#1113) * Port validator_client to stable futures (#1114) * Add PH & MS slot clock changes * Account for genesis time * Add progress on duties refactor * Add simple is_aggregator bool to val subscription * Start work on attestation_verification.rs * Add progress on ObservedAttestations * Progress with ObservedAttestations * Fix tests * Add observed attestations to the beacon chain * Add attestation observation to processing code * Add progress on attestation verification * Add first draft of ObservedAttesters * Add more tests * Add observed attesters to beacon chain * Add observers to attestation processing * Add more attestation verification * Create ObservedAggregators map * Remove commented-out code * Add observed aggregators into chain * Add progress * Finish adding features to attestation verification * Ensure beacon chain compiles * Link attn verification into chain * Integrate new attn verification in chain * Remove old attestation processing code * Start trying to fix beacon_chain tests * Split adding into pools into two functions * Add aggregation to harness * Get test harness working again * Adjust the number of aggregators for test harness * Fix edge-case in harness * Integrate new attn processing in network * Fix compile bug in validator_client * Update validator API endpoints * Fix aggreagation in test harness * Fix enum thing * Fix attestation observation bug: * Patch failing API tests * Start adding comments to attestation verification * Remove unused attestation field * Unify "is block known" logic * Update comments * Supress fork choice errors for network processing * Add todos * Tidy * Add gossip attn tests * Disallow test harness to produce old attns * Comment out in-progress tests * Partially address pruning tests * Fix failing store test * Add aggregate tests * Add comments about which spec conditions we check * Dont re-aggregate * Split apart test harness attn production * Fix compile error in network * Make progress on commented-out test * Fix skipping attestation test * Add fork choice verification tests * Tidy attn tests, remove dead code * Remove some accidentally added code * Fix clippy lint * Rename test file * Add block tests, add cheap block proposer check * Rename block testing file * Add observed_block_producers * Tidy * Switch around block signature verification * Finish block testing * Remove gossip from signature tests * First pass of self review * Fix deviation in spec * Update test spec tags * Start moving over to hashset * Finish moving observed attesters to hashmap * Move aggregation pool over to hashmap * Make fc attn borrow again * Fix rest_api compile error * Fix missing comments * Fix monster test * Uncomment increasing slots test * Address remaining comments * Remove unsafe, use cfg test * Remove cfg test flag * Fix dodgy comment * Revert "Update hashmap hashset to stable futures" This reverts commit d432378a3cc5cd67fc29c0b15b96b886c1323554. * Revert "Adds panic test to hashset delay" This reverts commit 281502396fc5b90d9c421a309c2c056982c9525b. * Ported attestation_service * Ported duties_service * Ported fork_service * More ports * Port block_service * Minor fixes * VC compiles * Update TODOS * Borrow self where possible * Ignore aggregates that are already known. * Unify aggregator modulo logic * Fix typo in logs * Refactor validator subscription logic * Avoid reproducing selection proof * Skip HTTP call if no subscriptions * Rename DutyAndState -> DutyAndProof * Tidy logs * Print root as dbg * Fix compile errors in tests * Fix compile error in test * Re-Fix attestation and duties service * Minor fixes Co-authored-by: Paul Hauner <paul@paulhauner.com> * Network crate update to stable futures * Port account_manager to stable futures (#1121) * Port account_manager to stable futures * Run async fns in tokio environment * Port rest_api crate to stable futures (#1118) * Port rest_api lib to stable futures * Reduce tokio features * Update notifier to stable futures * Builder update * Further updates * Convert self referential async functions * stable futures fixes (#1124) * Fix eth1 update functions * Fix genesis and client * Fix beacon node lib * Return appropriate runtimes from environment * Fix test rig * Refactor eth1 service update * Upgrade simulator to stable futures * Lighthouse compiles on stable futures * Remove println debugging statement * Update libp2p service, start rpc test upgrade * Update network crate for new libp2p * Update tokio::codec to futures_codec (#1128) * Further work towards RPC corrections * Correct http timeout and network service select * Use tokio runtime for libp2p * Revert "Update tokio::codec to futures_codec (#1128)" This reverts commit e57aea924acf5cbabdcea18895ac07e38a425ed7. * Upgrade RPC libp2p tests * Upgrade secio fallback test * Upgrade gossipsub examples * Clean up RPC protocol * Test fixes (#1133) * Correct websocket timeout and run on os thread * Fix network test * Clean up PR * Correct tokio tcp move attestation service tests * Upgrade attestation service tests * Correct network test * Correct genesis test * Test corrections * Log info when block is received * Modify logs and update attester service events * Stable futures: fixes to vc, eth1 and account manager (#1142) * Add local testnet scripts * Remove whiteblock script * Rename local testnet script * Move spawns onto handle * Fix VC panic * Initial fix to block production issue * Tidy block producer fix * Tidy further * Add local testnet clean script * Run cargo fmt * Tidy duties service * Tidy fork service * Tidy ForkService * Tidy AttestationService * Tidy notifier * Ensure await is not suppressed in eth1 * Ensure await is not suppressed in account_manager * Use .ok() instead of .unwrap_or(()) * RPC decoding test for proto * Update discv5 and eth2-libp2p deps * Fix lcli double runtime issue (#1144) * Handle stream termination and dialing peer errors * Correct peer_info variant types * Remove unnecessary warnings * Handle subnet unsubscription removal and improve logigng * Add logs around ping * Upgrade discv5 and improve logging * Handle peer connection status for multiple connections * Improve network service logging * Improve logging around peer manager * Upgrade swarm poll centralise peer management * Identify clients on error * Fix `remove_peer` in sync (#1150) * remove_peer removes from all chains * Remove logs * Fix early return from loop * Improved logging, fix panic * Partially correct tests * Stable futures: Vc sync (#1149) * Improve syncing heuristic * Add comments * Use safer method for tolerance * Fix tests * Stable futures: Fix VC bug, update agg pool, add more metrics (#1151) * Expose epoch processing summary * Expose participation metrics to prometheus * Switch to f64 * Reduce precision * Change precision * Expose observed attesters metrics * Add metrics for agg/unagg attn counts * Add metrics for gossip rx * Add metrics for gossip tx * Adds ignored attns to prom * Add attestation timing * Add timer for aggregation pool sig agg * Add write lock timer for agg pool * Add more metrics to agg pool * Change map lock code * Add extra metric to agg pool * Change lock handling in agg pool * Change .write() to .read() * Add another agg pool timer * Fix for is_aggregator * Fix pruning bug Co-authored-by: pawan <pawandhananjay@gmail.com> Co-authored-by: Paul Hauner <paul@paulhauner.com>
2020-05-17 11:16:48 +00:00
// notify the user
return Poll::Ready(ProtocolsHandlerEvent::Custom(Err(outbound_err)));
Stable futures (#879) * Port eth1 lib to use stable futures * Port eth1_test_rig to stable futures * Port eth1 tests to stable futures * Port genesis service to stable futures * Port genesis tests to stable futures * Port beacon_chain to stable futures * Port lcli to stable futures * Fix eth1_test_rig (#1014) * Fix lcli * Port timer to stable futures * Fix timer * Port websocket_server to stable futures * Port notifier to stable futures * Add TODOS * Update hashmap hashset to stable futures * Adds panic test to hashset delay * Port remote_beacon_node to stable futures * Fix lcli merge conflicts * Non rpc stuff compiles * protocol.rs compiles * Port websockets, timer and notifier to stable futures (#1035) * Fix lcli * Port timer to stable futures * Fix timer * Port websocket_server to stable futures * Port notifier to stable futures * Add TODOS * Port remote_beacon_node to stable futures * Partial eth2-libp2p stable future upgrade * Finished first round of fighting RPC types * Further progress towards porting eth2-libp2p adds caching to discovery * Update behaviour * RPC handler to stable futures * Update RPC to master libp2p * Network service additions * Fix the fallback transport construction (#1102) * Correct warning * Remove hashmap delay * Compiling version of eth2-libp2p * Update all crates versions * Fix conversion function and add tests (#1113) * Port validator_client to stable futures (#1114) * Add PH & MS slot clock changes * Account for genesis time * Add progress on duties refactor * Add simple is_aggregator bool to val subscription * Start work on attestation_verification.rs * Add progress on ObservedAttestations * Progress with ObservedAttestations * Fix tests * Add observed attestations to the beacon chain * Add attestation observation to processing code * Add progress on attestation verification * Add first draft of ObservedAttesters * Add more tests * Add observed attesters to beacon chain * Add observers to attestation processing * Add more attestation verification * Create ObservedAggregators map * Remove commented-out code * Add observed aggregators into chain * Add progress * Finish adding features to attestation verification * Ensure beacon chain compiles * Link attn verification into chain * Integrate new attn verification in chain * Remove old attestation processing code * Start trying to fix beacon_chain tests * Split adding into pools into two functions * Add aggregation to harness * Get test harness working again * Adjust the number of aggregators for test harness * Fix edge-case in harness * Integrate new attn processing in network * Fix compile bug in validator_client * Update validator API endpoints * Fix aggreagation in test harness * Fix enum thing * Fix attestation observation bug: * Patch failing API tests * Start adding comments to attestation verification * Remove unused attestation field * Unify "is block known" logic * Update comments * Supress fork choice errors for network processing * Add todos * Tidy * Add gossip attn tests * Disallow test harness to produce old attns * Comment out in-progress tests * Partially address pruning tests * Fix failing store test * Add aggregate tests * Add comments about which spec conditions we check * Dont re-aggregate * Split apart test harness attn production * Fix compile error in network * Make progress on commented-out test * Fix skipping attestation test * Add fork choice verification tests * Tidy attn tests, remove dead code * Remove some accidentally added code * Fix clippy lint * Rename test file * Add block tests, add cheap block proposer check * Rename block testing file * Add observed_block_producers * Tidy * Switch around block signature verification * Finish block testing * Remove gossip from signature tests * First pass of self review * Fix deviation in spec * Update test spec tags * Start moving over to hashset * Finish moving observed attesters to hashmap * Move aggregation pool over to hashmap * Make fc attn borrow again * Fix rest_api compile error * Fix missing comments * Fix monster test * Uncomment increasing slots test * Address remaining comments * Remove unsafe, use cfg test * Remove cfg test flag * Fix dodgy comment * Revert "Update hashmap hashset to stable futures" This reverts commit d432378a3cc5cd67fc29c0b15b96b886c1323554. * Revert "Adds panic test to hashset delay" This reverts commit 281502396fc5b90d9c421a309c2c056982c9525b. * Ported attestation_service * Ported duties_service * Ported fork_service * More ports * Port block_service * Minor fixes * VC compiles * Update TODOS * Borrow self where possible * Ignore aggregates that are already known. * Unify aggregator modulo logic * Fix typo in logs * Refactor validator subscription logic * Avoid reproducing selection proof * Skip HTTP call if no subscriptions * Rename DutyAndState -> DutyAndProof * Tidy logs * Print root as dbg * Fix compile errors in tests * Fix compile error in test * Re-Fix attestation and duties service * Minor fixes Co-authored-by: Paul Hauner <paul@paulhauner.com> * Network crate update to stable futures * Port account_manager to stable futures (#1121) * Port account_manager to stable futures * Run async fns in tokio environment * Port rest_api crate to stable futures (#1118) * Port rest_api lib to stable futures * Reduce tokio features * Update notifier to stable futures * Builder update * Further updates * Convert self referential async functions * stable futures fixes (#1124) * Fix eth1 update functions * Fix genesis and client * Fix beacon node lib * Return appropriate runtimes from environment * Fix test rig * Refactor eth1 service update * Upgrade simulator to stable futures * Lighthouse compiles on stable futures * Remove println debugging statement * Update libp2p service, start rpc test upgrade * Update network crate for new libp2p * Update tokio::codec to futures_codec (#1128) * Further work towards RPC corrections * Correct http timeout and network service select * Use tokio runtime for libp2p * Revert "Update tokio::codec to futures_codec (#1128)" This reverts commit e57aea924acf5cbabdcea18895ac07e38a425ed7. * Upgrade RPC libp2p tests * Upgrade secio fallback test * Upgrade gossipsub examples * Clean up RPC protocol * Test fixes (#1133) * Correct websocket timeout and run on os thread * Fix network test * Clean up PR * Correct tokio tcp move attestation service tests * Upgrade attestation service tests * Correct network test * Correct genesis test * Test corrections * Log info when block is received * Modify logs and update attester service events * Stable futures: fixes to vc, eth1 and account manager (#1142) * Add local testnet scripts * Remove whiteblock script * Rename local testnet script * Move spawns onto handle * Fix VC panic * Initial fix to block production issue * Tidy block producer fix * Tidy further * Add local testnet clean script * Run cargo fmt * Tidy duties service * Tidy fork service * Tidy ForkService * Tidy AttestationService * Tidy notifier * Ensure await is not suppressed in eth1 * Ensure await is not suppressed in account_manager * Use .ok() instead of .unwrap_or(()) * RPC decoding test for proto * Update discv5 and eth2-libp2p deps * Fix lcli double runtime issue (#1144) * Handle stream termination and dialing peer errors * Correct peer_info variant types * Remove unnecessary warnings * Handle subnet unsubscription removal and improve logigng * Add logs around ping * Upgrade discv5 and improve logging * Handle peer connection status for multiple connections * Improve network service logging * Improve logging around peer manager * Upgrade swarm poll centralise peer management * Identify clients on error * Fix `remove_peer` in sync (#1150) * remove_peer removes from all chains * Remove logs * Fix early return from loop * Improved logging, fix panic * Partially correct tests * Stable futures: Vc sync (#1149) * Improve syncing heuristic * Add comments * Use safer method for tolerance * Fix tests * Stable futures: Fix VC bug, update agg pool, add more metrics (#1151) * Expose epoch processing summary * Expose participation metrics to prometheus * Switch to f64 * Reduce precision * Change precision * Expose observed attesters metrics * Add metrics for agg/unagg attn counts * Add metrics for gossip rx * Add metrics for gossip tx * Adds ignored attns to prom * Add attestation timing * Add timer for aggregation pool sig agg * Add write lock timer for agg pool * Add more metrics to agg pool * Change map lock code * Add extra metric to agg pool * Change lock handling in agg pool * Change .write() to .read() * Add another agg pool timer * Fix for is_aggregator * Fix pruning bug Co-authored-by: pawan <pawandhananjay@gmail.com> Co-authored-by: Paul Hauner <paul@paulhauner.com>
2020-05-17 11:16:48 +00:00
} else {
crit!(self.log, "timed out substream not in the books"; "stream_id" => outbound_id.get_ref());
Stable futures (#879) * Port eth1 lib to use stable futures * Port eth1_test_rig to stable futures * Port eth1 tests to stable futures * Port genesis service to stable futures * Port genesis tests to stable futures * Port beacon_chain to stable futures * Port lcli to stable futures * Fix eth1_test_rig (#1014) * Fix lcli * Port timer to stable futures * Fix timer * Port websocket_server to stable futures * Port notifier to stable futures * Add TODOS * Update hashmap hashset to stable futures * Adds panic test to hashset delay * Port remote_beacon_node to stable futures * Fix lcli merge conflicts * Non rpc stuff compiles * protocol.rs compiles * Port websockets, timer and notifier to stable futures (#1035) * Fix lcli * Port timer to stable futures * Fix timer * Port websocket_server to stable futures * Port notifier to stable futures * Add TODOS * Port remote_beacon_node to stable futures * Partial eth2-libp2p stable future upgrade * Finished first round of fighting RPC types * Further progress towards porting eth2-libp2p adds caching to discovery * Update behaviour * RPC handler to stable futures * Update RPC to master libp2p * Network service additions * Fix the fallback transport construction (#1102) * Correct warning * Remove hashmap delay * Compiling version of eth2-libp2p * Update all crates versions * Fix conversion function and add tests (#1113) * Port validator_client to stable futures (#1114) * Add PH & MS slot clock changes * Account for genesis time * Add progress on duties refactor * Add simple is_aggregator bool to val subscription * Start work on attestation_verification.rs * Add progress on ObservedAttestations * Progress with ObservedAttestations * Fix tests * Add observed attestations to the beacon chain * Add attestation observation to processing code * Add progress on attestation verification * Add first draft of ObservedAttesters * Add more tests * Add observed attesters to beacon chain * Add observers to attestation processing * Add more attestation verification * Create ObservedAggregators map * Remove commented-out code * Add observed aggregators into chain * Add progress * Finish adding features to attestation verification * Ensure beacon chain compiles * Link attn verification into chain * Integrate new attn verification in chain * Remove old attestation processing code * Start trying to fix beacon_chain tests * Split adding into pools into two functions * Add aggregation to harness * Get test harness working again * Adjust the number of aggregators for test harness * Fix edge-case in harness * Integrate new attn processing in network * Fix compile bug in validator_client * Update validator API endpoints * Fix aggreagation in test harness * Fix enum thing * Fix attestation observation bug: * Patch failing API tests * Start adding comments to attestation verification * Remove unused attestation field * Unify "is block known" logic * Update comments * Supress fork choice errors for network processing * Add todos * Tidy * Add gossip attn tests * Disallow test harness to produce old attns * Comment out in-progress tests * Partially address pruning tests * Fix failing store test * Add aggregate tests * Add comments about which spec conditions we check * Dont re-aggregate * Split apart test harness attn production * Fix compile error in network * Make progress on commented-out test * Fix skipping attestation test * Add fork choice verification tests * Tidy attn tests, remove dead code * Remove some accidentally added code * Fix clippy lint * Rename test file * Add block tests, add cheap block proposer check * Rename block testing file * Add observed_block_producers * Tidy * Switch around block signature verification * Finish block testing * Remove gossip from signature tests * First pass of self review * Fix deviation in spec * Update test spec tags * Start moving over to hashset * Finish moving observed attesters to hashmap * Move aggregation pool over to hashmap * Make fc attn borrow again * Fix rest_api compile error * Fix missing comments * Fix monster test * Uncomment increasing slots test * Address remaining comments * Remove unsafe, use cfg test * Remove cfg test flag * Fix dodgy comment * Revert "Update hashmap hashset to stable futures" This reverts commit d432378a3cc5cd67fc29c0b15b96b886c1323554. * Revert "Adds panic test to hashset delay" This reverts commit 281502396fc5b90d9c421a309c2c056982c9525b. * Ported attestation_service * Ported duties_service * Ported fork_service * More ports * Port block_service * Minor fixes * VC compiles * Update TODOS * Borrow self where possible * Ignore aggregates that are already known. * Unify aggregator modulo logic * Fix typo in logs * Refactor validator subscription logic * Avoid reproducing selection proof * Skip HTTP call if no subscriptions * Rename DutyAndState -> DutyAndProof * Tidy logs * Print root as dbg * Fix compile errors in tests * Fix compile error in test * Re-Fix attestation and duties service * Minor fixes Co-authored-by: Paul Hauner <paul@paulhauner.com> * Network crate update to stable futures * Port account_manager to stable futures (#1121) * Port account_manager to stable futures * Run async fns in tokio environment * Port rest_api crate to stable futures (#1118) * Port rest_api lib to stable futures * Reduce tokio features * Update notifier to stable futures * Builder update * Further updates * Convert self referential async functions * stable futures fixes (#1124) * Fix eth1 update functions * Fix genesis and client * Fix beacon node lib * Return appropriate runtimes from environment * Fix test rig * Refactor eth1 service update * Upgrade simulator to stable futures * Lighthouse compiles on stable futures * Remove println debugging statement * Update libp2p service, start rpc test upgrade * Update network crate for new libp2p * Update tokio::codec to futures_codec (#1128) * Further work towards RPC corrections * Correct http timeout and network service select * Use tokio runtime for libp2p * Revert "Update tokio::codec to futures_codec (#1128)" This reverts commit e57aea924acf5cbabdcea18895ac07e38a425ed7. * Upgrade RPC libp2p tests * Upgrade secio fallback test * Upgrade gossipsub examples * Clean up RPC protocol * Test fixes (#1133) * Correct websocket timeout and run on os thread * Fix network test * Clean up PR * Correct tokio tcp move attestation service tests * Upgrade attestation service tests * Correct network test * Correct genesis test * Test corrections * Log info when block is received * Modify logs and update attester service events * Stable futures: fixes to vc, eth1 and account manager (#1142) * Add local testnet scripts * Remove whiteblock script * Rename local testnet script * Move spawns onto handle * Fix VC panic * Initial fix to block production issue * Tidy block producer fix * Tidy further * Add local testnet clean script * Run cargo fmt * Tidy duties service * Tidy fork service * Tidy ForkService * Tidy AttestationService * Tidy notifier * Ensure await is not suppressed in eth1 * Ensure await is not suppressed in account_manager * Use .ok() instead of .unwrap_or(()) * RPC decoding test for proto * Update discv5 and eth2-libp2p deps * Fix lcli double runtime issue (#1144) * Handle stream termination and dialing peer errors * Correct peer_info variant types * Remove unnecessary warnings * Handle subnet unsubscription removal and improve logigng * Add logs around ping * Upgrade discv5 and improve logging * Handle peer connection status for multiple connections * Improve network service logging * Improve logging around peer manager * Upgrade swarm poll centralise peer management * Identify clients on error * Fix `remove_peer` in sync (#1150) * remove_peer removes from all chains * Remove logs * Fix early return from loop * Improved logging, fix panic * Partially correct tests * Stable futures: Vc sync (#1149) * Improve syncing heuristic * Add comments * Use safer method for tolerance * Fix tests * Stable futures: Fix VC bug, update agg pool, add more metrics (#1151) * Expose epoch processing summary * Expose participation metrics to prometheus * Switch to f64 * Reduce precision * Change precision * Expose observed attesters metrics * Add metrics for agg/unagg attn counts * Add metrics for gossip rx * Add metrics for gossip tx * Adds ignored attns to prom * Add attestation timing * Add timer for aggregation pool sig agg * Add write lock timer for agg pool * Add more metrics to agg pool * Change map lock code * Add extra metric to agg pool * Change lock handling in agg pool * Change .write() to .read() * Add another agg pool timer * Fix for is_aggregator * Fix pruning bug Co-authored-by: pawan <pawandhananjay@gmail.com> Co-authored-by: Paul Hauner <paul@paulhauner.com>
2020-05-17 11:16:48 +00:00
}
}
Poll::Ready(Some(Err(e))) => {
warn!(self.log, "Outbound substream poll failed"; "error" => ?e);
Stable futures (#879) * Port eth1 lib to use stable futures * Port eth1_test_rig to stable futures * Port eth1 tests to stable futures * Port genesis service to stable futures * Port genesis tests to stable futures * Port beacon_chain to stable futures * Port lcli to stable futures * Fix eth1_test_rig (#1014) * Fix lcli * Port timer to stable futures * Fix timer * Port websocket_server to stable futures * Port notifier to stable futures * Add TODOS * Update hashmap hashset to stable futures * Adds panic test to hashset delay * Port remote_beacon_node to stable futures * Fix lcli merge conflicts * Non rpc stuff compiles * protocol.rs compiles * Port websockets, timer and notifier to stable futures (#1035) * Fix lcli * Port timer to stable futures * Fix timer * Port websocket_server to stable futures * Port notifier to stable futures * Add TODOS * Port remote_beacon_node to stable futures * Partial eth2-libp2p stable future upgrade * Finished first round of fighting RPC types * Further progress towards porting eth2-libp2p adds caching to discovery * Update behaviour * RPC handler to stable futures * Update RPC to master libp2p * Network service additions * Fix the fallback transport construction (#1102) * Correct warning * Remove hashmap delay * Compiling version of eth2-libp2p * Update all crates versions * Fix conversion function and add tests (#1113) * Port validator_client to stable futures (#1114) * Add PH & MS slot clock changes * Account for genesis time * Add progress on duties refactor * Add simple is_aggregator bool to val subscription * Start work on attestation_verification.rs * Add progress on ObservedAttestations * Progress with ObservedAttestations * Fix tests * Add observed attestations to the beacon chain * Add attestation observation to processing code * Add progress on attestation verification * Add first draft of ObservedAttesters * Add more tests * Add observed attesters to beacon chain * Add observers to attestation processing * Add more attestation verification * Create ObservedAggregators map * Remove commented-out code * Add observed aggregators into chain * Add progress * Finish adding features to attestation verification * Ensure beacon chain compiles * Link attn verification into chain * Integrate new attn verification in chain * Remove old attestation processing code * Start trying to fix beacon_chain tests * Split adding into pools into two functions * Add aggregation to harness * Get test harness working again * Adjust the number of aggregators for test harness * Fix edge-case in harness * Integrate new attn processing in network * Fix compile bug in validator_client * Update validator API endpoints * Fix aggreagation in test harness * Fix enum thing * Fix attestation observation bug: * Patch failing API tests * Start adding comments to attestation verification * Remove unused attestation field * Unify "is block known" logic * Update comments * Supress fork choice errors for network processing * Add todos * Tidy * Add gossip attn tests * Disallow test harness to produce old attns * Comment out in-progress tests * Partially address pruning tests * Fix failing store test * Add aggregate tests * Add comments about which spec conditions we check * Dont re-aggregate * Split apart test harness attn production * Fix compile error in network * Make progress on commented-out test * Fix skipping attestation test * Add fork choice verification tests * Tidy attn tests, remove dead code * Remove some accidentally added code * Fix clippy lint * Rename test file * Add block tests, add cheap block proposer check * Rename block testing file * Add observed_block_producers * Tidy * Switch around block signature verification * Finish block testing * Remove gossip from signature tests * First pass of self review * Fix deviation in spec * Update test spec tags * Start moving over to hashset * Finish moving observed attesters to hashmap * Move aggregation pool over to hashmap * Make fc attn borrow again * Fix rest_api compile error * Fix missing comments * Fix monster test * Uncomment increasing slots test * Address remaining comments * Remove unsafe, use cfg test * Remove cfg test flag * Fix dodgy comment * Revert "Update hashmap hashset to stable futures" This reverts commit d432378a3cc5cd67fc29c0b15b96b886c1323554. * Revert "Adds panic test to hashset delay" This reverts commit 281502396fc5b90d9c421a309c2c056982c9525b. * Ported attestation_service * Ported duties_service * Ported fork_service * More ports * Port block_service * Minor fixes * VC compiles * Update TODOS * Borrow self where possible * Ignore aggregates that are already known. * Unify aggregator modulo logic * Fix typo in logs * Refactor validator subscription logic * Avoid reproducing selection proof * Skip HTTP call if no subscriptions * Rename DutyAndState -> DutyAndProof * Tidy logs * Print root as dbg * Fix compile errors in tests * Fix compile error in test * Re-Fix attestation and duties service * Minor fixes Co-authored-by: Paul Hauner <paul@paulhauner.com> * Network crate update to stable futures * Port account_manager to stable futures (#1121) * Port account_manager to stable futures * Run async fns in tokio environment * Port rest_api crate to stable futures (#1118) * Port rest_api lib to stable futures * Reduce tokio features * Update notifier to stable futures * Builder update * Further updates * Convert self referential async functions * stable futures fixes (#1124) * Fix eth1 update functions * Fix genesis and client * Fix beacon node lib * Return appropriate runtimes from environment * Fix test rig * Refactor eth1 service update * Upgrade simulator to stable futures * Lighthouse compiles on stable futures * Remove println debugging statement * Update libp2p service, start rpc test upgrade * Update network crate for new libp2p * Update tokio::codec to futures_codec (#1128) * Further work towards RPC corrections * Correct http timeout and network service select * Use tokio runtime for libp2p * Revert "Update tokio::codec to futures_codec (#1128)" This reverts commit e57aea924acf5cbabdcea18895ac07e38a425ed7. * Upgrade RPC libp2p tests * Upgrade secio fallback test * Upgrade gossipsub examples * Clean up RPC protocol * Test fixes (#1133) * Correct websocket timeout and run on os thread * Fix network test * Clean up PR * Correct tokio tcp move attestation service tests * Upgrade attestation service tests * Correct network test * Correct genesis test * Test corrections * Log info when block is received * Modify logs and update attester service events * Stable futures: fixes to vc, eth1 and account manager (#1142) * Add local testnet scripts * Remove whiteblock script * Rename local testnet script * Move spawns onto handle * Fix VC panic * Initial fix to block production issue * Tidy block producer fix * Tidy further * Add local testnet clean script * Run cargo fmt * Tidy duties service * Tidy fork service * Tidy ForkService * Tidy AttestationService * Tidy notifier * Ensure await is not suppressed in eth1 * Ensure await is not suppressed in account_manager * Use .ok() instead of .unwrap_or(()) * RPC decoding test for proto * Update discv5 and eth2-libp2p deps * Fix lcli double runtime issue (#1144) * Handle stream termination and dialing peer errors * Correct peer_info variant types * Remove unnecessary warnings * Handle subnet unsubscription removal and improve logigng * Add logs around ping * Upgrade discv5 and improve logging * Handle peer connection status for multiple connections * Improve network service logging * Improve logging around peer manager * Upgrade swarm poll centralise peer management * Identify clients on error * Fix `remove_peer` in sync (#1150) * remove_peer removes from all chains * Remove logs * Fix early return from loop * Improved logging, fix panic * Partially correct tests * Stable futures: Vc sync (#1149) * Improve syncing heuristic * Add comments * Use safer method for tolerance * Fix tests * Stable futures: Fix VC bug, update agg pool, add more metrics (#1151) * Expose epoch processing summary * Expose participation metrics to prometheus * Switch to f64 * Reduce precision * Change precision * Expose observed attesters metrics * Add metrics for agg/unagg attn counts * Add metrics for gossip rx * Add metrics for gossip tx * Adds ignored attns to prom * Add attestation timing * Add timer for aggregation pool sig agg * Add write lock timer for agg pool * Add more metrics to agg pool * Change map lock code * Add extra metric to agg pool * Change lock handling in agg pool * Change .write() to .read() * Add another agg pool timer * Fix for is_aggregator * Fix pruning bug Co-authored-by: pawan <pawandhananjay@gmail.com> Co-authored-by: Paul Hauner <paul@paulhauner.com>
2020-05-17 11:16:48 +00:00
return Poll::Ready(ProtocolsHandlerEvent::Close(RPCError::InternalError(
"Could not poll outbound stream timer",
)));
}
Poll::Pending | Poll::Ready(None) => break,
}
Testnet compatible network upgrade (#587) * Create libp2p instance * Change logger to stdlog * test_connection initial commit * Add gossipsub test * Delete tests in network crate * Add test module * Clean tests * Remove dependency on discovery * Working publish between 2 nodes TODO: Publish should be called just once * Working 2 peer gossipsub test with additional events * Cleanup test * Add rpc test * Star topology discovery WIP * build_nodes builds and connects n nodes. Increase nodes in gossipsub test * Add unsubscribe method and expose reference to gossipsub object for gossipsub tests * Add gossipsub message forwarding test * Fix gossipsub forward test * Test improvements * Remove discovery tests * Simplify gossipsub forward test topology * Add helper functions for topology building * Clean up tests * Update naming to new network spec * Correct ssz encoding of protocol names * Further additions to network upgrade * Initial network spec update WIP * Temp commit * Builds one side of the streamed RPC responses * Temporary commit * Propagates streaming changes up into message handler * Intermediate network update * Partial update in upgrading to the new network spec * Update dependencies, remove redundant deps * Correct sync manager for block stream handling * Re-write of RPC handler, improves efficiency and corrects bugs * Stream termination update * Completed refactor of rpc handler * Remove crates * Correct compile issues associated with test merge * Build basic tests and testing structure for eth2-libp2p * Enhance RPC tests and add logging * Complete RPC testing framework and STATUS test * Decoding bug fixes, log improvements, stream test * Clean up RPC handler logging * Decoder bug fix, empty block stream test * Add BlocksByRoot RPC test * Add Goodbye RPC test * Syncing and stream handling bug fixes and performance improvements * Applies discv5 bug fixes * Adds DHT IP filtering for lighthouse - currently disabled * Adds randomized network propagation as a CLI arg * Add disconnect functionality * Adds attestation handling and parent lookup * Adds RPC error handling for the sync manager * Allow parent's blocks to be already processed * Update github workflow * Adds reviewer suggestions
2019-11-27 01:47:46 +00:00
}
// when deactivated, close all streams
let deactivated = matches!(self.state, HandlerState::Deactivated);
Testnet compatible network upgrade (#587) * Create libp2p instance * Change logger to stdlog * test_connection initial commit * Add gossipsub test * Delete tests in network crate * Add test module * Clean tests * Remove dependency on discovery * Working publish between 2 nodes TODO: Publish should be called just once * Working 2 peer gossipsub test with additional events * Cleanup test * Add rpc test * Star topology discovery WIP * build_nodes builds and connects n nodes. Increase nodes in gossipsub test * Add unsubscribe method and expose reference to gossipsub object for gossipsub tests * Add gossipsub message forwarding test * Fix gossipsub forward test * Test improvements * Remove discovery tests * Simplify gossipsub forward test topology * Add helper functions for topology building * Clean up tests * Update naming to new network spec * Correct ssz encoding of protocol names * Further additions to network upgrade * Initial network spec update WIP * Temp commit * Builds one side of the streamed RPC responses * Temporary commit * Propagates streaming changes up into message handler * Intermediate network update * Partial update in upgrading to the new network spec * Update dependencies, remove redundant deps * Correct sync manager for block stream handling * Re-write of RPC handler, improves efficiency and corrects bugs * Stream termination update * Completed refactor of rpc handler * Remove crates * Correct compile issues associated with test merge * Build basic tests and testing structure for eth2-libp2p * Enhance RPC tests and add logging * Complete RPC testing framework and STATUS test * Decoding bug fixes, log improvements, stream test * Clean up RPC handler logging * Decoder bug fix, empty block stream test * Add BlocksByRoot RPC test * Add Goodbye RPC test * Syncing and stream handling bug fixes and performance improvements * Applies discv5 bug fixes * Adds DHT IP filtering for lighthouse - currently disabled * Adds randomized network propagation as a CLI arg * Add disconnect functionality * Adds attestation handling and parent lookup * Adds RPC error handling for the sync manager * Allow parent's blocks to be already processed * Update github workflow * Adds reviewer suggestions
2019-11-27 01:47:46 +00:00
// drive inbound streams that need to be processed
let mut substreams_to_remove = Vec::new(); // Closed substreams that need to be removed
for (id, info) in self.inbound_substreams.iter_mut() {
loop {
match std::mem::replace(&mut info.state, InboundState::Poisoned) {
Investigate and correct RPC Response Timeouts (#2804) RPC Responses are for some reason not removing their timeout when they are completing. As an example: ``` Nov 09 01:18:20.256 DEBG Received BlocksByRange Request step: 1, start_slot: 728465, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:20.263 DEBG Received BlocksByRange Request step: 1, start_slot: 728593, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:20.483 DEBG BlocksByRange Response sent returned: 63, requested: 64, current_slot: 2466389, start_slot: 728465, msg: Failed to return all requested blocks, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:20.500 DEBG BlocksByRange Response sent returned: 64, requested: 64, current_slot: 2466389, start_slot: 728593, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:21.068 DEBG Received BlocksByRange Request step: 1, start_slot: 728529, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:21.272 DEBG BlocksByRange Response sent returned: 63, requested: 64, current_slot: 2466389, start_slot: 728529, msg: Failed to return all requested blocks, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:23.434 DEBG Received BlocksByRange Request step: 1, start_slot: 728657, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:23.665 DEBG BlocksByRange Response sent returned: 64, requested: 64, current_slot: 2466390, start_slot: 728657, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:25.851 DEBG Received BlocksByRange Request step: 1, start_slot: 728337, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:25.851 DEBG Received BlocksByRange Request step: 1, start_slot: 728401, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:26.094 DEBG BlocksByRange Response sent returned: 62, requested: 64, current_slot: 2466390, start_slot: 728401, msg: Failed to return all requested blocks, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:26.100 DEBG BlocksByRange Response sent returned: 63, requested: 64, current_slot: 2466390, start_slot: 728337, msg: Failed to return all requested blocks, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:31.070 DEBG RPC Error direction: Incoming, score: 0, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, client: Prysm: version: a80b1c252a9b4773493b41999769bf3134ac373f, os_version: unknown, err: Stream Timeout, protocol: beacon_blocks_by_range, service: libp2p Nov 09 01:18:31.070 WARN Timed out to a peer's request. Likely insufficient resources, reduce peer count, service: libp2p Nov 09 01:18:31.085 DEBG RPC Error direction: Incoming, score: 0, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, client: Prysm: version: a80b1c252a9b4773493b41999769bf3134ac373f, os_version: unknown, err: Stream Timeout, protocol: beacon_blocks_by_range, service: libp2p Nov 09 01:18:31.085 WARN Timed out to a peer's request. Likely insufficient resources, reduce peer count, service: libp2p Nov 09 01:18:31.459 DEBG RPC Error direction: Incoming, score: 0, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, client: Prysm: version: a80b1c252a9b4773493b41999769bf3134ac373f, os_version: unknown, err: Stream Timeout, protocol: beacon_blocks_by_range, service: libp2p Nov 09 01:18:31.459 WARN Timed out to a peer's request. Likely insufficient resources, reduce peer count, service: libp2p Nov 09 01:18:34.129 DEBG RPC Error direction: Incoming, score: 0, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, client: Prysm: version: a80b1c252a9b4773493b41999769bf3134ac373f, os_version: unknown, err: Stream Timeout, protocol: beacon_blocks_by_range, service: libp2p Nov 09 01:18:34.130 WARN Timed out to a peer's request. Likely insufficient resources, reduce peer count, service: libp2p Nov 09 01:18:35.686 DEBG Peer Manager disconnecting peer reason: Too many peers, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, service: libp2p ``` This PR is to investigate and correct the issue. ~~My current thoughts are that for some reason we are not closing the streams correctly, or fast enough, or the executor is not registering the closes and waking up.~~ - Pretty sure this is not the case, see message below for a more accurate reason. ~~I've currently added a timeout to stream closures in an attempt to force streams to close and the future to always complete.~~ I removed this
2021-11-16 03:42:25 +00:00
// This state indicates that we are not currently sending any messages to the
// peer. We need to check if there are messages to send, if so, start the
// sending process.
InboundState::Idle(substream) if !deactivated => {
Investigate and correct RPC Response Timeouts (#2804) RPC Responses are for some reason not removing their timeout when they are completing. As an example: ``` Nov 09 01:18:20.256 DEBG Received BlocksByRange Request step: 1, start_slot: 728465, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:20.263 DEBG Received BlocksByRange Request step: 1, start_slot: 728593, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:20.483 DEBG BlocksByRange Response sent returned: 63, requested: 64, current_slot: 2466389, start_slot: 728465, msg: Failed to return all requested blocks, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:20.500 DEBG BlocksByRange Response sent returned: 64, requested: 64, current_slot: 2466389, start_slot: 728593, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:21.068 DEBG Received BlocksByRange Request step: 1, start_slot: 728529, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:21.272 DEBG BlocksByRange Response sent returned: 63, requested: 64, current_slot: 2466389, start_slot: 728529, msg: Failed to return all requested blocks, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:23.434 DEBG Received BlocksByRange Request step: 1, start_slot: 728657, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:23.665 DEBG BlocksByRange Response sent returned: 64, requested: 64, current_slot: 2466390, start_slot: 728657, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:25.851 DEBG Received BlocksByRange Request step: 1, start_slot: 728337, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:25.851 DEBG Received BlocksByRange Request step: 1, start_slot: 728401, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:26.094 DEBG BlocksByRange Response sent returned: 62, requested: 64, current_slot: 2466390, start_slot: 728401, msg: Failed to return all requested blocks, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:26.100 DEBG BlocksByRange Response sent returned: 63, requested: 64, current_slot: 2466390, start_slot: 728337, msg: Failed to return all requested blocks, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:31.070 DEBG RPC Error direction: Incoming, score: 0, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, client: Prysm: version: a80b1c252a9b4773493b41999769bf3134ac373f, os_version: unknown, err: Stream Timeout, protocol: beacon_blocks_by_range, service: libp2p Nov 09 01:18:31.070 WARN Timed out to a peer's request. Likely insufficient resources, reduce peer count, service: libp2p Nov 09 01:18:31.085 DEBG RPC Error direction: Incoming, score: 0, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, client: Prysm: version: a80b1c252a9b4773493b41999769bf3134ac373f, os_version: unknown, err: Stream Timeout, protocol: beacon_blocks_by_range, service: libp2p Nov 09 01:18:31.085 WARN Timed out to a peer's request. Likely insufficient resources, reduce peer count, service: libp2p Nov 09 01:18:31.459 DEBG RPC Error direction: Incoming, score: 0, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, client: Prysm: version: a80b1c252a9b4773493b41999769bf3134ac373f, os_version: unknown, err: Stream Timeout, protocol: beacon_blocks_by_range, service: libp2p Nov 09 01:18:31.459 WARN Timed out to a peer's request. Likely insufficient resources, reduce peer count, service: libp2p Nov 09 01:18:34.129 DEBG RPC Error direction: Incoming, score: 0, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, client: Prysm: version: a80b1c252a9b4773493b41999769bf3134ac373f, os_version: unknown, err: Stream Timeout, protocol: beacon_blocks_by_range, service: libp2p Nov 09 01:18:34.130 WARN Timed out to a peer's request. Likely insufficient resources, reduce peer count, service: libp2p Nov 09 01:18:35.686 DEBG Peer Manager disconnecting peer reason: Too many peers, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, service: libp2p ``` This PR is to investigate and correct the issue. ~~My current thoughts are that for some reason we are not closing the streams correctly, or fast enough, or the executor is not registering the closes and waking up.~~ - Pretty sure this is not the case, see message below for a more accurate reason. ~~I've currently added a timeout to stream closures in an attempt to force streams to close and the future to always complete.~~ I removed this
2021-11-16 03:42:25 +00:00
// Process one more message if one exists.
if let Some(message) = info.pending_items.pop_front() {
// If this is the last chunk, terminate the stream.
let last_chunk = info.remaining_chunks <= 1;
let fut =
send_message_to_inbound_substream(substream, message, last_chunk)
.boxed();
// Update the state and try to process this further.
info.state = InboundState::Busy(Box::pin(fut));
} else {
Investigate and correct RPC Response Timeouts (#2804) RPC Responses are for some reason not removing their timeout when they are completing. As an example: ``` Nov 09 01:18:20.256 DEBG Received BlocksByRange Request step: 1, start_slot: 728465, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:20.263 DEBG Received BlocksByRange Request step: 1, start_slot: 728593, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:20.483 DEBG BlocksByRange Response sent returned: 63, requested: 64, current_slot: 2466389, start_slot: 728465, msg: Failed to return all requested blocks, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:20.500 DEBG BlocksByRange Response sent returned: 64, requested: 64, current_slot: 2466389, start_slot: 728593, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:21.068 DEBG Received BlocksByRange Request step: 1, start_slot: 728529, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:21.272 DEBG BlocksByRange Response sent returned: 63, requested: 64, current_slot: 2466389, start_slot: 728529, msg: Failed to return all requested blocks, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:23.434 DEBG Received BlocksByRange Request step: 1, start_slot: 728657, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:23.665 DEBG BlocksByRange Response sent returned: 64, requested: 64, current_slot: 2466390, start_slot: 728657, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:25.851 DEBG Received BlocksByRange Request step: 1, start_slot: 728337, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:25.851 DEBG Received BlocksByRange Request step: 1, start_slot: 728401, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:26.094 DEBG BlocksByRange Response sent returned: 62, requested: 64, current_slot: 2466390, start_slot: 728401, msg: Failed to return all requested blocks, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:26.100 DEBG BlocksByRange Response sent returned: 63, requested: 64, current_slot: 2466390, start_slot: 728337, msg: Failed to return all requested blocks, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:31.070 DEBG RPC Error direction: Incoming, score: 0, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, client: Prysm: version: a80b1c252a9b4773493b41999769bf3134ac373f, os_version: unknown, err: Stream Timeout, protocol: beacon_blocks_by_range, service: libp2p Nov 09 01:18:31.070 WARN Timed out to a peer's request. Likely insufficient resources, reduce peer count, service: libp2p Nov 09 01:18:31.085 DEBG RPC Error direction: Incoming, score: 0, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, client: Prysm: version: a80b1c252a9b4773493b41999769bf3134ac373f, os_version: unknown, err: Stream Timeout, protocol: beacon_blocks_by_range, service: libp2p Nov 09 01:18:31.085 WARN Timed out to a peer's request. Likely insufficient resources, reduce peer count, service: libp2p Nov 09 01:18:31.459 DEBG RPC Error direction: Incoming, score: 0, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, client: Prysm: version: a80b1c252a9b4773493b41999769bf3134ac373f, os_version: unknown, err: Stream Timeout, protocol: beacon_blocks_by_range, service: libp2p Nov 09 01:18:31.459 WARN Timed out to a peer's request. Likely insufficient resources, reduce peer count, service: libp2p Nov 09 01:18:34.129 DEBG RPC Error direction: Incoming, score: 0, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, client: Prysm: version: a80b1c252a9b4773493b41999769bf3134ac373f, os_version: unknown, err: Stream Timeout, protocol: beacon_blocks_by_range, service: libp2p Nov 09 01:18:34.130 WARN Timed out to a peer's request. Likely insufficient resources, reduce peer count, service: libp2p Nov 09 01:18:35.686 DEBG Peer Manager disconnecting peer reason: Too many peers, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, service: libp2p ``` This PR is to investigate and correct the issue. ~~My current thoughts are that for some reason we are not closing the streams correctly, or fast enough, or the executor is not registering the closes and waking up.~~ - Pretty sure this is not the case, see message below for a more accurate reason. ~~I've currently added a timeout to stream closures in an attempt to force streams to close and the future to always complete.~~ I removed this
2021-11-16 03:42:25 +00:00
// There is nothing left to process. Set the stream to idle and
// move on to the next one.
info.state = InboundState::Idle(substream);
break;
}
}
Investigate and correct RPC Response Timeouts (#2804) RPC Responses are for some reason not removing their timeout when they are completing. As an example: ``` Nov 09 01:18:20.256 DEBG Received BlocksByRange Request step: 1, start_slot: 728465, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:20.263 DEBG Received BlocksByRange Request step: 1, start_slot: 728593, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:20.483 DEBG BlocksByRange Response sent returned: 63, requested: 64, current_slot: 2466389, start_slot: 728465, msg: Failed to return all requested blocks, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:20.500 DEBG BlocksByRange Response sent returned: 64, requested: 64, current_slot: 2466389, start_slot: 728593, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:21.068 DEBG Received BlocksByRange Request step: 1, start_slot: 728529, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:21.272 DEBG BlocksByRange Response sent returned: 63, requested: 64, current_slot: 2466389, start_slot: 728529, msg: Failed to return all requested blocks, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:23.434 DEBG Received BlocksByRange Request step: 1, start_slot: 728657, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:23.665 DEBG BlocksByRange Response sent returned: 64, requested: 64, current_slot: 2466390, start_slot: 728657, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:25.851 DEBG Received BlocksByRange Request step: 1, start_slot: 728337, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:25.851 DEBG Received BlocksByRange Request step: 1, start_slot: 728401, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:26.094 DEBG BlocksByRange Response sent returned: 62, requested: 64, current_slot: 2466390, start_slot: 728401, msg: Failed to return all requested blocks, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:26.100 DEBG BlocksByRange Response sent returned: 63, requested: 64, current_slot: 2466390, start_slot: 728337, msg: Failed to return all requested blocks, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:31.070 DEBG RPC Error direction: Incoming, score: 0, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, client: Prysm: version: a80b1c252a9b4773493b41999769bf3134ac373f, os_version: unknown, err: Stream Timeout, protocol: beacon_blocks_by_range, service: libp2p Nov 09 01:18:31.070 WARN Timed out to a peer's request. Likely insufficient resources, reduce peer count, service: libp2p Nov 09 01:18:31.085 DEBG RPC Error direction: Incoming, score: 0, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, client: Prysm: version: a80b1c252a9b4773493b41999769bf3134ac373f, os_version: unknown, err: Stream Timeout, protocol: beacon_blocks_by_range, service: libp2p Nov 09 01:18:31.085 WARN Timed out to a peer's request. Likely insufficient resources, reduce peer count, service: libp2p Nov 09 01:18:31.459 DEBG RPC Error direction: Incoming, score: 0, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, client: Prysm: version: a80b1c252a9b4773493b41999769bf3134ac373f, os_version: unknown, err: Stream Timeout, protocol: beacon_blocks_by_range, service: libp2p Nov 09 01:18:31.459 WARN Timed out to a peer's request. Likely insufficient resources, reduce peer count, service: libp2p Nov 09 01:18:34.129 DEBG RPC Error direction: Incoming, score: 0, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, client: Prysm: version: a80b1c252a9b4773493b41999769bf3134ac373f, os_version: unknown, err: Stream Timeout, protocol: beacon_blocks_by_range, service: libp2p Nov 09 01:18:34.130 WARN Timed out to a peer's request. Likely insufficient resources, reduce peer count, service: libp2p Nov 09 01:18:35.686 DEBG Peer Manager disconnecting peer reason: Too many peers, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, service: libp2p ``` This PR is to investigate and correct the issue. ~~My current thoughts are that for some reason we are not closing the streams correctly, or fast enough, or the executor is not registering the closes and waking up.~~ - Pretty sure this is not the case, see message below for a more accurate reason. ~~I've currently added a timeout to stream closures in an attempt to force streams to close and the future to always complete.~~ I removed this
2021-11-16 03:42:25 +00:00
// This state indicates we are not sending at the moment, and the handler is in
// the process of closing the connection to the peer.
InboundState::Idle(mut substream) => {
Investigate and correct RPC Response Timeouts (#2804) RPC Responses are for some reason not removing their timeout when they are completing. As an example: ``` Nov 09 01:18:20.256 DEBG Received BlocksByRange Request step: 1, start_slot: 728465, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:20.263 DEBG Received BlocksByRange Request step: 1, start_slot: 728593, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:20.483 DEBG BlocksByRange Response sent returned: 63, requested: 64, current_slot: 2466389, start_slot: 728465, msg: Failed to return all requested blocks, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:20.500 DEBG BlocksByRange Response sent returned: 64, requested: 64, current_slot: 2466389, start_slot: 728593, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:21.068 DEBG Received BlocksByRange Request step: 1, start_slot: 728529, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:21.272 DEBG BlocksByRange Response sent returned: 63, requested: 64, current_slot: 2466389, start_slot: 728529, msg: Failed to return all requested blocks, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:23.434 DEBG Received BlocksByRange Request step: 1, start_slot: 728657, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:23.665 DEBG BlocksByRange Response sent returned: 64, requested: 64, current_slot: 2466390, start_slot: 728657, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:25.851 DEBG Received BlocksByRange Request step: 1, start_slot: 728337, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:25.851 DEBG Received BlocksByRange Request step: 1, start_slot: 728401, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:26.094 DEBG BlocksByRange Response sent returned: 62, requested: 64, current_slot: 2466390, start_slot: 728401, msg: Failed to return all requested blocks, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:26.100 DEBG BlocksByRange Response sent returned: 63, requested: 64, current_slot: 2466390, start_slot: 728337, msg: Failed to return all requested blocks, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:31.070 DEBG RPC Error direction: Incoming, score: 0, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, client: Prysm: version: a80b1c252a9b4773493b41999769bf3134ac373f, os_version: unknown, err: Stream Timeout, protocol: beacon_blocks_by_range, service: libp2p Nov 09 01:18:31.070 WARN Timed out to a peer's request. Likely insufficient resources, reduce peer count, service: libp2p Nov 09 01:18:31.085 DEBG RPC Error direction: Incoming, score: 0, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, client: Prysm: version: a80b1c252a9b4773493b41999769bf3134ac373f, os_version: unknown, err: Stream Timeout, protocol: beacon_blocks_by_range, service: libp2p Nov 09 01:18:31.085 WARN Timed out to a peer's request. Likely insufficient resources, reduce peer count, service: libp2p Nov 09 01:18:31.459 DEBG RPC Error direction: Incoming, score: 0, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, client: Prysm: version: a80b1c252a9b4773493b41999769bf3134ac373f, os_version: unknown, err: Stream Timeout, protocol: beacon_blocks_by_range, service: libp2p Nov 09 01:18:31.459 WARN Timed out to a peer's request. Likely insufficient resources, reduce peer count, service: libp2p Nov 09 01:18:34.129 DEBG RPC Error direction: Incoming, score: 0, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, client: Prysm: version: a80b1c252a9b4773493b41999769bf3134ac373f, os_version: unknown, err: Stream Timeout, protocol: beacon_blocks_by_range, service: libp2p Nov 09 01:18:34.130 WARN Timed out to a peer's request. Likely insufficient resources, reduce peer count, service: libp2p Nov 09 01:18:35.686 DEBG Peer Manager disconnecting peer reason: Too many peers, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, service: libp2p ``` This PR is to investigate and correct the issue. ~~My current thoughts are that for some reason we are not closing the streams correctly, or fast enough, or the executor is not registering the closes and waking up.~~ - Pretty sure this is not the case, see message below for a more accurate reason. ~~I've currently added a timeout to stream closures in an attempt to force streams to close and the future to always complete.~~ I removed this
2021-11-16 03:42:25 +00:00
// Handler is deactivated, close the stream and mark it for removal
match substream.close().poll_unpin(cx) {
Investigate and correct RPC Response Timeouts (#2804) RPC Responses are for some reason not removing their timeout when they are completing. As an example: ``` Nov 09 01:18:20.256 DEBG Received BlocksByRange Request step: 1, start_slot: 728465, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:20.263 DEBG Received BlocksByRange Request step: 1, start_slot: 728593, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:20.483 DEBG BlocksByRange Response sent returned: 63, requested: 64, current_slot: 2466389, start_slot: 728465, msg: Failed to return all requested blocks, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:20.500 DEBG BlocksByRange Response sent returned: 64, requested: 64, current_slot: 2466389, start_slot: 728593, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:21.068 DEBG Received BlocksByRange Request step: 1, start_slot: 728529, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:21.272 DEBG BlocksByRange Response sent returned: 63, requested: 64, current_slot: 2466389, start_slot: 728529, msg: Failed to return all requested blocks, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:23.434 DEBG Received BlocksByRange Request step: 1, start_slot: 728657, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:23.665 DEBG BlocksByRange Response sent returned: 64, requested: 64, current_slot: 2466390, start_slot: 728657, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:25.851 DEBG Received BlocksByRange Request step: 1, start_slot: 728337, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:25.851 DEBG Received BlocksByRange Request step: 1, start_slot: 728401, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:26.094 DEBG BlocksByRange Response sent returned: 62, requested: 64, current_slot: 2466390, start_slot: 728401, msg: Failed to return all requested blocks, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:26.100 DEBG BlocksByRange Response sent returned: 63, requested: 64, current_slot: 2466390, start_slot: 728337, msg: Failed to return all requested blocks, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:31.070 DEBG RPC Error direction: Incoming, score: 0, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, client: Prysm: version: a80b1c252a9b4773493b41999769bf3134ac373f, os_version: unknown, err: Stream Timeout, protocol: beacon_blocks_by_range, service: libp2p Nov 09 01:18:31.070 WARN Timed out to a peer's request. Likely insufficient resources, reduce peer count, service: libp2p Nov 09 01:18:31.085 DEBG RPC Error direction: Incoming, score: 0, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, client: Prysm: version: a80b1c252a9b4773493b41999769bf3134ac373f, os_version: unknown, err: Stream Timeout, protocol: beacon_blocks_by_range, service: libp2p Nov 09 01:18:31.085 WARN Timed out to a peer's request. Likely insufficient resources, reduce peer count, service: libp2p Nov 09 01:18:31.459 DEBG RPC Error direction: Incoming, score: 0, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, client: Prysm: version: a80b1c252a9b4773493b41999769bf3134ac373f, os_version: unknown, err: Stream Timeout, protocol: beacon_blocks_by_range, service: libp2p Nov 09 01:18:31.459 WARN Timed out to a peer's request. Likely insufficient resources, reduce peer count, service: libp2p Nov 09 01:18:34.129 DEBG RPC Error direction: Incoming, score: 0, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, client: Prysm: version: a80b1c252a9b4773493b41999769bf3134ac373f, os_version: unknown, err: Stream Timeout, protocol: beacon_blocks_by_range, service: libp2p Nov 09 01:18:34.130 WARN Timed out to a peer's request. Likely insufficient resources, reduce peer count, service: libp2p Nov 09 01:18:35.686 DEBG Peer Manager disconnecting peer reason: Too many peers, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, service: libp2p ``` This PR is to investigate and correct the issue. ~~My current thoughts are that for some reason we are not closing the streams correctly, or fast enough, or the executor is not registering the closes and waking up.~~ - Pretty sure this is not the case, see message below for a more accurate reason. ~~I've currently added a timeout to stream closures in an attempt to force streams to close and the future to always complete.~~ I removed this
2021-11-16 03:42:25 +00:00
// if we can't close right now, put the substream back and try again
// immediately, continue to do this until we close the substream.
Poll::Pending => info.state = InboundState::Idle(substream),
Poll::Ready(res) => {
Investigate and correct RPC Response Timeouts (#2804) RPC Responses are for some reason not removing their timeout when they are completing. As an example: ``` Nov 09 01:18:20.256 DEBG Received BlocksByRange Request step: 1, start_slot: 728465, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:20.263 DEBG Received BlocksByRange Request step: 1, start_slot: 728593, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:20.483 DEBG BlocksByRange Response sent returned: 63, requested: 64, current_slot: 2466389, start_slot: 728465, msg: Failed to return all requested blocks, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:20.500 DEBG BlocksByRange Response sent returned: 64, requested: 64, current_slot: 2466389, start_slot: 728593, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:21.068 DEBG Received BlocksByRange Request step: 1, start_slot: 728529, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:21.272 DEBG BlocksByRange Response sent returned: 63, requested: 64, current_slot: 2466389, start_slot: 728529, msg: Failed to return all requested blocks, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:23.434 DEBG Received BlocksByRange Request step: 1, start_slot: 728657, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:23.665 DEBG BlocksByRange Response sent returned: 64, requested: 64, current_slot: 2466390, start_slot: 728657, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:25.851 DEBG Received BlocksByRange Request step: 1, start_slot: 728337, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:25.851 DEBG Received BlocksByRange Request step: 1, start_slot: 728401, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:26.094 DEBG BlocksByRange Response sent returned: 62, requested: 64, current_slot: 2466390, start_slot: 728401, msg: Failed to return all requested blocks, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:26.100 DEBG BlocksByRange Response sent returned: 63, requested: 64, current_slot: 2466390, start_slot: 728337, msg: Failed to return all requested blocks, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:31.070 DEBG RPC Error direction: Incoming, score: 0, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, client: Prysm: version: a80b1c252a9b4773493b41999769bf3134ac373f, os_version: unknown, err: Stream Timeout, protocol: beacon_blocks_by_range, service: libp2p Nov 09 01:18:31.070 WARN Timed out to a peer's request. Likely insufficient resources, reduce peer count, service: libp2p Nov 09 01:18:31.085 DEBG RPC Error direction: Incoming, score: 0, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, client: Prysm: version: a80b1c252a9b4773493b41999769bf3134ac373f, os_version: unknown, err: Stream Timeout, protocol: beacon_blocks_by_range, service: libp2p Nov 09 01:18:31.085 WARN Timed out to a peer's request. Likely insufficient resources, reduce peer count, service: libp2p Nov 09 01:18:31.459 DEBG RPC Error direction: Incoming, score: 0, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, client: Prysm: version: a80b1c252a9b4773493b41999769bf3134ac373f, os_version: unknown, err: Stream Timeout, protocol: beacon_blocks_by_range, service: libp2p Nov 09 01:18:31.459 WARN Timed out to a peer's request. Likely insufficient resources, reduce peer count, service: libp2p Nov 09 01:18:34.129 DEBG RPC Error direction: Incoming, score: 0, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, client: Prysm: version: a80b1c252a9b4773493b41999769bf3134ac373f, os_version: unknown, err: Stream Timeout, protocol: beacon_blocks_by_range, service: libp2p Nov 09 01:18:34.130 WARN Timed out to a peer's request. Likely insufficient resources, reduce peer count, service: libp2p Nov 09 01:18:35.686 DEBG Peer Manager disconnecting peer reason: Too many peers, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, service: libp2p ``` This PR is to investigate and correct the issue. ~~My current thoughts are that for some reason we are not closing the streams correctly, or fast enough, or the executor is not registering the closes and waking up.~~ - Pretty sure this is not the case, see message below for a more accurate reason. ~~I've currently added a timeout to stream closures in an attempt to force streams to close and the future to always complete.~~ I removed this
2021-11-16 03:42:25 +00:00
// The substream closed, we remove it from the mapping and remove
// the timeout
substreams_to_remove.push(*id);
if let Some(ref delay_key) = info.delay_key {
self.inbound_substreams_delay.remove(delay_key);
}
Investigate and correct RPC Response Timeouts (#2804) RPC Responses are for some reason not removing their timeout when they are completing. As an example: ``` Nov 09 01:18:20.256 DEBG Received BlocksByRange Request step: 1, start_slot: 728465, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:20.263 DEBG Received BlocksByRange Request step: 1, start_slot: 728593, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:20.483 DEBG BlocksByRange Response sent returned: 63, requested: 64, current_slot: 2466389, start_slot: 728465, msg: Failed to return all requested blocks, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:20.500 DEBG BlocksByRange Response sent returned: 64, requested: 64, current_slot: 2466389, start_slot: 728593, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:21.068 DEBG Received BlocksByRange Request step: 1, start_slot: 728529, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:21.272 DEBG BlocksByRange Response sent returned: 63, requested: 64, current_slot: 2466389, start_slot: 728529, msg: Failed to return all requested blocks, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:23.434 DEBG Received BlocksByRange Request step: 1, start_slot: 728657, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:23.665 DEBG BlocksByRange Response sent returned: 64, requested: 64, current_slot: 2466390, start_slot: 728657, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:25.851 DEBG Received BlocksByRange Request step: 1, start_slot: 728337, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:25.851 DEBG Received BlocksByRange Request step: 1, start_slot: 728401, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:26.094 DEBG BlocksByRange Response sent returned: 62, requested: 64, current_slot: 2466390, start_slot: 728401, msg: Failed to return all requested blocks, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:26.100 DEBG BlocksByRange Response sent returned: 63, requested: 64, current_slot: 2466390, start_slot: 728337, msg: Failed to return all requested blocks, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:31.070 DEBG RPC Error direction: Incoming, score: 0, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, client: Prysm: version: a80b1c252a9b4773493b41999769bf3134ac373f, os_version: unknown, err: Stream Timeout, protocol: beacon_blocks_by_range, service: libp2p Nov 09 01:18:31.070 WARN Timed out to a peer's request. Likely insufficient resources, reduce peer count, service: libp2p Nov 09 01:18:31.085 DEBG RPC Error direction: Incoming, score: 0, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, client: Prysm: version: a80b1c252a9b4773493b41999769bf3134ac373f, os_version: unknown, err: Stream Timeout, protocol: beacon_blocks_by_range, service: libp2p Nov 09 01:18:31.085 WARN Timed out to a peer's request. Likely insufficient resources, reduce peer count, service: libp2p Nov 09 01:18:31.459 DEBG RPC Error direction: Incoming, score: 0, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, client: Prysm: version: a80b1c252a9b4773493b41999769bf3134ac373f, os_version: unknown, err: Stream Timeout, protocol: beacon_blocks_by_range, service: libp2p Nov 09 01:18:31.459 WARN Timed out to a peer's request. Likely insufficient resources, reduce peer count, service: libp2p Nov 09 01:18:34.129 DEBG RPC Error direction: Incoming, score: 0, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, client: Prysm: version: a80b1c252a9b4773493b41999769bf3134ac373f, os_version: unknown, err: Stream Timeout, protocol: beacon_blocks_by_range, service: libp2p Nov 09 01:18:34.130 WARN Timed out to a peer's request. Likely insufficient resources, reduce peer count, service: libp2p Nov 09 01:18:35.686 DEBG Peer Manager disconnecting peer reason: Too many peers, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, service: libp2p ``` This PR is to investigate and correct the issue. ~~My current thoughts are that for some reason we are not closing the streams correctly, or fast enough, or the executor is not registering the closes and waking up.~~ - Pretty sure this is not the case, see message below for a more accurate reason. ~~I've currently added a timeout to stream closures in an attempt to force streams to close and the future to always complete.~~ I removed this
2021-11-16 03:42:25 +00:00
// If there was an error in shutting down the substream report the
// error
if let Err(error) = res {
More metrics + RPC tweaks (#2041) ## Issue Addressed NA ## Proposed Changes This was mostly done to find the reason why LH was dropping peers from Nimbus. It proved to be useful so I think it's worth it. But there is also some functional stuff here - Add metrics for rpc errors per client, error type and direction - Add metrics for downscoring events per source type, client and penalty type - Add metrics for gossip validation results per client for non-accepted messages - Make the RPC handler return errors and requests/responses in the order we see them - Allow a small burst for the Ping rate limit, from 1 every 5 seconds to 2 every 10 seconds - Send rate limiting errors with a particular code and use that same code to identify them. I picked something different to 128 since that is most likely what other clients are using for their own errors - Remove some unused code in the `PeerAction` and the rpc handler - Remove the unused variant `RateLimited`. tTis was never produced directly, since the only way to get the request's protocol is via de handler. The handler upon receiving from LH a response with an error (rate limited in this case) emits this event with the missing info (It was always like this, just pointing out that we do downscore rate limiting errors regardless of the change) Metrics for Nimbus looked like this: Downscoring events: `increase(libp2p_peer_actions_per_client{client="Nimbus"}[5m])` ![image](https://user-images.githubusercontent.com/26765164/101210880-862bf280-3676-11eb-94c0-399f0bf5aa2e.png) RPC Errors: `increase(libp2p_rpc_errors_per_client{client="Nimbus"}[5m])` ![image](https://user-images.githubusercontent.com/26765164/101210997-ba071800-3676-11eb-847a-f32405ede002.png) Unaccepted gossip message: `increase(gossipsub_unaccepted_messages_per_client{client="Nimbus"}[5m])` ![image](https://user-images.githubusercontent.com/26765164/101211124-f470b500-3676-11eb-9459-132ecff058ec.png)
2020-12-08 03:55:50 +00:00
self.events_out.push(Err(HandlerErr::Inbound {
error,
proto: info.protocol,
More metrics + RPC tweaks (#2041) ## Issue Addressed NA ## Proposed Changes This was mostly done to find the reason why LH was dropping peers from Nimbus. It proved to be useful so I think it's worth it. But there is also some functional stuff here - Add metrics for rpc errors per client, error type and direction - Add metrics for downscoring events per source type, client and penalty type - Add metrics for gossip validation results per client for non-accepted messages - Make the RPC handler return errors and requests/responses in the order we see them - Allow a small burst for the Ping rate limit, from 1 every 5 seconds to 2 every 10 seconds - Send rate limiting errors with a particular code and use that same code to identify them. I picked something different to 128 since that is most likely what other clients are using for their own errors - Remove some unused code in the `PeerAction` and the rpc handler - Remove the unused variant `RateLimited`. tTis was never produced directly, since the only way to get the request's protocol is via de handler. The handler upon receiving from LH a response with an error (rate limited in this case) emits this event with the missing info (It was always like this, just pointing out that we do downscore rate limiting errors regardless of the change) Metrics for Nimbus looked like this: Downscoring events: `increase(libp2p_peer_actions_per_client{client="Nimbus"}[5m])` ![image](https://user-images.githubusercontent.com/26765164/101210880-862bf280-3676-11eb-94c0-399f0bf5aa2e.png) RPC Errors: `increase(libp2p_rpc_errors_per_client{client="Nimbus"}[5m])` ![image](https://user-images.githubusercontent.com/26765164/101210997-ba071800-3676-11eb-847a-f32405ede002.png) Unaccepted gossip message: `increase(gossipsub_unaccepted_messages_per_client{client="Nimbus"}[5m])` ![image](https://user-images.githubusercontent.com/26765164/101211124-f470b500-3676-11eb-9459-132ecff058ec.png)
2020-12-08 03:55:50 +00:00
id: *id,
}));
}
Investigate and correct RPC Response Timeouts (#2804) RPC Responses are for some reason not removing their timeout when they are completing. As an example: ``` Nov 09 01:18:20.256 DEBG Received BlocksByRange Request step: 1, start_slot: 728465, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:20.263 DEBG Received BlocksByRange Request step: 1, start_slot: 728593, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:20.483 DEBG BlocksByRange Response sent returned: 63, requested: 64, current_slot: 2466389, start_slot: 728465, msg: Failed to return all requested blocks, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:20.500 DEBG BlocksByRange Response sent returned: 64, requested: 64, current_slot: 2466389, start_slot: 728593, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:21.068 DEBG Received BlocksByRange Request step: 1, start_slot: 728529, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:21.272 DEBG BlocksByRange Response sent returned: 63, requested: 64, current_slot: 2466389, start_slot: 728529, msg: Failed to return all requested blocks, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:23.434 DEBG Received BlocksByRange Request step: 1, start_slot: 728657, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:23.665 DEBG BlocksByRange Response sent returned: 64, requested: 64, current_slot: 2466390, start_slot: 728657, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:25.851 DEBG Received BlocksByRange Request step: 1, start_slot: 728337, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:25.851 DEBG Received BlocksByRange Request step: 1, start_slot: 728401, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:26.094 DEBG BlocksByRange Response sent returned: 62, requested: 64, current_slot: 2466390, start_slot: 728401, msg: Failed to return all requested blocks, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:26.100 DEBG BlocksByRange Response sent returned: 63, requested: 64, current_slot: 2466390, start_slot: 728337, msg: Failed to return all requested blocks, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:31.070 DEBG RPC Error direction: Incoming, score: 0, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, client: Prysm: version: a80b1c252a9b4773493b41999769bf3134ac373f, os_version: unknown, err: Stream Timeout, protocol: beacon_blocks_by_range, service: libp2p Nov 09 01:18:31.070 WARN Timed out to a peer's request. Likely insufficient resources, reduce peer count, service: libp2p Nov 09 01:18:31.085 DEBG RPC Error direction: Incoming, score: 0, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, client: Prysm: version: a80b1c252a9b4773493b41999769bf3134ac373f, os_version: unknown, err: Stream Timeout, protocol: beacon_blocks_by_range, service: libp2p Nov 09 01:18:31.085 WARN Timed out to a peer's request. Likely insufficient resources, reduce peer count, service: libp2p Nov 09 01:18:31.459 DEBG RPC Error direction: Incoming, score: 0, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, client: Prysm: version: a80b1c252a9b4773493b41999769bf3134ac373f, os_version: unknown, err: Stream Timeout, protocol: beacon_blocks_by_range, service: libp2p Nov 09 01:18:31.459 WARN Timed out to a peer's request. Likely insufficient resources, reduce peer count, service: libp2p Nov 09 01:18:34.129 DEBG RPC Error direction: Incoming, score: 0, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, client: Prysm: version: a80b1c252a9b4773493b41999769bf3134ac373f, os_version: unknown, err: Stream Timeout, protocol: beacon_blocks_by_range, service: libp2p Nov 09 01:18:34.130 WARN Timed out to a peer's request. Likely insufficient resources, reduce peer count, service: libp2p Nov 09 01:18:35.686 DEBG Peer Manager disconnecting peer reason: Too many peers, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, service: libp2p ``` This PR is to investigate and correct the issue. ~~My current thoughts are that for some reason we are not closing the streams correctly, or fast enough, or the executor is not registering the closes and waking up.~~ - Pretty sure this is not the case, see message below for a more accurate reason. ~~I've currently added a timeout to stream closures in an attempt to force streams to close and the future to always complete.~~ I removed this
2021-11-16 03:42:25 +00:00
// If there are still requests to send, report that we are in the
// process of closing a connection to the peer and that we are not
// processing these excess requests.
if info.pending_items.back().map(|l| l.close_after()) == Some(false)
{
// if the request was still active, report back to cancel it
More metrics + RPC tweaks (#2041) ## Issue Addressed NA ## Proposed Changes This was mostly done to find the reason why LH was dropping peers from Nimbus. It proved to be useful so I think it's worth it. But there is also some functional stuff here - Add metrics for rpc errors per client, error type and direction - Add metrics for downscoring events per source type, client and penalty type - Add metrics for gossip validation results per client for non-accepted messages - Make the RPC handler return errors and requests/responses in the order we see them - Allow a small burst for the Ping rate limit, from 1 every 5 seconds to 2 every 10 seconds - Send rate limiting errors with a particular code and use that same code to identify them. I picked something different to 128 since that is most likely what other clients are using for their own errors - Remove some unused code in the `PeerAction` and the rpc handler - Remove the unused variant `RateLimited`. tTis was never produced directly, since the only way to get the request's protocol is via de handler. The handler upon receiving from LH a response with an error (rate limited in this case) emits this event with the missing info (It was always like this, just pointing out that we do downscore rate limiting errors regardless of the change) Metrics for Nimbus looked like this: Downscoring events: `increase(libp2p_peer_actions_per_client{client="Nimbus"}[5m])` ![image](https://user-images.githubusercontent.com/26765164/101210880-862bf280-3676-11eb-94c0-399f0bf5aa2e.png) RPC Errors: `increase(libp2p_rpc_errors_per_client{client="Nimbus"}[5m])` ![image](https://user-images.githubusercontent.com/26765164/101210997-ba071800-3676-11eb-847a-f32405ede002.png) Unaccepted gossip message: `increase(gossipsub_unaccepted_messages_per_client{client="Nimbus"}[5m])` ![image](https://user-images.githubusercontent.com/26765164/101211124-f470b500-3676-11eb-9459-132ecff058ec.png)
2020-12-08 03:55:50 +00:00
self.events_out.push(Err(HandlerErr::Inbound {
error: RPCError::HandlerRejected,
More metrics + RPC tweaks (#2041) ## Issue Addressed NA ## Proposed Changes This was mostly done to find the reason why LH was dropping peers from Nimbus. It proved to be useful so I think it's worth it. But there is also some functional stuff here - Add metrics for rpc errors per client, error type and direction - Add metrics for downscoring events per source type, client and penalty type - Add metrics for gossip validation results per client for non-accepted messages - Make the RPC handler return errors and requests/responses in the order we see them - Allow a small burst for the Ping rate limit, from 1 every 5 seconds to 2 every 10 seconds - Send rate limiting errors with a particular code and use that same code to identify them. I picked something different to 128 since that is most likely what other clients are using for their own errors - Remove some unused code in the `PeerAction` and the rpc handler - Remove the unused variant `RateLimited`. tTis was never produced directly, since the only way to get the request's protocol is via de handler. The handler upon receiving from LH a response with an error (rate limited in this case) emits this event with the missing info (It was always like this, just pointing out that we do downscore rate limiting errors regardless of the change) Metrics for Nimbus looked like this: Downscoring events: `increase(libp2p_peer_actions_per_client{client="Nimbus"}[5m])` ![image](https://user-images.githubusercontent.com/26765164/101210880-862bf280-3676-11eb-94c0-399f0bf5aa2e.png) RPC Errors: `increase(libp2p_rpc_errors_per_client{client="Nimbus"}[5m])` ![image](https://user-images.githubusercontent.com/26765164/101210997-ba071800-3676-11eb-847a-f32405ede002.png) Unaccepted gossip message: `increase(gossipsub_unaccepted_messages_per_client{client="Nimbus"}[5m])` ![image](https://user-images.githubusercontent.com/26765164/101211124-f470b500-3676-11eb-9459-132ecff058ec.png)
2020-12-08 03:55:50 +00:00
proto: info.protocol,
id: *id,
}));
}
Testnet compatible network upgrade (#587) * Create libp2p instance * Change logger to stdlog * test_connection initial commit * Add gossipsub test * Delete tests in network crate * Add test module * Clean tests * Remove dependency on discovery * Working publish between 2 nodes TODO: Publish should be called just once * Working 2 peer gossipsub test with additional events * Cleanup test * Add rpc test * Star topology discovery WIP * build_nodes builds and connects n nodes. Increase nodes in gossipsub test * Add unsubscribe method and expose reference to gossipsub object for gossipsub tests * Add gossipsub message forwarding test * Fix gossipsub forward test * Test improvements * Remove discovery tests * Simplify gossipsub forward test topology * Add helper functions for topology building * Clean up tests * Update naming to new network spec * Correct ssz encoding of protocol names * Further additions to network upgrade * Initial network spec update WIP * Temp commit * Builds one side of the streamed RPC responses * Temporary commit * Propagates streaming changes up into message handler * Intermediate network update * Partial update in upgrading to the new network spec * Update dependencies, remove redundant deps * Correct sync manager for block stream handling * Re-write of RPC handler, improves efficiency and corrects bugs * Stream termination update * Completed refactor of rpc handler * Remove crates * Correct compile issues associated with test merge * Build basic tests and testing structure for eth2-libp2p * Enhance RPC tests and add logging * Complete RPC testing framework and STATUS test * Decoding bug fixes, log improvements, stream test * Clean up RPC handler logging * Decoder bug fix, empty block stream test * Add BlocksByRoot RPC test * Add Goodbye RPC test * Syncing and stream handling bug fixes and performance improvements * Applies discv5 bug fixes * Adds DHT IP filtering for lighthouse - currently disabled * Adds randomized network propagation as a CLI arg * Add disconnect functionality * Adds attestation handling and parent lookup * Adds RPC error handling for the sync manager * Allow parent's blocks to be already processed * Update github workflow * Adds reviewer suggestions
2019-11-27 01:47:46 +00:00
}
Stable futures (#879) * Port eth1 lib to use stable futures * Port eth1_test_rig to stable futures * Port eth1 tests to stable futures * Port genesis service to stable futures * Port genesis tests to stable futures * Port beacon_chain to stable futures * Port lcli to stable futures * Fix eth1_test_rig (#1014) * Fix lcli * Port timer to stable futures * Fix timer * Port websocket_server to stable futures * Port notifier to stable futures * Add TODOS * Update hashmap hashset to stable futures * Adds panic test to hashset delay * Port remote_beacon_node to stable futures * Fix lcli merge conflicts * Non rpc stuff compiles * protocol.rs compiles * Port websockets, timer and notifier to stable futures (#1035) * Fix lcli * Port timer to stable futures * Fix timer * Port websocket_server to stable futures * Port notifier to stable futures * Add TODOS * Port remote_beacon_node to stable futures * Partial eth2-libp2p stable future upgrade * Finished first round of fighting RPC types * Further progress towards porting eth2-libp2p adds caching to discovery * Update behaviour * RPC handler to stable futures * Update RPC to master libp2p * Network service additions * Fix the fallback transport construction (#1102) * Correct warning * Remove hashmap delay * Compiling version of eth2-libp2p * Update all crates versions * Fix conversion function and add tests (#1113) * Port validator_client to stable futures (#1114) * Add PH & MS slot clock changes * Account for genesis time * Add progress on duties refactor * Add simple is_aggregator bool to val subscription * Start work on attestation_verification.rs * Add progress on ObservedAttestations * Progress with ObservedAttestations * Fix tests * Add observed attestations to the beacon chain * Add attestation observation to processing code * Add progress on attestation verification * Add first draft of ObservedAttesters * Add more tests * Add observed attesters to beacon chain * Add observers to attestation processing * Add more attestation verification * Create ObservedAggregators map * Remove commented-out code * Add observed aggregators into chain * Add progress * Finish adding features to attestation verification * Ensure beacon chain compiles * Link attn verification into chain * Integrate new attn verification in chain * Remove old attestation processing code * Start trying to fix beacon_chain tests * Split adding into pools into two functions * Add aggregation to harness * Get test harness working again * Adjust the number of aggregators for test harness * Fix edge-case in harness * Integrate new attn processing in network * Fix compile bug in validator_client * Update validator API endpoints * Fix aggreagation in test harness * Fix enum thing * Fix attestation observation bug: * Patch failing API tests * Start adding comments to attestation verification * Remove unused attestation field * Unify "is block known" logic * Update comments * Supress fork choice errors for network processing * Add todos * Tidy * Add gossip attn tests * Disallow test harness to produce old attns * Comment out in-progress tests * Partially address pruning tests * Fix failing store test * Add aggregate tests * Add comments about which spec conditions we check * Dont re-aggregate * Split apart test harness attn production * Fix compile error in network * Make progress on commented-out test * Fix skipping attestation test * Add fork choice verification tests * Tidy attn tests, remove dead code * Remove some accidentally added code * Fix clippy lint * Rename test file * Add block tests, add cheap block proposer check * Rename block testing file * Add observed_block_producers * Tidy * Switch around block signature verification * Finish block testing * Remove gossip from signature tests * First pass of self review * Fix deviation in spec * Update test spec tags * Start moving over to hashset * Finish moving observed attesters to hashmap * Move aggregation pool over to hashmap * Make fc attn borrow again * Fix rest_api compile error * Fix missing comments * Fix monster test * Uncomment increasing slots test * Address remaining comments * Remove unsafe, use cfg test * Remove cfg test flag * Fix dodgy comment * Revert "Update hashmap hashset to stable futures" This reverts commit d432378a3cc5cd67fc29c0b15b96b886c1323554. * Revert "Adds panic test to hashset delay" This reverts commit 281502396fc5b90d9c421a309c2c056982c9525b. * Ported attestation_service * Ported duties_service * Ported fork_service * More ports * Port block_service * Minor fixes * VC compiles * Update TODOS * Borrow self where possible * Ignore aggregates that are already known. * Unify aggregator modulo logic * Fix typo in logs * Refactor validator subscription logic * Avoid reproducing selection proof * Skip HTTP call if no subscriptions * Rename DutyAndState -> DutyAndProof * Tidy logs * Print root as dbg * Fix compile errors in tests * Fix compile error in test * Re-Fix attestation and duties service * Minor fixes Co-authored-by: Paul Hauner <paul@paulhauner.com> * Network crate update to stable futures * Port account_manager to stable futures (#1121) * Port account_manager to stable futures * Run async fns in tokio environment * Port rest_api crate to stable futures (#1118) * Port rest_api lib to stable futures * Reduce tokio features * Update notifier to stable futures * Builder update * Further updates * Convert self referential async functions * stable futures fixes (#1124) * Fix eth1 update functions * Fix genesis and client * Fix beacon node lib * Return appropriate runtimes from environment * Fix test rig * Refactor eth1 service update * Upgrade simulator to stable futures * Lighthouse compiles on stable futures * Remove println debugging statement * Update libp2p service, start rpc test upgrade * Update network crate for new libp2p * Update tokio::codec to futures_codec (#1128) * Further work towards RPC corrections * Correct http timeout and network service select * Use tokio runtime for libp2p * Revert "Update tokio::codec to futures_codec (#1128)" This reverts commit e57aea924acf5cbabdcea18895ac07e38a425ed7. * Upgrade RPC libp2p tests * Upgrade secio fallback test * Upgrade gossipsub examples * Clean up RPC protocol * Test fixes (#1133) * Correct websocket timeout and run on os thread * Fix network test * Clean up PR * Correct tokio tcp move attestation service tests * Upgrade attestation service tests * Correct network test * Correct genesis test * Test corrections * Log info when block is received * Modify logs and update attester service events * Stable futures: fixes to vc, eth1 and account manager (#1142) * Add local testnet scripts * Remove whiteblock script * Rename local testnet script * Move spawns onto handle * Fix VC panic * Initial fix to block production issue * Tidy block producer fix * Tidy further * Add local testnet clean script * Run cargo fmt * Tidy duties service * Tidy fork service * Tidy ForkService * Tidy AttestationService * Tidy notifier * Ensure await is not suppressed in eth1 * Ensure await is not suppressed in account_manager * Use .ok() instead of .unwrap_or(()) * RPC decoding test for proto * Update discv5 and eth2-libp2p deps * Fix lcli double runtime issue (#1144) * Handle stream termination and dialing peer errors * Correct peer_info variant types * Remove unnecessary warnings * Handle subnet unsubscription removal and improve logigng * Add logs around ping * Upgrade discv5 and improve logging * Handle peer connection status for multiple connections * Improve network service logging * Improve logging around peer manager * Upgrade swarm poll centralise peer management * Identify clients on error * Fix `remove_peer` in sync (#1150) * remove_peer removes from all chains * Remove logs * Fix early return from loop * Improved logging, fix panic * Partially correct tests * Stable futures: Vc sync (#1149) * Improve syncing heuristic * Add comments * Use safer method for tolerance * Fix tests * Stable futures: Fix VC bug, update agg pool, add more metrics (#1151) * Expose epoch processing summary * Expose participation metrics to prometheus * Switch to f64 * Reduce precision * Change precision * Expose observed attesters metrics * Add metrics for agg/unagg attn counts * Add metrics for gossip rx * Add metrics for gossip tx * Adds ignored attns to prom * Add attestation timing * Add timer for aggregation pool sig agg * Add write lock timer for agg pool * Add more metrics to agg pool * Change map lock code * Add extra metric to agg pool * Change lock handling in agg pool * Change .write() to .read() * Add another agg pool timer * Fix for is_aggregator * Fix pruning bug Co-authored-by: pawan <pawandhananjay@gmail.com> Co-authored-by: Paul Hauner <paul@paulhauner.com>
2020-05-17 11:16:48 +00:00
}
break;
}
Investigate and correct RPC Response Timeouts (#2804) RPC Responses are for some reason not removing their timeout when they are completing. As an example: ``` Nov 09 01:18:20.256 DEBG Received BlocksByRange Request step: 1, start_slot: 728465, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:20.263 DEBG Received BlocksByRange Request step: 1, start_slot: 728593, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:20.483 DEBG BlocksByRange Response sent returned: 63, requested: 64, current_slot: 2466389, start_slot: 728465, msg: Failed to return all requested blocks, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:20.500 DEBG BlocksByRange Response sent returned: 64, requested: 64, current_slot: 2466389, start_slot: 728593, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:21.068 DEBG Received BlocksByRange Request step: 1, start_slot: 728529, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:21.272 DEBG BlocksByRange Response sent returned: 63, requested: 64, current_slot: 2466389, start_slot: 728529, msg: Failed to return all requested blocks, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:23.434 DEBG Received BlocksByRange Request step: 1, start_slot: 728657, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:23.665 DEBG BlocksByRange Response sent returned: 64, requested: 64, current_slot: 2466390, start_slot: 728657, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:25.851 DEBG Received BlocksByRange Request step: 1, start_slot: 728337, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:25.851 DEBG Received BlocksByRange Request step: 1, start_slot: 728401, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:26.094 DEBG BlocksByRange Response sent returned: 62, requested: 64, current_slot: 2466390, start_slot: 728401, msg: Failed to return all requested blocks, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:26.100 DEBG BlocksByRange Response sent returned: 63, requested: 64, current_slot: 2466390, start_slot: 728337, msg: Failed to return all requested blocks, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:31.070 DEBG RPC Error direction: Incoming, score: 0, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, client: Prysm: version: a80b1c252a9b4773493b41999769bf3134ac373f, os_version: unknown, err: Stream Timeout, protocol: beacon_blocks_by_range, service: libp2p Nov 09 01:18:31.070 WARN Timed out to a peer's request. Likely insufficient resources, reduce peer count, service: libp2p Nov 09 01:18:31.085 DEBG RPC Error direction: Incoming, score: 0, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, client: Prysm: version: a80b1c252a9b4773493b41999769bf3134ac373f, os_version: unknown, err: Stream Timeout, protocol: beacon_blocks_by_range, service: libp2p Nov 09 01:18:31.085 WARN Timed out to a peer's request. Likely insufficient resources, reduce peer count, service: libp2p Nov 09 01:18:31.459 DEBG RPC Error direction: Incoming, score: 0, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, client: Prysm: version: a80b1c252a9b4773493b41999769bf3134ac373f, os_version: unknown, err: Stream Timeout, protocol: beacon_blocks_by_range, service: libp2p Nov 09 01:18:31.459 WARN Timed out to a peer's request. Likely insufficient resources, reduce peer count, service: libp2p Nov 09 01:18:34.129 DEBG RPC Error direction: Incoming, score: 0, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, client: Prysm: version: a80b1c252a9b4773493b41999769bf3134ac373f, os_version: unknown, err: Stream Timeout, protocol: beacon_blocks_by_range, service: libp2p Nov 09 01:18:34.130 WARN Timed out to a peer's request. Likely insufficient resources, reduce peer count, service: libp2p Nov 09 01:18:35.686 DEBG Peer Manager disconnecting peer reason: Too many peers, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, service: libp2p ``` This PR is to investigate and correct the issue. ~~My current thoughts are that for some reason we are not closing the streams correctly, or fast enough, or the executor is not registering the closes and waking up.~~ - Pretty sure this is not the case, see message below for a more accurate reason. ~~I've currently added a timeout to stream closures in an attempt to force streams to close and the future to always complete.~~ I removed this
2021-11-16 03:42:25 +00:00
// This state indicates that there are messages to send back to the peer.
// The future here is built by the `process_inbound_substream` function. The
// output returns a substream and whether it was closed in this operation.
InboundState::Busy(mut fut) => {
Investigate and correct RPC Response Timeouts (#2804) RPC Responses are for some reason not removing their timeout when they are completing. As an example: ``` Nov 09 01:18:20.256 DEBG Received BlocksByRange Request step: 1, start_slot: 728465, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:20.263 DEBG Received BlocksByRange Request step: 1, start_slot: 728593, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:20.483 DEBG BlocksByRange Response sent returned: 63, requested: 64, current_slot: 2466389, start_slot: 728465, msg: Failed to return all requested blocks, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:20.500 DEBG BlocksByRange Response sent returned: 64, requested: 64, current_slot: 2466389, start_slot: 728593, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:21.068 DEBG Received BlocksByRange Request step: 1, start_slot: 728529, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:21.272 DEBG BlocksByRange Response sent returned: 63, requested: 64, current_slot: 2466389, start_slot: 728529, msg: Failed to return all requested blocks, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:23.434 DEBG Received BlocksByRange Request step: 1, start_slot: 728657, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:23.665 DEBG BlocksByRange Response sent returned: 64, requested: 64, current_slot: 2466390, start_slot: 728657, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:25.851 DEBG Received BlocksByRange Request step: 1, start_slot: 728337, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:25.851 DEBG Received BlocksByRange Request step: 1, start_slot: 728401, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:26.094 DEBG BlocksByRange Response sent returned: 62, requested: 64, current_slot: 2466390, start_slot: 728401, msg: Failed to return all requested blocks, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:26.100 DEBG BlocksByRange Response sent returned: 63, requested: 64, current_slot: 2466390, start_slot: 728337, msg: Failed to return all requested blocks, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:31.070 DEBG RPC Error direction: Incoming, score: 0, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, client: Prysm: version: a80b1c252a9b4773493b41999769bf3134ac373f, os_version: unknown, err: Stream Timeout, protocol: beacon_blocks_by_range, service: libp2p Nov 09 01:18:31.070 WARN Timed out to a peer's request. Likely insufficient resources, reduce peer count, service: libp2p Nov 09 01:18:31.085 DEBG RPC Error direction: Incoming, score: 0, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, client: Prysm: version: a80b1c252a9b4773493b41999769bf3134ac373f, os_version: unknown, err: Stream Timeout, protocol: beacon_blocks_by_range, service: libp2p Nov 09 01:18:31.085 WARN Timed out to a peer's request. Likely insufficient resources, reduce peer count, service: libp2p Nov 09 01:18:31.459 DEBG RPC Error direction: Incoming, score: 0, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, client: Prysm: version: a80b1c252a9b4773493b41999769bf3134ac373f, os_version: unknown, err: Stream Timeout, protocol: beacon_blocks_by_range, service: libp2p Nov 09 01:18:31.459 WARN Timed out to a peer's request. Likely insufficient resources, reduce peer count, service: libp2p Nov 09 01:18:34.129 DEBG RPC Error direction: Incoming, score: 0, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, client: Prysm: version: a80b1c252a9b4773493b41999769bf3134ac373f, os_version: unknown, err: Stream Timeout, protocol: beacon_blocks_by_range, service: libp2p Nov 09 01:18:34.130 WARN Timed out to a peer's request. Likely insufficient resources, reduce peer count, service: libp2p Nov 09 01:18:35.686 DEBG Peer Manager disconnecting peer reason: Too many peers, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, service: libp2p ``` This PR is to investigate and correct the issue. ~~My current thoughts are that for some reason we are not closing the streams correctly, or fast enough, or the executor is not registering the closes and waking up.~~ - Pretty sure this is not the case, see message below for a more accurate reason. ~~I've currently added a timeout to stream closures in an attempt to force streams to close and the future to always complete.~~ I removed this
2021-11-16 03:42:25 +00:00
// Check if the future has completed (i.e we have completed sending all our
// pending items)
match fut.poll_unpin(cx) {
Investigate and correct RPC Response Timeouts (#2804) RPC Responses are for some reason not removing their timeout when they are completing. As an example: ``` Nov 09 01:18:20.256 DEBG Received BlocksByRange Request step: 1, start_slot: 728465, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:20.263 DEBG Received BlocksByRange Request step: 1, start_slot: 728593, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:20.483 DEBG BlocksByRange Response sent returned: 63, requested: 64, current_slot: 2466389, start_slot: 728465, msg: Failed to return all requested blocks, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:20.500 DEBG BlocksByRange Response sent returned: 64, requested: 64, current_slot: 2466389, start_slot: 728593, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:21.068 DEBG Received BlocksByRange Request step: 1, start_slot: 728529, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:21.272 DEBG BlocksByRange Response sent returned: 63, requested: 64, current_slot: 2466389, start_slot: 728529, msg: Failed to return all requested blocks, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:23.434 DEBG Received BlocksByRange Request step: 1, start_slot: 728657, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:23.665 DEBG BlocksByRange Response sent returned: 64, requested: 64, current_slot: 2466390, start_slot: 728657, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:25.851 DEBG Received BlocksByRange Request step: 1, start_slot: 728337, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:25.851 DEBG Received BlocksByRange Request step: 1, start_slot: 728401, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:26.094 DEBG BlocksByRange Response sent returned: 62, requested: 64, current_slot: 2466390, start_slot: 728401, msg: Failed to return all requested blocks, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:26.100 DEBG BlocksByRange Response sent returned: 63, requested: 64, current_slot: 2466390, start_slot: 728337, msg: Failed to return all requested blocks, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:31.070 DEBG RPC Error direction: Incoming, score: 0, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, client: Prysm: version: a80b1c252a9b4773493b41999769bf3134ac373f, os_version: unknown, err: Stream Timeout, protocol: beacon_blocks_by_range, service: libp2p Nov 09 01:18:31.070 WARN Timed out to a peer's request. Likely insufficient resources, reduce peer count, service: libp2p Nov 09 01:18:31.085 DEBG RPC Error direction: Incoming, score: 0, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, client: Prysm: version: a80b1c252a9b4773493b41999769bf3134ac373f, os_version: unknown, err: Stream Timeout, protocol: beacon_blocks_by_range, service: libp2p Nov 09 01:18:31.085 WARN Timed out to a peer's request. Likely insufficient resources, reduce peer count, service: libp2p Nov 09 01:18:31.459 DEBG RPC Error direction: Incoming, score: 0, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, client: Prysm: version: a80b1c252a9b4773493b41999769bf3134ac373f, os_version: unknown, err: Stream Timeout, protocol: beacon_blocks_by_range, service: libp2p Nov 09 01:18:31.459 WARN Timed out to a peer's request. Likely insufficient resources, reduce peer count, service: libp2p Nov 09 01:18:34.129 DEBG RPC Error direction: Incoming, score: 0, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, client: Prysm: version: a80b1c252a9b4773493b41999769bf3134ac373f, os_version: unknown, err: Stream Timeout, protocol: beacon_blocks_by_range, service: libp2p Nov 09 01:18:34.130 WARN Timed out to a peer's request. Likely insufficient resources, reduce peer count, service: libp2p Nov 09 01:18:35.686 DEBG Peer Manager disconnecting peer reason: Too many peers, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, service: libp2p ``` This PR is to investigate and correct the issue. ~~My current thoughts are that for some reason we are not closing the streams correctly, or fast enough, or the executor is not registering the closes and waking up.~~ - Pretty sure this is not the case, see message below for a more accurate reason. ~~I've currently added a timeout to stream closures in an attempt to force streams to close and the future to always complete.~~ I removed this
2021-11-16 03:42:25 +00:00
// The pending messages have been sent successfully
Poll::Ready(Ok((substream, substream_was_closed)))
if !substream_was_closed =>
{
// The substream is still active, decrement the remaining
// chunks expected.
info.remaining_chunks = info.remaining_chunks.saturating_sub(1);
// If this substream has not ended, we reset the timer.
// Each chunk is allowed RESPONSE_TIMEOUT to be sent.
if let Some(ref delay_key) = info.delay_key {
self.inbound_substreams_delay
.reset(delay_key, Duration::from_secs(RESPONSE_TIMEOUT));
}
// The stream may be currently idle. Attempt to process more
// elements
if !deactivated && !info.pending_items.is_empty() {
Investigate and correct RPC Response Timeouts (#2804) RPC Responses are for some reason not removing their timeout when they are completing. As an example: ``` Nov 09 01:18:20.256 DEBG Received BlocksByRange Request step: 1, start_slot: 728465, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:20.263 DEBG Received BlocksByRange Request step: 1, start_slot: 728593, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:20.483 DEBG BlocksByRange Response sent returned: 63, requested: 64, current_slot: 2466389, start_slot: 728465, msg: Failed to return all requested blocks, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:20.500 DEBG BlocksByRange Response sent returned: 64, requested: 64, current_slot: 2466389, start_slot: 728593, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:21.068 DEBG Received BlocksByRange Request step: 1, start_slot: 728529, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:21.272 DEBG BlocksByRange Response sent returned: 63, requested: 64, current_slot: 2466389, start_slot: 728529, msg: Failed to return all requested blocks, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:23.434 DEBG Received BlocksByRange Request step: 1, start_slot: 728657, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:23.665 DEBG BlocksByRange Response sent returned: 64, requested: 64, current_slot: 2466390, start_slot: 728657, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:25.851 DEBG Received BlocksByRange Request step: 1, start_slot: 728337, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:25.851 DEBG Received BlocksByRange Request step: 1, start_slot: 728401, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:26.094 DEBG BlocksByRange Response sent returned: 62, requested: 64, current_slot: 2466390, start_slot: 728401, msg: Failed to return all requested blocks, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:26.100 DEBG BlocksByRange Response sent returned: 63, requested: 64, current_slot: 2466390, start_slot: 728337, msg: Failed to return all requested blocks, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:31.070 DEBG RPC Error direction: Incoming, score: 0, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, client: Prysm: version: a80b1c252a9b4773493b41999769bf3134ac373f, os_version: unknown, err: Stream Timeout, protocol: beacon_blocks_by_range, service: libp2p Nov 09 01:18:31.070 WARN Timed out to a peer's request. Likely insufficient resources, reduce peer count, service: libp2p Nov 09 01:18:31.085 DEBG RPC Error direction: Incoming, score: 0, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, client: Prysm: version: a80b1c252a9b4773493b41999769bf3134ac373f, os_version: unknown, err: Stream Timeout, protocol: beacon_blocks_by_range, service: libp2p Nov 09 01:18:31.085 WARN Timed out to a peer's request. Likely insufficient resources, reduce peer count, service: libp2p Nov 09 01:18:31.459 DEBG RPC Error direction: Incoming, score: 0, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, client: Prysm: version: a80b1c252a9b4773493b41999769bf3134ac373f, os_version: unknown, err: Stream Timeout, protocol: beacon_blocks_by_range, service: libp2p Nov 09 01:18:31.459 WARN Timed out to a peer's request. Likely insufficient resources, reduce peer count, service: libp2p Nov 09 01:18:34.129 DEBG RPC Error direction: Incoming, score: 0, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, client: Prysm: version: a80b1c252a9b4773493b41999769bf3134ac373f, os_version: unknown, err: Stream Timeout, protocol: beacon_blocks_by_range, service: libp2p Nov 09 01:18:34.130 WARN Timed out to a peer's request. Likely insufficient resources, reduce peer count, service: libp2p Nov 09 01:18:35.686 DEBG Peer Manager disconnecting peer reason: Too many peers, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, service: libp2p ``` This PR is to investigate and correct the issue. ~~My current thoughts are that for some reason we are not closing the streams correctly, or fast enough, or the executor is not registering the closes and waking up.~~ - Pretty sure this is not the case, see message below for a more accurate reason. ~~I've currently added a timeout to stream closures in an attempt to force streams to close and the future to always complete.~~ I removed this
2021-11-16 03:42:25 +00:00
// Process one more message if one exists.
if let Some(message) = info.pending_items.pop_front() {
// If this is the last chunk, terminate the stream.
let last_chunk = info.remaining_chunks <= 1;
let fut = send_message_to_inbound_substream(
substream, message, last_chunk,
)
.boxed();
// Update the state and try to process this further.
info.state = InboundState::Busy(Box::pin(fut));
}
} else {
Investigate and correct RPC Response Timeouts (#2804) RPC Responses are for some reason not removing their timeout when they are completing. As an example: ``` Nov 09 01:18:20.256 DEBG Received BlocksByRange Request step: 1, start_slot: 728465, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:20.263 DEBG Received BlocksByRange Request step: 1, start_slot: 728593, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:20.483 DEBG BlocksByRange Response sent returned: 63, requested: 64, current_slot: 2466389, start_slot: 728465, msg: Failed to return all requested blocks, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:20.500 DEBG BlocksByRange Response sent returned: 64, requested: 64, current_slot: 2466389, start_slot: 728593, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:21.068 DEBG Received BlocksByRange Request step: 1, start_slot: 728529, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:21.272 DEBG BlocksByRange Response sent returned: 63, requested: 64, current_slot: 2466389, start_slot: 728529, msg: Failed to return all requested blocks, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:23.434 DEBG Received BlocksByRange Request step: 1, start_slot: 728657, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:23.665 DEBG BlocksByRange Response sent returned: 64, requested: 64, current_slot: 2466390, start_slot: 728657, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:25.851 DEBG Received BlocksByRange Request step: 1, start_slot: 728337, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:25.851 DEBG Received BlocksByRange Request step: 1, start_slot: 728401, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:26.094 DEBG BlocksByRange Response sent returned: 62, requested: 64, current_slot: 2466390, start_slot: 728401, msg: Failed to return all requested blocks, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:26.100 DEBG BlocksByRange Response sent returned: 63, requested: 64, current_slot: 2466390, start_slot: 728337, msg: Failed to return all requested blocks, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:31.070 DEBG RPC Error direction: Incoming, score: 0, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, client: Prysm: version: a80b1c252a9b4773493b41999769bf3134ac373f, os_version: unknown, err: Stream Timeout, protocol: beacon_blocks_by_range, service: libp2p Nov 09 01:18:31.070 WARN Timed out to a peer's request. Likely insufficient resources, reduce peer count, service: libp2p Nov 09 01:18:31.085 DEBG RPC Error direction: Incoming, score: 0, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, client: Prysm: version: a80b1c252a9b4773493b41999769bf3134ac373f, os_version: unknown, err: Stream Timeout, protocol: beacon_blocks_by_range, service: libp2p Nov 09 01:18:31.085 WARN Timed out to a peer's request. Likely insufficient resources, reduce peer count, service: libp2p Nov 09 01:18:31.459 DEBG RPC Error direction: Incoming, score: 0, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, client: Prysm: version: a80b1c252a9b4773493b41999769bf3134ac373f, os_version: unknown, err: Stream Timeout, protocol: beacon_blocks_by_range, service: libp2p Nov 09 01:18:31.459 WARN Timed out to a peer's request. Likely insufficient resources, reduce peer count, service: libp2p Nov 09 01:18:34.129 DEBG RPC Error direction: Incoming, score: 0, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, client: Prysm: version: a80b1c252a9b4773493b41999769bf3134ac373f, os_version: unknown, err: Stream Timeout, protocol: beacon_blocks_by_range, service: libp2p Nov 09 01:18:34.130 WARN Timed out to a peer's request. Likely insufficient resources, reduce peer count, service: libp2p Nov 09 01:18:35.686 DEBG Peer Manager disconnecting peer reason: Too many peers, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, service: libp2p ``` This PR is to investigate and correct the issue. ~~My current thoughts are that for some reason we are not closing the streams correctly, or fast enough, or the executor is not registering the closes and waking up.~~ - Pretty sure this is not the case, see message below for a more accurate reason. ~~I've currently added a timeout to stream closures in an attempt to force streams to close and the future to always complete.~~ I removed this
2021-11-16 03:42:25 +00:00
// There is nothing left to process. Set the stream to idle and
// move on to the next one.
info.state = InboundState::Idle(substream);
break;
}
}
Investigate and correct RPC Response Timeouts (#2804) RPC Responses are for some reason not removing their timeout when they are completing. As an example: ``` Nov 09 01:18:20.256 DEBG Received BlocksByRange Request step: 1, start_slot: 728465, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:20.263 DEBG Received BlocksByRange Request step: 1, start_slot: 728593, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:20.483 DEBG BlocksByRange Response sent returned: 63, requested: 64, current_slot: 2466389, start_slot: 728465, msg: Failed to return all requested blocks, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:20.500 DEBG BlocksByRange Response sent returned: 64, requested: 64, current_slot: 2466389, start_slot: 728593, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:21.068 DEBG Received BlocksByRange Request step: 1, start_slot: 728529, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:21.272 DEBG BlocksByRange Response sent returned: 63, requested: 64, current_slot: 2466389, start_slot: 728529, msg: Failed to return all requested blocks, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:23.434 DEBG Received BlocksByRange Request step: 1, start_slot: 728657, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:23.665 DEBG BlocksByRange Response sent returned: 64, requested: 64, current_slot: 2466390, start_slot: 728657, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:25.851 DEBG Received BlocksByRange Request step: 1, start_slot: 728337, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:25.851 DEBG Received BlocksByRange Request step: 1, start_slot: 728401, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:26.094 DEBG BlocksByRange Response sent returned: 62, requested: 64, current_slot: 2466390, start_slot: 728401, msg: Failed to return all requested blocks, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:26.100 DEBG BlocksByRange Response sent returned: 63, requested: 64, current_slot: 2466390, start_slot: 728337, msg: Failed to return all requested blocks, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:31.070 DEBG RPC Error direction: Incoming, score: 0, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, client: Prysm: version: a80b1c252a9b4773493b41999769bf3134ac373f, os_version: unknown, err: Stream Timeout, protocol: beacon_blocks_by_range, service: libp2p Nov 09 01:18:31.070 WARN Timed out to a peer's request. Likely insufficient resources, reduce peer count, service: libp2p Nov 09 01:18:31.085 DEBG RPC Error direction: Incoming, score: 0, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, client: Prysm: version: a80b1c252a9b4773493b41999769bf3134ac373f, os_version: unknown, err: Stream Timeout, protocol: beacon_blocks_by_range, service: libp2p Nov 09 01:18:31.085 WARN Timed out to a peer's request. Likely insufficient resources, reduce peer count, service: libp2p Nov 09 01:18:31.459 DEBG RPC Error direction: Incoming, score: 0, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, client: Prysm: version: a80b1c252a9b4773493b41999769bf3134ac373f, os_version: unknown, err: Stream Timeout, protocol: beacon_blocks_by_range, service: libp2p Nov 09 01:18:31.459 WARN Timed out to a peer's request. Likely insufficient resources, reduce peer count, service: libp2p Nov 09 01:18:34.129 DEBG RPC Error direction: Incoming, score: 0, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, client: Prysm: version: a80b1c252a9b4773493b41999769bf3134ac373f, os_version: unknown, err: Stream Timeout, protocol: beacon_blocks_by_range, service: libp2p Nov 09 01:18:34.130 WARN Timed out to a peer's request. Likely insufficient resources, reduce peer count, service: libp2p Nov 09 01:18:35.686 DEBG Peer Manager disconnecting peer reason: Too many peers, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, service: libp2p ``` This PR is to investigate and correct the issue. ~~My current thoughts are that for some reason we are not closing the streams correctly, or fast enough, or the executor is not registering the closes and waking up.~~ - Pretty sure this is not the case, see message below for a more accurate reason. ~~I've currently added a timeout to stream closures in an attempt to force streams to close and the future to always complete.~~ I removed this
2021-11-16 03:42:25 +00:00
// The pending messages have been sent successfully and the stream has
// terminated
Poll::Ready(Ok((_substream, _substream_was_closed))) => {
// The substream has closed. Remove the timeout related to the
// substream.
substreams_to_remove.push(*id);
if let Some(ref delay_key) = info.delay_key {
self.inbound_substreams_delay.remove(delay_key);
}
// BlocksByRange is the one that typically consumes the most time.
// Its useful to log when the request was completed.
if matches!(info.protocol, Protocol::BlocksByRange) {
debug!(self.log, "BlocksByRange Response sent"; "duration" => Instant::now().duration_since(info.request_start_time).as_secs());
}
// There is nothing more to process on this substream as it has
// been closed. Move on to the next one.
break;
}
// An error occurred when trying to send a response.
// This means we terminate the substream.
Poll::Ready(Err(error)) => {
// Remove the stream timeout from the mapping
substreams_to_remove.push(*id);
if let Some(ref delay_key) = info.delay_key {
self.inbound_substreams_delay.remove(delay_key);
}
// Report the error that occurred during the send process
self.events_out.push(Err(HandlerErr::Inbound {
error,
proto: info.protocol,
id: *id,
}));
if matches!(info.protocol, Protocol::BlocksByRange) {
debug!(self.log, "BlocksByRange Response failed"; "duration" => info.request_start_time.elapsed().as_secs());
}
break;
}
// The sending future has not completed. Leave the state as busy and
// try to progress later.
Poll::Pending => {
info.state = InboundState::Busy(fut);
break;
}
};
}
InboundState::Poisoned => unreachable!("Poisoned inbound substream"),
}
Testnet compatible network upgrade (#587) * Create libp2p instance * Change logger to stdlog * test_connection initial commit * Add gossipsub test * Delete tests in network crate * Add test module * Clean tests * Remove dependency on discovery * Working publish between 2 nodes TODO: Publish should be called just once * Working 2 peer gossipsub test with additional events * Cleanup test * Add rpc test * Star topology discovery WIP * build_nodes builds and connects n nodes. Increase nodes in gossipsub test * Add unsubscribe method and expose reference to gossipsub object for gossipsub tests * Add gossipsub message forwarding test * Fix gossipsub forward test * Test improvements * Remove discovery tests * Simplify gossipsub forward test topology * Add helper functions for topology building * Clean up tests * Update naming to new network spec * Correct ssz encoding of protocol names * Further additions to network upgrade * Initial network spec update WIP * Temp commit * Builds one side of the streamed RPC responses * Temporary commit * Propagates streaming changes up into message handler * Intermediate network update * Partial update in upgrading to the new network spec * Update dependencies, remove redundant deps * Correct sync manager for block stream handling * Re-write of RPC handler, improves efficiency and corrects bugs * Stream termination update * Completed refactor of rpc handler * Remove crates * Correct compile issues associated with test merge * Build basic tests and testing structure for eth2-libp2p * Enhance RPC tests and add logging * Complete RPC testing framework and STATUS test * Decoding bug fixes, log improvements, stream test * Clean up RPC handler logging * Decoder bug fix, empty block stream test * Add BlocksByRoot RPC test * Add Goodbye RPC test * Syncing and stream handling bug fixes and performance improvements * Applies discv5 bug fixes * Adds DHT IP filtering for lighthouse - currently disabled * Adds randomized network propagation as a CLI arg * Add disconnect functionality * Adds attestation handling and parent lookup * Adds RPC error handling for the sync manager * Allow parent's blocks to be already processed * Update github workflow * Adds reviewer suggestions
2019-11-27 01:47:46 +00:00
}
}
Investigate and correct RPC Response Timeouts (#2804) RPC Responses are for some reason not removing their timeout when they are completing. As an example: ``` Nov 09 01:18:20.256 DEBG Received BlocksByRange Request step: 1, start_slot: 728465, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:20.263 DEBG Received BlocksByRange Request step: 1, start_slot: 728593, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:20.483 DEBG BlocksByRange Response sent returned: 63, requested: 64, current_slot: 2466389, start_slot: 728465, msg: Failed to return all requested blocks, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:20.500 DEBG BlocksByRange Response sent returned: 64, requested: 64, current_slot: 2466389, start_slot: 728593, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:21.068 DEBG Received BlocksByRange Request step: 1, start_slot: 728529, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:21.272 DEBG BlocksByRange Response sent returned: 63, requested: 64, current_slot: 2466389, start_slot: 728529, msg: Failed to return all requested blocks, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:23.434 DEBG Received BlocksByRange Request step: 1, start_slot: 728657, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:23.665 DEBG BlocksByRange Response sent returned: 64, requested: 64, current_slot: 2466390, start_slot: 728657, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:25.851 DEBG Received BlocksByRange Request step: 1, start_slot: 728337, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:25.851 DEBG Received BlocksByRange Request step: 1, start_slot: 728401, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:26.094 DEBG BlocksByRange Response sent returned: 62, requested: 64, current_slot: 2466390, start_slot: 728401, msg: Failed to return all requested blocks, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:26.100 DEBG BlocksByRange Response sent returned: 63, requested: 64, current_slot: 2466390, start_slot: 728337, msg: Failed to return all requested blocks, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:31.070 DEBG RPC Error direction: Incoming, score: 0, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, client: Prysm: version: a80b1c252a9b4773493b41999769bf3134ac373f, os_version: unknown, err: Stream Timeout, protocol: beacon_blocks_by_range, service: libp2p Nov 09 01:18:31.070 WARN Timed out to a peer's request. Likely insufficient resources, reduce peer count, service: libp2p Nov 09 01:18:31.085 DEBG RPC Error direction: Incoming, score: 0, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, client: Prysm: version: a80b1c252a9b4773493b41999769bf3134ac373f, os_version: unknown, err: Stream Timeout, protocol: beacon_blocks_by_range, service: libp2p Nov 09 01:18:31.085 WARN Timed out to a peer's request. Likely insufficient resources, reduce peer count, service: libp2p Nov 09 01:18:31.459 DEBG RPC Error direction: Incoming, score: 0, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, client: Prysm: version: a80b1c252a9b4773493b41999769bf3134ac373f, os_version: unknown, err: Stream Timeout, protocol: beacon_blocks_by_range, service: libp2p Nov 09 01:18:31.459 WARN Timed out to a peer's request. Likely insufficient resources, reduce peer count, service: libp2p Nov 09 01:18:34.129 DEBG RPC Error direction: Incoming, score: 0, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, client: Prysm: version: a80b1c252a9b4773493b41999769bf3134ac373f, os_version: unknown, err: Stream Timeout, protocol: beacon_blocks_by_range, service: libp2p Nov 09 01:18:34.130 WARN Timed out to a peer's request. Likely insufficient resources, reduce peer count, service: libp2p Nov 09 01:18:35.686 DEBG Peer Manager disconnecting peer reason: Too many peers, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, service: libp2p ``` This PR is to investigate and correct the issue. ~~My current thoughts are that for some reason we are not closing the streams correctly, or fast enough, or the executor is not registering the closes and waking up.~~ - Pretty sure this is not the case, see message below for a more accurate reason. ~~I've currently added a timeout to stream closures in an attempt to force streams to close and the future to always complete.~~ I removed this
2021-11-16 03:42:25 +00:00
// Remove closed substreams
for inbound_id in substreams_to_remove {
self.inbound_substreams.remove(&inbound_id);
}
Testnet compatible network upgrade (#587) * Create libp2p instance * Change logger to stdlog * test_connection initial commit * Add gossipsub test * Delete tests in network crate * Add test module * Clean tests * Remove dependency on discovery * Working publish between 2 nodes TODO: Publish should be called just once * Working 2 peer gossipsub test with additional events * Cleanup test * Add rpc test * Star topology discovery WIP * build_nodes builds and connects n nodes. Increase nodes in gossipsub test * Add unsubscribe method and expose reference to gossipsub object for gossipsub tests * Add gossipsub message forwarding test * Fix gossipsub forward test * Test improvements * Remove discovery tests * Simplify gossipsub forward test topology * Add helper functions for topology building * Clean up tests * Update naming to new network spec * Correct ssz encoding of protocol names * Further additions to network upgrade * Initial network spec update WIP * Temp commit * Builds one side of the streamed RPC responses * Temporary commit * Propagates streaming changes up into message handler * Intermediate network update * Partial update in upgrading to the new network spec * Update dependencies, remove redundant deps * Correct sync manager for block stream handling * Re-write of RPC handler, improves efficiency and corrects bugs * Stream termination update * Completed refactor of rpc handler * Remove crates * Correct compile issues associated with test merge * Build basic tests and testing structure for eth2-libp2p * Enhance RPC tests and add logging * Complete RPC testing framework and STATUS test * Decoding bug fixes, log improvements, stream test * Clean up RPC handler logging * Decoder bug fix, empty block stream test * Add BlocksByRoot RPC test * Add Goodbye RPC test * Syncing and stream handling bug fixes and performance improvements * Applies discv5 bug fixes * Adds DHT IP filtering for lighthouse - currently disabled * Adds randomized network propagation as a CLI arg * Add disconnect functionality * Adds attestation handling and parent lookup * Adds RPC error handling for the sync manager * Allow parent's blocks to be already processed * Update github workflow * Adds reviewer suggestions
2019-11-27 01:47:46 +00:00
// drive outbound streams that need to be processed
for outbound_id in self.outbound_substreams.keys().copied().collect::<Vec<_>>() {
// get the state and mark it as poisoned
let (mut entry, state) = match self.outbound_substreams.entry(outbound_id) {
Testnet compatible network upgrade (#587) * Create libp2p instance * Change logger to stdlog * test_connection initial commit * Add gossipsub test * Delete tests in network crate * Add test module * Clean tests * Remove dependency on discovery * Working publish between 2 nodes TODO: Publish should be called just once * Working 2 peer gossipsub test with additional events * Cleanup test * Add rpc test * Star topology discovery WIP * build_nodes builds and connects n nodes. Increase nodes in gossipsub test * Add unsubscribe method and expose reference to gossipsub object for gossipsub tests * Add gossipsub message forwarding test * Fix gossipsub forward test * Test improvements * Remove discovery tests * Simplify gossipsub forward test topology * Add helper functions for topology building * Clean up tests * Update naming to new network spec * Correct ssz encoding of protocol names * Further additions to network upgrade * Initial network spec update WIP * Temp commit * Builds one side of the streamed RPC responses * Temporary commit * Propagates streaming changes up into message handler * Intermediate network update * Partial update in upgrading to the new network spec * Update dependencies, remove redundant deps * Correct sync manager for block stream handling * Re-write of RPC handler, improves efficiency and corrects bugs * Stream termination update * Completed refactor of rpc handler * Remove crates * Correct compile issues associated with test merge * Build basic tests and testing structure for eth2-libp2p * Enhance RPC tests and add logging * Complete RPC testing framework and STATUS test * Decoding bug fixes, log improvements, stream test * Clean up RPC handler logging * Decoder bug fix, empty block stream test * Add BlocksByRoot RPC test * Add Goodbye RPC test * Syncing and stream handling bug fixes and performance improvements * Applies discv5 bug fixes * Adds DHT IP filtering for lighthouse - currently disabled * Adds randomized network propagation as a CLI arg * Add disconnect functionality * Adds attestation handling and parent lookup * Adds RPC error handling for the sync manager * Allow parent's blocks to be already processed * Update github workflow * Adds reviewer suggestions
2019-11-27 01:47:46 +00:00
Entry::Occupied(mut entry) => {
let state = std::mem::replace(
&mut entry.get_mut().state,
Testnet compatible network upgrade (#587) * Create libp2p instance * Change logger to stdlog * test_connection initial commit * Add gossipsub test * Delete tests in network crate * Add test module * Clean tests * Remove dependency on discovery * Working publish between 2 nodes TODO: Publish should be called just once * Working 2 peer gossipsub test with additional events * Cleanup test * Add rpc test * Star topology discovery WIP * build_nodes builds and connects n nodes. Increase nodes in gossipsub test * Add unsubscribe method and expose reference to gossipsub object for gossipsub tests * Add gossipsub message forwarding test * Fix gossipsub forward test * Test improvements * Remove discovery tests * Simplify gossipsub forward test topology * Add helper functions for topology building * Clean up tests * Update naming to new network spec * Correct ssz encoding of protocol names * Further additions to network upgrade * Initial network spec update WIP * Temp commit * Builds one side of the streamed RPC responses * Temporary commit * Propagates streaming changes up into message handler * Intermediate network update * Partial update in upgrading to the new network spec * Update dependencies, remove redundant deps * Correct sync manager for block stream handling * Re-write of RPC handler, improves efficiency and corrects bugs * Stream termination update * Completed refactor of rpc handler * Remove crates * Correct compile issues associated with test merge * Build basic tests and testing structure for eth2-libp2p * Enhance RPC tests and add logging * Complete RPC testing framework and STATUS test * Decoding bug fixes, log improvements, stream test * Clean up RPC handler logging * Decoder bug fix, empty block stream test * Add BlocksByRoot RPC test * Add Goodbye RPC test * Syncing and stream handling bug fixes and performance improvements * Applies discv5 bug fixes * Adds DHT IP filtering for lighthouse - currently disabled * Adds randomized network propagation as a CLI arg * Add disconnect functionality * Adds attestation handling and parent lookup * Adds RPC error handling for the sync manager * Allow parent's blocks to be already processed * Update github workflow * Adds reviewer suggestions
2019-11-27 01:47:46 +00:00
OutboundSubstreamState::Poisoned,
);
(entry, state)
}
Entry::Vacant(_) => unreachable!(),
};
match state {
OutboundSubstreamState::RequestPendingResponse {
substream,
request: _,
} if deactivated => {
// the handler is deactivated. Close the stream
entry.get_mut().state = OutboundSubstreamState::Closing(substream);
More metrics + RPC tweaks (#2041) ## Issue Addressed NA ## Proposed Changes This was mostly done to find the reason why LH was dropping peers from Nimbus. It proved to be useful so I think it's worth it. But there is also some functional stuff here - Add metrics for rpc errors per client, error type and direction - Add metrics for downscoring events per source type, client and penalty type - Add metrics for gossip validation results per client for non-accepted messages - Make the RPC handler return errors and requests/responses in the order we see them - Allow a small burst for the Ping rate limit, from 1 every 5 seconds to 2 every 10 seconds - Send rate limiting errors with a particular code and use that same code to identify them. I picked something different to 128 since that is most likely what other clients are using for their own errors - Remove some unused code in the `PeerAction` and the rpc handler - Remove the unused variant `RateLimited`. tTis was never produced directly, since the only way to get the request's protocol is via de handler. The handler upon receiving from LH a response with an error (rate limited in this case) emits this event with the missing info (It was always like this, just pointing out that we do downscore rate limiting errors regardless of the change) Metrics for Nimbus looked like this: Downscoring events: `increase(libp2p_peer_actions_per_client{client="Nimbus"}[5m])` ![image](https://user-images.githubusercontent.com/26765164/101210880-862bf280-3676-11eb-94c0-399f0bf5aa2e.png) RPC Errors: `increase(libp2p_rpc_errors_per_client{client="Nimbus"}[5m])` ![image](https://user-images.githubusercontent.com/26765164/101210997-ba071800-3676-11eb-847a-f32405ede002.png) Unaccepted gossip message: `increase(gossipsub_unaccepted_messages_per_client{client="Nimbus"}[5m])` ![image](https://user-images.githubusercontent.com/26765164/101211124-f470b500-3676-11eb-9459-132ecff058ec.png)
2020-12-08 03:55:50 +00:00
self.events_out.push(Err(HandlerErr::Outbound {
error: RPCError::HandlerRejected,
More metrics + RPC tweaks (#2041) ## Issue Addressed NA ## Proposed Changes This was mostly done to find the reason why LH was dropping peers from Nimbus. It proved to be useful so I think it's worth it. But there is also some functional stuff here - Add metrics for rpc errors per client, error type and direction - Add metrics for downscoring events per source type, client and penalty type - Add metrics for gossip validation results per client for non-accepted messages - Make the RPC handler return errors and requests/responses in the order we see them - Allow a small burst for the Ping rate limit, from 1 every 5 seconds to 2 every 10 seconds - Send rate limiting errors with a particular code and use that same code to identify them. I picked something different to 128 since that is most likely what other clients are using for their own errors - Remove some unused code in the `PeerAction` and the rpc handler - Remove the unused variant `RateLimited`. tTis was never produced directly, since the only way to get the request's protocol is via de handler. The handler upon receiving from LH a response with an error (rate limited in this case) emits this event with the missing info (It was always like this, just pointing out that we do downscore rate limiting errors regardless of the change) Metrics for Nimbus looked like this: Downscoring events: `increase(libp2p_peer_actions_per_client{client="Nimbus"}[5m])` ![image](https://user-images.githubusercontent.com/26765164/101210880-862bf280-3676-11eb-94c0-399f0bf5aa2e.png) RPC Errors: `increase(libp2p_rpc_errors_per_client{client="Nimbus"}[5m])` ![image](https://user-images.githubusercontent.com/26765164/101210997-ba071800-3676-11eb-847a-f32405ede002.png) Unaccepted gossip message: `increase(gossipsub_unaccepted_messages_per_client{client="Nimbus"}[5m])` ![image](https://user-images.githubusercontent.com/26765164/101211124-f470b500-3676-11eb-9459-132ecff058ec.png)
2020-12-08 03:55:50 +00:00
proto: entry.get().proto,
id: entry.get().req_id,
}))
}
OutboundSubstreamState::RequestPendingResponse {
mut substream,
request,
} => match substream.poll_next_unpin(cx) {
Poll::Ready(Some(Ok(response))) => {
if request.expected_responses() > 1 && !response.close_after() {
let substream_entry = entry.get_mut();
let delay_key = &substream_entry.delay_key;
// chunks left after this one
let remaining_chunks = substream_entry
.remaining_chunks
.map(|count| count.saturating_sub(1))
.unwrap_or_else(|| 0);
if remaining_chunks == 0 {
// this is the last expected message, close the stream as all expected chunks have been received
substream_entry.state = OutboundSubstreamState::Closing(substream);
} else {
// If the response chunk was expected update the remaining number of chunks expected and reset the Timeout
substream_entry.state =
OutboundSubstreamState::RequestPendingResponse {
substream,
request,
};
substream_entry.remaining_chunks = Some(remaining_chunks);
self.outbound_substreams_delay
.reset(delay_key, Duration::from_secs(RESPONSE_TIMEOUT));
Testnet compatible network upgrade (#587) * Create libp2p instance * Change logger to stdlog * test_connection initial commit * Add gossipsub test * Delete tests in network crate * Add test module * Clean tests * Remove dependency on discovery * Working publish between 2 nodes TODO: Publish should be called just once * Working 2 peer gossipsub test with additional events * Cleanup test * Add rpc test * Star topology discovery WIP * build_nodes builds and connects n nodes. Increase nodes in gossipsub test * Add unsubscribe method and expose reference to gossipsub object for gossipsub tests * Add gossipsub message forwarding test * Fix gossipsub forward test * Test improvements * Remove discovery tests * Simplify gossipsub forward test topology * Add helper functions for topology building * Clean up tests * Update naming to new network spec * Correct ssz encoding of protocol names * Further additions to network upgrade * Initial network spec update WIP * Temp commit * Builds one side of the streamed RPC responses * Temporary commit * Propagates streaming changes up into message handler * Intermediate network update * Partial update in upgrading to the new network spec * Update dependencies, remove redundant deps * Correct sync manager for block stream handling * Re-write of RPC handler, improves efficiency and corrects bugs * Stream termination update * Completed refactor of rpc handler * Remove crates * Correct compile issues associated with test merge * Build basic tests and testing structure for eth2-libp2p * Enhance RPC tests and add logging * Complete RPC testing framework and STATUS test * Decoding bug fixes, log improvements, stream test * Clean up RPC handler logging * Decoder bug fix, empty block stream test * Add BlocksByRoot RPC test * Add Goodbye RPC test * Syncing and stream handling bug fixes and performance improvements * Applies discv5 bug fixes * Adds DHT IP filtering for lighthouse - currently disabled * Adds randomized network propagation as a CLI arg * Add disconnect functionality * Adds attestation handling and parent lookup * Adds RPC error handling for the sync manager * Allow parent's blocks to be already processed * Update github workflow * Adds reviewer suggestions
2019-11-27 01:47:46 +00:00
}
} else {
// either this is a single response request or this response closes the
// stream
entry.get_mut().state = OutboundSubstreamState::Closing(substream);
}
// Check what type of response we got and report it accordingly
let id = entry.get().req_id;
let proto = entry.get().proto;
let received = match response {
RPCCodedResponse::StreamTermination(t) => {
Ok(RPCReceived::EndOfStream(id, t))
Testnet compatible network upgrade (#587) * Create libp2p instance * Change logger to stdlog * test_connection initial commit * Add gossipsub test * Delete tests in network crate * Add test module * Clean tests * Remove dependency on discovery * Working publish between 2 nodes TODO: Publish should be called just once * Working 2 peer gossipsub test with additional events * Cleanup test * Add rpc test * Star topology discovery WIP * build_nodes builds and connects n nodes. Increase nodes in gossipsub test * Add unsubscribe method and expose reference to gossipsub object for gossipsub tests * Add gossipsub message forwarding test * Fix gossipsub forward test * Test improvements * Remove discovery tests * Simplify gossipsub forward test topology * Add helper functions for topology building * Clean up tests * Update naming to new network spec * Correct ssz encoding of protocol names * Further additions to network upgrade * Initial network spec update WIP * Temp commit * Builds one side of the streamed RPC responses * Temporary commit * Propagates streaming changes up into message handler * Intermediate network update * Partial update in upgrading to the new network spec * Update dependencies, remove redundant deps * Correct sync manager for block stream handling * Re-write of RPC handler, improves efficiency and corrects bugs * Stream termination update * Completed refactor of rpc handler * Remove crates * Correct compile issues associated with test merge * Build basic tests and testing structure for eth2-libp2p * Enhance RPC tests and add logging * Complete RPC testing framework and STATUS test * Decoding bug fixes, log improvements, stream test * Clean up RPC handler logging * Decoder bug fix, empty block stream test * Add BlocksByRoot RPC test * Add Goodbye RPC test * Syncing and stream handling bug fixes and performance improvements * Applies discv5 bug fixes * Adds DHT IP filtering for lighthouse - currently disabled * Adds randomized network propagation as a CLI arg * Add disconnect functionality * Adds attestation handling and parent lookup * Adds RPC error handling for the sync manager * Allow parent's blocks to be already processed * Update github workflow * Adds reviewer suggestions
2019-11-27 01:47:46 +00:00
}
RPCCodedResponse::Success(resp) => Ok(RPCReceived::Response(id, resp)),
RPCCodedResponse::Error(ref code, ref r) => Err(HandlerErr::Outbound {
id,
proto,
error: RPCError::ErrorResponse(*code, r.to_string()),
}),
};
return Poll::Ready(ProtocolsHandlerEvent::Custom(received));
}
Poll::Ready(None) => {
// stream closed
// if we expected multiple streams send a stream termination,
// else report the stream terminating only.
//trace!(self.log, "RPC Response - stream closed by remote");
// drop the stream
let delay_key = &entry.get().delay_key;
let request_id = entry.get().req_id;
self.outbound_substreams_delay.remove(delay_key);
entry.remove_entry();
// notify the application error
if request.expected_responses() > 1 {
// return an end of stream result
return Poll::Ready(ProtocolsHandlerEvent::Custom(Ok(
RPCReceived::EndOfStream(request_id, request.stream_termination()),
)));
}
// else we return an error, stream should not have closed early.
let outbound_err = HandlerErr::Outbound {
id: request_id,
proto: request.protocol(),
error: RPCError::IncompleteStream,
};
return Poll::Ready(ProtocolsHandlerEvent::Custom(Err(outbound_err)));
}
Poll::Pending => {
entry.get_mut().state =
OutboundSubstreamState::RequestPendingResponse { substream, request }
}
Poll::Ready(Some(Err(e))) => {
// drop the stream
let delay_key = &entry.get().delay_key;
self.outbound_substreams_delay.remove(delay_key);
let outbound_err = HandlerErr::Outbound {
id: entry.get().req_id,
proto: entry.get().proto,
error: e,
};
entry.remove_entry();
return Poll::Ready(ProtocolsHandlerEvent::Custom(Err(outbound_err)));
}
},
OutboundSubstreamState::Closing(mut substream) => {
match Sink::poll_close(Pin::new(&mut substream), cx) {
Poll::Ready(_) => {
// drop the stream and its corresponding timeout
let delay_key = &entry.get().delay_key;
let protocol = entry.get().proto;
let request_id = entry.get().req_id;
self.outbound_substreams_delay.remove(delay_key);
entry.remove_entry();
// report the stream termination to the user
//
// Streams can be terminated here if a responder tries to
// continue sending responses beyond what we would expect. Here
// we simply terminate the stream and report a stream
// termination to the application
let termination = match protocol {
Protocol::BlocksByRange => Some(ResponseTermination::BlocksByRange),
Protocol::BlocksByRoot => Some(ResponseTermination::BlocksByRoot),
_ => None, // all other protocols are do not have multiple responses and we do not inform the user, we simply drop the stream.
};
if let Some(termination) = termination {
return Poll::Ready(ProtocolsHandlerEvent::Custom(Ok(
RPCReceived::EndOfStream(request_id, termination),
)));
Testnet compatible network upgrade (#587) * Create libp2p instance * Change logger to stdlog * test_connection initial commit * Add gossipsub test * Delete tests in network crate * Add test module * Clean tests * Remove dependency on discovery * Working publish between 2 nodes TODO: Publish should be called just once * Working 2 peer gossipsub test with additional events * Cleanup test * Add rpc test * Star topology discovery WIP * build_nodes builds and connects n nodes. Increase nodes in gossipsub test * Add unsubscribe method and expose reference to gossipsub object for gossipsub tests * Add gossipsub message forwarding test * Fix gossipsub forward test * Test improvements * Remove discovery tests * Simplify gossipsub forward test topology * Add helper functions for topology building * Clean up tests * Update naming to new network spec * Correct ssz encoding of protocol names * Further additions to network upgrade * Initial network spec update WIP * Temp commit * Builds one side of the streamed RPC responses * Temporary commit * Propagates streaming changes up into message handler * Intermediate network update * Partial update in upgrading to the new network spec * Update dependencies, remove redundant deps * Correct sync manager for block stream handling * Re-write of RPC handler, improves efficiency and corrects bugs * Stream termination update * Completed refactor of rpc handler * Remove crates * Correct compile issues associated with test merge * Build basic tests and testing structure for eth2-libp2p * Enhance RPC tests and add logging * Complete RPC testing framework and STATUS test * Decoding bug fixes, log improvements, stream test * Clean up RPC handler logging * Decoder bug fix, empty block stream test * Add BlocksByRoot RPC test * Add Goodbye RPC test * Syncing and stream handling bug fixes and performance improvements * Applies discv5 bug fixes * Adds DHT IP filtering for lighthouse - currently disabled * Adds randomized network propagation as a CLI arg * Add disconnect functionality * Adds attestation handling and parent lookup * Adds RPC error handling for the sync manager * Allow parent's blocks to be already processed * Update github workflow * Adds reviewer suggestions
2019-11-27 01:47:46 +00:00
}
Stable futures (#879) * Port eth1 lib to use stable futures * Port eth1_test_rig to stable futures * Port eth1 tests to stable futures * Port genesis service to stable futures * Port genesis tests to stable futures * Port beacon_chain to stable futures * Port lcli to stable futures * Fix eth1_test_rig (#1014) * Fix lcli * Port timer to stable futures * Fix timer * Port websocket_server to stable futures * Port notifier to stable futures * Add TODOS * Update hashmap hashset to stable futures * Adds panic test to hashset delay * Port remote_beacon_node to stable futures * Fix lcli merge conflicts * Non rpc stuff compiles * protocol.rs compiles * Port websockets, timer and notifier to stable futures (#1035) * Fix lcli * Port timer to stable futures * Fix timer * Port websocket_server to stable futures * Port notifier to stable futures * Add TODOS * Port remote_beacon_node to stable futures * Partial eth2-libp2p stable future upgrade * Finished first round of fighting RPC types * Further progress towards porting eth2-libp2p adds caching to discovery * Update behaviour * RPC handler to stable futures * Update RPC to master libp2p * Network service additions * Fix the fallback transport construction (#1102) * Correct warning * Remove hashmap delay * Compiling version of eth2-libp2p * Update all crates versions * Fix conversion function and add tests (#1113) * Port validator_client to stable futures (#1114) * Add PH & MS slot clock changes * Account for genesis time * Add progress on duties refactor * Add simple is_aggregator bool to val subscription * Start work on attestation_verification.rs * Add progress on ObservedAttestations * Progress with ObservedAttestations * Fix tests * Add observed attestations to the beacon chain * Add attestation observation to processing code * Add progress on attestation verification * Add first draft of ObservedAttesters * Add more tests * Add observed attesters to beacon chain * Add observers to attestation processing * Add more attestation verification * Create ObservedAggregators map * Remove commented-out code * Add observed aggregators into chain * Add progress * Finish adding features to attestation verification * Ensure beacon chain compiles * Link attn verification into chain * Integrate new attn verification in chain * Remove old attestation processing code * Start trying to fix beacon_chain tests * Split adding into pools into two functions * Add aggregation to harness * Get test harness working again * Adjust the number of aggregators for test harness * Fix edge-case in harness * Integrate new attn processing in network * Fix compile bug in validator_client * Update validator API endpoints * Fix aggreagation in test harness * Fix enum thing * Fix attestation observation bug: * Patch failing API tests * Start adding comments to attestation verification * Remove unused attestation field * Unify "is block known" logic * Update comments * Supress fork choice errors for network processing * Add todos * Tidy * Add gossip attn tests * Disallow test harness to produce old attns * Comment out in-progress tests * Partially address pruning tests * Fix failing store test * Add aggregate tests * Add comments about which spec conditions we check * Dont re-aggregate * Split apart test harness attn production * Fix compile error in network * Make progress on commented-out test * Fix skipping attestation test * Add fork choice verification tests * Tidy attn tests, remove dead code * Remove some accidentally added code * Fix clippy lint * Rename test file * Add block tests, add cheap block proposer check * Rename block testing file * Add observed_block_producers * Tidy * Switch around block signature verification * Finish block testing * Remove gossip from signature tests * First pass of self review * Fix deviation in spec * Update test spec tags * Start moving over to hashset * Finish moving observed attesters to hashmap * Move aggregation pool over to hashmap * Make fc attn borrow again * Fix rest_api compile error * Fix missing comments * Fix monster test * Uncomment increasing slots test * Address remaining comments * Remove unsafe, use cfg test * Remove cfg test flag * Fix dodgy comment * Revert "Update hashmap hashset to stable futures" This reverts commit d432378a3cc5cd67fc29c0b15b96b886c1323554. * Revert "Adds panic test to hashset delay" This reverts commit 281502396fc5b90d9c421a309c2c056982c9525b. * Ported attestation_service * Ported duties_service * Ported fork_service * More ports * Port block_service * Minor fixes * VC compiles * Update TODOS * Borrow self where possible * Ignore aggregates that are already known. * Unify aggregator modulo logic * Fix typo in logs * Refactor validator subscription logic * Avoid reproducing selection proof * Skip HTTP call if no subscriptions * Rename DutyAndState -> DutyAndProof * Tidy logs * Print root as dbg * Fix compile errors in tests * Fix compile error in test * Re-Fix attestation and duties service * Minor fixes Co-authored-by: Paul Hauner <paul@paulhauner.com> * Network crate update to stable futures * Port account_manager to stable futures (#1121) * Port account_manager to stable futures * Run async fns in tokio environment * Port rest_api crate to stable futures (#1118) * Port rest_api lib to stable futures * Reduce tokio features * Update notifier to stable futures * Builder update * Further updates * Convert self referential async functions * stable futures fixes (#1124) * Fix eth1 update functions * Fix genesis and client * Fix beacon node lib * Return appropriate runtimes from environment * Fix test rig * Refactor eth1 service update * Upgrade simulator to stable futures * Lighthouse compiles on stable futures * Remove println debugging statement * Update libp2p service, start rpc test upgrade * Update network crate for new libp2p * Update tokio::codec to futures_codec (#1128) * Further work towards RPC corrections * Correct http timeout and network service select * Use tokio runtime for libp2p * Revert "Update tokio::codec to futures_codec (#1128)" This reverts commit e57aea924acf5cbabdcea18895ac07e38a425ed7. * Upgrade RPC libp2p tests * Upgrade secio fallback test * Upgrade gossipsub examples * Clean up RPC protocol * Test fixes (#1133) * Correct websocket timeout and run on os thread * Fix network test * Clean up PR * Correct tokio tcp move attestation service tests * Upgrade attestation service tests * Correct network test * Correct genesis test * Test corrections * Log info when block is received * Modify logs and update attester service events * Stable futures: fixes to vc, eth1 and account manager (#1142) * Add local testnet scripts * Remove whiteblock script * Rename local testnet script * Move spawns onto handle * Fix VC panic * Initial fix to block production issue * Tidy block producer fix * Tidy further * Add local testnet clean script * Run cargo fmt * Tidy duties service * Tidy fork service * Tidy ForkService * Tidy AttestationService * Tidy notifier * Ensure await is not suppressed in eth1 * Ensure await is not suppressed in account_manager * Use .ok() instead of .unwrap_or(()) * RPC decoding test for proto * Update discv5 and eth2-libp2p deps * Fix lcli double runtime issue (#1144) * Handle stream termination and dialing peer errors * Correct peer_info variant types * Remove unnecessary warnings * Handle subnet unsubscription removal and improve logigng * Add logs around ping * Upgrade discv5 and improve logging * Handle peer connection status for multiple connections * Improve network service logging * Improve logging around peer manager * Upgrade swarm poll centralise peer management * Identify clients on error * Fix `remove_peer` in sync (#1150) * remove_peer removes from all chains * Remove logs * Fix early return from loop * Improved logging, fix panic * Partially correct tests * Stable futures: Vc sync (#1149) * Improve syncing heuristic * Add comments * Use safer method for tolerance * Fix tests * Stable futures: Fix VC bug, update agg pool, add more metrics (#1151) * Expose epoch processing summary * Expose participation metrics to prometheus * Switch to f64 * Reduce precision * Change precision * Expose observed attesters metrics * Add metrics for agg/unagg attn counts * Add metrics for gossip rx * Add metrics for gossip tx * Adds ignored attns to prom * Add attestation timing * Add timer for aggregation pool sig agg * Add write lock timer for agg pool * Add more metrics to agg pool * Change map lock code * Add extra metric to agg pool * Change lock handling in agg pool * Change .write() to .read() * Add another agg pool timer * Fix for is_aggregator * Fix pruning bug Co-authored-by: pawan <pawandhananjay@gmail.com> Co-authored-by: Paul Hauner <paul@paulhauner.com>
2020-05-17 11:16:48 +00:00
}
Poll::Pending => {
entry.get_mut().state = OutboundSubstreamState::Closing(substream);
}
}
Testnet compatible network upgrade (#587) * Create libp2p instance * Change logger to stdlog * test_connection initial commit * Add gossipsub test * Delete tests in network crate * Add test module * Clean tests * Remove dependency on discovery * Working publish between 2 nodes TODO: Publish should be called just once * Working 2 peer gossipsub test with additional events * Cleanup test * Add rpc test * Star topology discovery WIP * build_nodes builds and connects n nodes. Increase nodes in gossipsub test * Add unsubscribe method and expose reference to gossipsub object for gossipsub tests * Add gossipsub message forwarding test * Fix gossipsub forward test * Test improvements * Remove discovery tests * Simplify gossipsub forward test topology * Add helper functions for topology building * Clean up tests * Update naming to new network spec * Correct ssz encoding of protocol names * Further additions to network upgrade * Initial network spec update WIP * Temp commit * Builds one side of the streamed RPC responses * Temporary commit * Propagates streaming changes up into message handler * Intermediate network update * Partial update in upgrading to the new network spec * Update dependencies, remove redundant deps * Correct sync manager for block stream handling * Re-write of RPC handler, improves efficiency and corrects bugs * Stream termination update * Completed refactor of rpc handler * Remove crates * Correct compile issues associated with test merge * Build basic tests and testing structure for eth2-libp2p * Enhance RPC tests and add logging * Complete RPC testing framework and STATUS test * Decoding bug fixes, log improvements, stream test * Clean up RPC handler logging * Decoder bug fix, empty block stream test * Add BlocksByRoot RPC test * Add Goodbye RPC test * Syncing and stream handling bug fixes and performance improvements * Applies discv5 bug fixes * Adds DHT IP filtering for lighthouse - currently disabled * Adds randomized network propagation as a CLI arg * Add disconnect functionality * Adds attestation handling and parent lookup * Adds RPC error handling for the sync manager * Allow parent's blocks to be already processed * Update github workflow * Adds reviewer suggestions
2019-11-27 01:47:46 +00:00
}
OutboundSubstreamState::Poisoned => {
crit!(self.log, "Poisoned outbound substream");
unreachable!("Coding Error: Outbound substream is poisoned")
}
}
}
2019-07-06 13:43:44 +00:00
// establish outbound substreams
if !self.dial_queue.is_empty() && self.dial_negotiated < self.max_dial_negotiated {
self.dial_negotiated += 1;
let (id, req) = self.dial_queue.remove(0);
self.dial_queue.shrink_to_fit();
Stable futures (#879) * Port eth1 lib to use stable futures * Port eth1_test_rig to stable futures * Port eth1 tests to stable futures * Port genesis service to stable futures * Port genesis tests to stable futures * Port beacon_chain to stable futures * Port lcli to stable futures * Fix eth1_test_rig (#1014) * Fix lcli * Port timer to stable futures * Fix timer * Port websocket_server to stable futures * Port notifier to stable futures * Add TODOS * Update hashmap hashset to stable futures * Adds panic test to hashset delay * Port remote_beacon_node to stable futures * Fix lcli merge conflicts * Non rpc stuff compiles * protocol.rs compiles * Port websockets, timer and notifier to stable futures (#1035) * Fix lcli * Port timer to stable futures * Fix timer * Port websocket_server to stable futures * Port notifier to stable futures * Add TODOS * Port remote_beacon_node to stable futures * Partial eth2-libp2p stable future upgrade * Finished first round of fighting RPC types * Further progress towards porting eth2-libp2p adds caching to discovery * Update behaviour * RPC handler to stable futures * Update RPC to master libp2p * Network service additions * Fix the fallback transport construction (#1102) * Correct warning * Remove hashmap delay * Compiling version of eth2-libp2p * Update all crates versions * Fix conversion function and add tests (#1113) * Port validator_client to stable futures (#1114) * Add PH & MS slot clock changes * Account for genesis time * Add progress on duties refactor * Add simple is_aggregator bool to val subscription * Start work on attestation_verification.rs * Add progress on ObservedAttestations * Progress with ObservedAttestations * Fix tests * Add observed attestations to the beacon chain * Add attestation observation to processing code * Add progress on attestation verification * Add first draft of ObservedAttesters * Add more tests * Add observed attesters to beacon chain * Add observers to attestation processing * Add more attestation verification * Create ObservedAggregators map * Remove commented-out code * Add observed aggregators into chain * Add progress * Finish adding features to attestation verification * Ensure beacon chain compiles * Link attn verification into chain * Integrate new attn verification in chain * Remove old attestation processing code * Start trying to fix beacon_chain tests * Split adding into pools into two functions * Add aggregation to harness * Get test harness working again * Adjust the number of aggregators for test harness * Fix edge-case in harness * Integrate new attn processing in network * Fix compile bug in validator_client * Update validator API endpoints * Fix aggreagation in test harness * Fix enum thing * Fix attestation observation bug: * Patch failing API tests * Start adding comments to attestation verification * Remove unused attestation field * Unify "is block known" logic * Update comments * Supress fork choice errors for network processing * Add todos * Tidy * Add gossip attn tests * Disallow test harness to produce old attns * Comment out in-progress tests * Partially address pruning tests * Fix failing store test * Add aggregate tests * Add comments about which spec conditions we check * Dont re-aggregate * Split apart test harness attn production * Fix compile error in network * Make progress on commented-out test * Fix skipping attestation test * Add fork choice verification tests * Tidy attn tests, remove dead code * Remove some accidentally added code * Fix clippy lint * Rename test file * Add block tests, add cheap block proposer check * Rename block testing file * Add observed_block_producers * Tidy * Switch around block signature verification * Finish block testing * Remove gossip from signature tests * First pass of self review * Fix deviation in spec * Update test spec tags * Start moving over to hashset * Finish moving observed attesters to hashmap * Move aggregation pool over to hashmap * Make fc attn borrow again * Fix rest_api compile error * Fix missing comments * Fix monster test * Uncomment increasing slots test * Address remaining comments * Remove unsafe, use cfg test * Remove cfg test flag * Fix dodgy comment * Revert "Update hashmap hashset to stable futures" This reverts commit d432378a3cc5cd67fc29c0b15b96b886c1323554. * Revert "Adds panic test to hashset delay" This reverts commit 281502396fc5b90d9c421a309c2c056982c9525b. * Ported attestation_service * Ported duties_service * Ported fork_service * More ports * Port block_service * Minor fixes * VC compiles * Update TODOS * Borrow self where possible * Ignore aggregates that are already known. * Unify aggregator modulo logic * Fix typo in logs * Refactor validator subscription logic * Avoid reproducing selection proof * Skip HTTP call if no subscriptions * Rename DutyAndState -> DutyAndProof * Tidy logs * Print root as dbg * Fix compile errors in tests * Fix compile error in test * Re-Fix attestation and duties service * Minor fixes Co-authored-by: Paul Hauner <paul@paulhauner.com> * Network crate update to stable futures * Port account_manager to stable futures (#1121) * Port account_manager to stable futures * Run async fns in tokio environment * Port rest_api crate to stable futures (#1118) * Port rest_api lib to stable futures * Reduce tokio features * Update notifier to stable futures * Builder update * Further updates * Convert self referential async functions * stable futures fixes (#1124) * Fix eth1 update functions * Fix genesis and client * Fix beacon node lib * Return appropriate runtimes from environment * Fix test rig * Refactor eth1 service update * Upgrade simulator to stable futures * Lighthouse compiles on stable futures * Remove println debugging statement * Update libp2p service, start rpc test upgrade * Update network crate for new libp2p * Update tokio::codec to futures_codec (#1128) * Further work towards RPC corrections * Correct http timeout and network service select * Use tokio runtime for libp2p * Revert "Update tokio::codec to futures_codec (#1128)" This reverts commit e57aea924acf5cbabdcea18895ac07e38a425ed7. * Upgrade RPC libp2p tests * Upgrade secio fallback test * Upgrade gossipsub examples * Clean up RPC protocol * Test fixes (#1133) * Correct websocket timeout and run on os thread * Fix network test * Clean up PR * Correct tokio tcp move attestation service tests * Upgrade attestation service tests * Correct network test * Correct genesis test * Test corrections * Log info when block is received * Modify logs and update attester service events * Stable futures: fixes to vc, eth1 and account manager (#1142) * Add local testnet scripts * Remove whiteblock script * Rename local testnet script * Move spawns onto handle * Fix VC panic * Initial fix to block production issue * Tidy block producer fix * Tidy further * Add local testnet clean script * Run cargo fmt * Tidy duties service * Tidy fork service * Tidy ForkService * Tidy AttestationService * Tidy notifier * Ensure await is not suppressed in eth1 * Ensure await is not suppressed in account_manager * Use .ok() instead of .unwrap_or(()) * RPC decoding test for proto * Update discv5 and eth2-libp2p deps * Fix lcli double runtime issue (#1144) * Handle stream termination and dialing peer errors * Correct peer_info variant types * Remove unnecessary warnings * Handle subnet unsubscription removal and improve logigng * Add logs around ping * Upgrade discv5 and improve logging * Handle peer connection status for multiple connections * Improve network service logging * Improve logging around peer manager * Upgrade swarm poll centralise peer management * Identify clients on error * Fix `remove_peer` in sync (#1150) * remove_peer removes from all chains * Remove logs * Fix early return from loop * Improved logging, fix panic * Partially correct tests * Stable futures: Vc sync (#1149) * Improve syncing heuristic * Add comments * Use safer method for tolerance * Fix tests * Stable futures: Fix VC bug, update agg pool, add more metrics (#1151) * Expose epoch processing summary * Expose participation metrics to prometheus * Switch to f64 * Reduce precision * Change precision * Expose observed attesters metrics * Add metrics for agg/unagg attn counts * Add metrics for gossip rx * Add metrics for gossip tx * Adds ignored attns to prom * Add attestation timing * Add timer for aggregation pool sig agg * Add write lock timer for agg pool * Add more metrics to agg pool * Change map lock code * Add extra metric to agg pool * Change lock handling in agg pool * Change .write() to .read() * Add another agg pool timer * Fix for is_aggregator * Fix pruning bug Co-authored-by: pawan <pawandhananjay@gmail.com> Co-authored-by: Paul Hauner <paul@paulhauner.com>
2020-05-17 11:16:48 +00:00
return Poll::Ready(ProtocolsHandlerEvent::OutboundSubstreamRequest {
protocol: SubstreamProtocol::new(
OutboundRequestContainer {
req: req.clone(),
fork_context: self.fork_context.clone(),
},
(),
)
.map_info(|()| (id, req)),
Stable futures (#879) * Port eth1 lib to use stable futures * Port eth1_test_rig to stable futures * Port eth1 tests to stable futures * Port genesis service to stable futures * Port genesis tests to stable futures * Port beacon_chain to stable futures * Port lcli to stable futures * Fix eth1_test_rig (#1014) * Fix lcli * Port timer to stable futures * Fix timer * Port websocket_server to stable futures * Port notifier to stable futures * Add TODOS * Update hashmap hashset to stable futures * Adds panic test to hashset delay * Port remote_beacon_node to stable futures * Fix lcli merge conflicts * Non rpc stuff compiles * protocol.rs compiles * Port websockets, timer and notifier to stable futures (#1035) * Fix lcli * Port timer to stable futures * Fix timer * Port websocket_server to stable futures * Port notifier to stable futures * Add TODOS * Port remote_beacon_node to stable futures * Partial eth2-libp2p stable future upgrade * Finished first round of fighting RPC types * Further progress towards porting eth2-libp2p adds caching to discovery * Update behaviour * RPC handler to stable futures * Update RPC to master libp2p * Network service additions * Fix the fallback transport construction (#1102) * Correct warning * Remove hashmap delay * Compiling version of eth2-libp2p * Update all crates versions * Fix conversion function and add tests (#1113) * Port validator_client to stable futures (#1114) * Add PH & MS slot clock changes * Account for genesis time * Add progress on duties refactor * Add simple is_aggregator bool to val subscription * Start work on attestation_verification.rs * Add progress on ObservedAttestations * Progress with ObservedAttestations * Fix tests * Add observed attestations to the beacon chain * Add attestation observation to processing code * Add progress on attestation verification * Add first draft of ObservedAttesters * Add more tests * Add observed attesters to beacon chain * Add observers to attestation processing * Add more attestation verification * Create ObservedAggregators map * Remove commented-out code * Add observed aggregators into chain * Add progress * Finish adding features to attestation verification * Ensure beacon chain compiles * Link attn verification into chain * Integrate new attn verification in chain * Remove old attestation processing code * Start trying to fix beacon_chain tests * Split adding into pools into two functions * Add aggregation to harness * Get test harness working again * Adjust the number of aggregators for test harness * Fix edge-case in harness * Integrate new attn processing in network * Fix compile bug in validator_client * Update validator API endpoints * Fix aggreagation in test harness * Fix enum thing * Fix attestation observation bug: * Patch failing API tests * Start adding comments to attestation verification * Remove unused attestation field * Unify "is block known" logic * Update comments * Supress fork choice errors for network processing * Add todos * Tidy * Add gossip attn tests * Disallow test harness to produce old attns * Comment out in-progress tests * Partially address pruning tests * Fix failing store test * Add aggregate tests * Add comments about which spec conditions we check * Dont re-aggregate * Split apart test harness attn production * Fix compile error in network * Make progress on commented-out test * Fix skipping attestation test * Add fork choice verification tests * Tidy attn tests, remove dead code * Remove some accidentally added code * Fix clippy lint * Rename test file * Add block tests, add cheap block proposer check * Rename block testing file * Add observed_block_producers * Tidy * Switch around block signature verification * Finish block testing * Remove gossip from signature tests * First pass of self review * Fix deviation in spec * Update test spec tags * Start moving over to hashset * Finish moving observed attesters to hashmap * Move aggregation pool over to hashmap * Make fc attn borrow again * Fix rest_api compile error * Fix missing comments * Fix monster test * Uncomment increasing slots test * Address remaining comments * Remove unsafe, use cfg test * Remove cfg test flag * Fix dodgy comment * Revert "Update hashmap hashset to stable futures" This reverts commit d432378a3cc5cd67fc29c0b15b96b886c1323554. * Revert "Adds panic test to hashset delay" This reverts commit 281502396fc5b90d9c421a309c2c056982c9525b. * Ported attestation_service * Ported duties_service * Ported fork_service * More ports * Port block_service * Minor fixes * VC compiles * Update TODOS * Borrow self where possible * Ignore aggregates that are already known. * Unify aggregator modulo logic * Fix typo in logs * Refactor validator subscription logic * Avoid reproducing selection proof * Skip HTTP call if no subscriptions * Rename DutyAndState -> DutyAndProof * Tidy logs * Print root as dbg * Fix compile errors in tests * Fix compile error in test * Re-Fix attestation and duties service * Minor fixes Co-authored-by: Paul Hauner <paul@paulhauner.com> * Network crate update to stable futures * Port account_manager to stable futures (#1121) * Port account_manager to stable futures * Run async fns in tokio environment * Port rest_api crate to stable futures (#1118) * Port rest_api lib to stable futures * Reduce tokio features * Update notifier to stable futures * Builder update * Further updates * Convert self referential async functions * stable futures fixes (#1124) * Fix eth1 update functions * Fix genesis and client * Fix beacon node lib * Return appropriate runtimes from environment * Fix test rig * Refactor eth1 service update * Upgrade simulator to stable futures * Lighthouse compiles on stable futures * Remove println debugging statement * Update libp2p service, start rpc test upgrade * Update network crate for new libp2p * Update tokio::codec to futures_codec (#1128) * Further work towards RPC corrections * Correct http timeout and network service select * Use tokio runtime for libp2p * Revert "Update tokio::codec to futures_codec (#1128)" This reverts commit e57aea924acf5cbabdcea18895ac07e38a425ed7. * Upgrade RPC libp2p tests * Upgrade secio fallback test * Upgrade gossipsub examples * Clean up RPC protocol * Test fixes (#1133) * Correct websocket timeout and run on os thread * Fix network test * Clean up PR * Correct tokio tcp move attestation service tests * Upgrade attestation service tests * Correct network test * Correct genesis test * Test corrections * Log info when block is received * Modify logs and update attester service events * Stable futures: fixes to vc, eth1 and account manager (#1142) * Add local testnet scripts * Remove whiteblock script * Rename local testnet script * Move spawns onto handle * Fix VC panic * Initial fix to block production issue * Tidy block producer fix * Tidy further * Add local testnet clean script * Run cargo fmt * Tidy duties service * Tidy fork service * Tidy ForkService * Tidy AttestationService * Tidy notifier * Ensure await is not suppressed in eth1 * Ensure await is not suppressed in account_manager * Use .ok() instead of .unwrap_or(()) * RPC decoding test for proto * Update discv5 and eth2-libp2p deps * Fix lcli double runtime issue (#1144) * Handle stream termination and dialing peer errors * Correct peer_info variant types * Remove unnecessary warnings * Handle subnet unsubscription removal and improve logigng * Add logs around ping * Upgrade discv5 and improve logging * Handle peer connection status for multiple connections * Improve network service logging * Improve logging around peer manager * Upgrade swarm poll centralise peer management * Identify clients on error * Fix `remove_peer` in sync (#1150) * remove_peer removes from all chains * Remove logs * Fix early return from loop * Improved logging, fix panic * Partially correct tests * Stable futures: Vc sync (#1149) * Improve syncing heuristic * Add comments * Use safer method for tolerance * Fix tests * Stable futures: Fix VC bug, update agg pool, add more metrics (#1151) * Expose epoch processing summary * Expose participation metrics to prometheus * Switch to f64 * Reduce precision * Change precision * Expose observed attesters metrics * Add metrics for agg/unagg attn counts * Add metrics for gossip rx * Add metrics for gossip tx * Adds ignored attns to prom * Add attestation timing * Add timer for aggregation pool sig agg * Add write lock timer for agg pool * Add more metrics to agg pool * Change map lock code * Add extra metric to agg pool * Change lock handling in agg pool * Change .write() to .read() * Add another agg pool timer * Fix for is_aggregator * Fix pruning bug Co-authored-by: pawan <pawandhananjay@gmail.com> Co-authored-by: Paul Hauner <paul@paulhauner.com>
2020-05-17 11:16:48 +00:00
});
}
// Check if we have completed sending a goodbye, disconnect.
if let HandlerState::ShuttingDown(_) = self.state {
if self.dial_queue.is_empty()
&& self.outbound_substreams.is_empty()
&& self.inbound_substreams.is_empty()
&& self.events_out.is_empty()
&& self.dial_negotiated == 0
{
return Poll::Ready(ProtocolsHandlerEvent::Close(RPCError::Disconnected));
}
}
Stable futures (#879) * Port eth1 lib to use stable futures * Port eth1_test_rig to stable futures * Port eth1 tests to stable futures * Port genesis service to stable futures * Port genesis tests to stable futures * Port beacon_chain to stable futures * Port lcli to stable futures * Fix eth1_test_rig (#1014) * Fix lcli * Port timer to stable futures * Fix timer * Port websocket_server to stable futures * Port notifier to stable futures * Add TODOS * Update hashmap hashset to stable futures * Adds panic test to hashset delay * Port remote_beacon_node to stable futures * Fix lcli merge conflicts * Non rpc stuff compiles * protocol.rs compiles * Port websockets, timer and notifier to stable futures (#1035) * Fix lcli * Port timer to stable futures * Fix timer * Port websocket_server to stable futures * Port notifier to stable futures * Add TODOS * Port remote_beacon_node to stable futures * Partial eth2-libp2p stable future upgrade * Finished first round of fighting RPC types * Further progress towards porting eth2-libp2p adds caching to discovery * Update behaviour * RPC handler to stable futures * Update RPC to master libp2p * Network service additions * Fix the fallback transport construction (#1102) * Correct warning * Remove hashmap delay * Compiling version of eth2-libp2p * Update all crates versions * Fix conversion function and add tests (#1113) * Port validator_client to stable futures (#1114) * Add PH & MS slot clock changes * Account for genesis time * Add progress on duties refactor * Add simple is_aggregator bool to val subscription * Start work on attestation_verification.rs * Add progress on ObservedAttestations * Progress with ObservedAttestations * Fix tests * Add observed attestations to the beacon chain * Add attestation observation to processing code * Add progress on attestation verification * Add first draft of ObservedAttesters * Add more tests * Add observed attesters to beacon chain * Add observers to attestation processing * Add more attestation verification * Create ObservedAggregators map * Remove commented-out code * Add observed aggregators into chain * Add progress * Finish adding features to attestation verification * Ensure beacon chain compiles * Link attn verification into chain * Integrate new attn verification in chain * Remove old attestation processing code * Start trying to fix beacon_chain tests * Split adding into pools into two functions * Add aggregation to harness * Get test harness working again * Adjust the number of aggregators for test harness * Fix edge-case in harness * Integrate new attn processing in network * Fix compile bug in validator_client * Update validator API endpoints * Fix aggreagation in test harness * Fix enum thing * Fix attestation observation bug: * Patch failing API tests * Start adding comments to attestation verification * Remove unused attestation field * Unify "is block known" logic * Update comments * Supress fork choice errors for network processing * Add todos * Tidy * Add gossip attn tests * Disallow test harness to produce old attns * Comment out in-progress tests * Partially address pruning tests * Fix failing store test * Add aggregate tests * Add comments about which spec conditions we check * Dont re-aggregate * Split apart test harness attn production * Fix compile error in network * Make progress on commented-out test * Fix skipping attestation test * Add fork choice verification tests * Tidy attn tests, remove dead code * Remove some accidentally added code * Fix clippy lint * Rename test file * Add block tests, add cheap block proposer check * Rename block testing file * Add observed_block_producers * Tidy * Switch around block signature verification * Finish block testing * Remove gossip from signature tests * First pass of self review * Fix deviation in spec * Update test spec tags * Start moving over to hashset * Finish moving observed attesters to hashmap * Move aggregation pool over to hashmap * Make fc attn borrow again * Fix rest_api compile error * Fix missing comments * Fix monster test * Uncomment increasing slots test * Address remaining comments * Remove unsafe, use cfg test * Remove cfg test flag * Fix dodgy comment * Revert "Update hashmap hashset to stable futures" This reverts commit d432378a3cc5cd67fc29c0b15b96b886c1323554. * Revert "Adds panic test to hashset delay" This reverts commit 281502396fc5b90d9c421a309c2c056982c9525b. * Ported attestation_service * Ported duties_service * Ported fork_service * More ports * Port block_service * Minor fixes * VC compiles * Update TODOS * Borrow self where possible * Ignore aggregates that are already known. * Unify aggregator modulo logic * Fix typo in logs * Refactor validator subscription logic * Avoid reproducing selection proof * Skip HTTP call if no subscriptions * Rename DutyAndState -> DutyAndProof * Tidy logs * Print root as dbg * Fix compile errors in tests * Fix compile error in test * Re-Fix attestation and duties service * Minor fixes Co-authored-by: Paul Hauner <paul@paulhauner.com> * Network crate update to stable futures * Port account_manager to stable futures (#1121) * Port account_manager to stable futures * Run async fns in tokio environment * Port rest_api crate to stable futures (#1118) * Port rest_api lib to stable futures * Reduce tokio features * Update notifier to stable futures * Builder update * Further updates * Convert self referential async functions * stable futures fixes (#1124) * Fix eth1 update functions * Fix genesis and client * Fix beacon node lib * Return appropriate runtimes from environment * Fix test rig * Refactor eth1 service update * Upgrade simulator to stable futures * Lighthouse compiles on stable futures * Remove println debugging statement * Update libp2p service, start rpc test upgrade * Update network crate for new libp2p * Update tokio::codec to futures_codec (#1128) * Further work towards RPC corrections * Correct http timeout and network service select * Use tokio runtime for libp2p * Revert "Update tokio::codec to futures_codec (#1128)" This reverts commit e57aea924acf5cbabdcea18895ac07e38a425ed7. * Upgrade RPC libp2p tests * Upgrade secio fallback test * Upgrade gossipsub examples * Clean up RPC protocol * Test fixes (#1133) * Correct websocket timeout and run on os thread * Fix network test * Clean up PR * Correct tokio tcp move attestation service tests * Upgrade attestation service tests * Correct network test * Correct genesis test * Test corrections * Log info when block is received * Modify logs and update attester service events * Stable futures: fixes to vc, eth1 and account manager (#1142) * Add local testnet scripts * Remove whiteblock script * Rename local testnet script * Move spawns onto handle * Fix VC panic * Initial fix to block production issue * Tidy block producer fix * Tidy further * Add local testnet clean script * Run cargo fmt * Tidy duties service * Tidy fork service * Tidy ForkService * Tidy AttestationService * Tidy notifier * Ensure await is not suppressed in eth1 * Ensure await is not suppressed in account_manager * Use .ok() instead of .unwrap_or(()) * RPC decoding test for proto * Update discv5 and eth2-libp2p deps * Fix lcli double runtime issue (#1144) * Handle stream termination and dialing peer errors * Correct peer_info variant types * Remove unnecessary warnings * Handle subnet unsubscription removal and improve logigng * Add logs around ping * Upgrade discv5 and improve logging * Handle peer connection status for multiple connections * Improve network service logging * Improve logging around peer manager * Upgrade swarm poll centralise peer management * Identify clients on error * Fix `remove_peer` in sync (#1150) * remove_peer removes from all chains * Remove logs * Fix early return from loop * Improved logging, fix panic * Partially correct tests * Stable futures: Vc sync (#1149) * Improve syncing heuristic * Add comments * Use safer method for tolerance * Fix tests * Stable futures: Fix VC bug, update agg pool, add more metrics (#1151) * Expose epoch processing summary * Expose participation metrics to prometheus * Switch to f64 * Reduce precision * Change precision * Expose observed attesters metrics * Add metrics for agg/unagg attn counts * Add metrics for gossip rx * Add metrics for gossip tx * Adds ignored attns to prom * Add attestation timing * Add timer for aggregation pool sig agg * Add write lock timer for agg pool * Add more metrics to agg pool * Change map lock code * Add extra metric to agg pool * Change lock handling in agg pool * Change .write() to .read() * Add another agg pool timer * Fix for is_aggregator * Fix pruning bug Co-authored-by: pawan <pawandhananjay@gmail.com> Co-authored-by: Paul Hauner <paul@paulhauner.com>
2020-05-17 11:16:48 +00:00
Poll::Pending
}
}
Testnet compatible network upgrade (#587) * Create libp2p instance * Change logger to stdlog * test_connection initial commit * Add gossipsub test * Delete tests in network crate * Add test module * Clean tests * Remove dependency on discovery * Working publish between 2 nodes TODO: Publish should be called just once * Working 2 peer gossipsub test with additional events * Cleanup test * Add rpc test * Star topology discovery WIP * build_nodes builds and connects n nodes. Increase nodes in gossipsub test * Add unsubscribe method and expose reference to gossipsub object for gossipsub tests * Add gossipsub message forwarding test * Fix gossipsub forward test * Test improvements * Remove discovery tests * Simplify gossipsub forward test topology * Add helper functions for topology building * Clean up tests * Update naming to new network spec * Correct ssz encoding of protocol names * Further additions to network upgrade * Initial network spec update WIP * Temp commit * Builds one side of the streamed RPC responses * Temporary commit * Propagates streaming changes up into message handler * Intermediate network update * Partial update in upgrading to the new network spec * Update dependencies, remove redundant deps * Correct sync manager for block stream handling * Re-write of RPC handler, improves efficiency and corrects bugs * Stream termination update * Completed refactor of rpc handler * Remove crates * Correct compile issues associated with test merge * Build basic tests and testing structure for eth2-libp2p * Enhance RPC tests and add logging * Complete RPC testing framework and STATUS test * Decoding bug fixes, log improvements, stream test * Clean up RPC handler logging * Decoder bug fix, empty block stream test * Add BlocksByRoot RPC test * Add Goodbye RPC test * Syncing and stream handling bug fixes and performance improvements * Applies discv5 bug fixes * Adds DHT IP filtering for lighthouse - currently disabled * Adds randomized network propagation as a CLI arg * Add disconnect functionality * Adds attestation handling and parent lookup * Adds RPC error handling for the sync manager * Allow parent's blocks to be already processed * Update github workflow * Adds reviewer suggestions
2019-11-27 01:47:46 +00:00
impl slog::Value for SubstreamId {
fn serialize(
&self,
record: &slog::Record,
key: slog::Key,
serializer: &mut dyn slog::Serializer,
) -> slog::Result {
slog::Value::serialize(&self.0, record, key, serializer)
}
}
Investigate and correct RPC Response Timeouts (#2804) RPC Responses are for some reason not removing their timeout when they are completing. As an example: ``` Nov 09 01:18:20.256 DEBG Received BlocksByRange Request step: 1, start_slot: 728465, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:20.263 DEBG Received BlocksByRange Request step: 1, start_slot: 728593, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:20.483 DEBG BlocksByRange Response sent returned: 63, requested: 64, current_slot: 2466389, start_slot: 728465, msg: Failed to return all requested blocks, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:20.500 DEBG BlocksByRange Response sent returned: 64, requested: 64, current_slot: 2466389, start_slot: 728593, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:21.068 DEBG Received BlocksByRange Request step: 1, start_slot: 728529, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:21.272 DEBG BlocksByRange Response sent returned: 63, requested: 64, current_slot: 2466389, start_slot: 728529, msg: Failed to return all requested blocks, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:23.434 DEBG Received BlocksByRange Request step: 1, start_slot: 728657, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:23.665 DEBG BlocksByRange Response sent returned: 64, requested: 64, current_slot: 2466390, start_slot: 728657, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:25.851 DEBG Received BlocksByRange Request step: 1, start_slot: 728337, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:25.851 DEBG Received BlocksByRange Request step: 1, start_slot: 728401, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:26.094 DEBG BlocksByRange Response sent returned: 62, requested: 64, current_slot: 2466390, start_slot: 728401, msg: Failed to return all requested blocks, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:26.100 DEBG BlocksByRange Response sent returned: 63, requested: 64, current_slot: 2466390, start_slot: 728337, msg: Failed to return all requested blocks, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:31.070 DEBG RPC Error direction: Incoming, score: 0, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, client: Prysm: version: a80b1c252a9b4773493b41999769bf3134ac373f, os_version: unknown, err: Stream Timeout, protocol: beacon_blocks_by_range, service: libp2p Nov 09 01:18:31.070 WARN Timed out to a peer's request. Likely insufficient resources, reduce peer count, service: libp2p Nov 09 01:18:31.085 DEBG RPC Error direction: Incoming, score: 0, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, client: Prysm: version: a80b1c252a9b4773493b41999769bf3134ac373f, os_version: unknown, err: Stream Timeout, protocol: beacon_blocks_by_range, service: libp2p Nov 09 01:18:31.085 WARN Timed out to a peer's request. Likely insufficient resources, reduce peer count, service: libp2p Nov 09 01:18:31.459 DEBG RPC Error direction: Incoming, score: 0, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, client: Prysm: version: a80b1c252a9b4773493b41999769bf3134ac373f, os_version: unknown, err: Stream Timeout, protocol: beacon_blocks_by_range, service: libp2p Nov 09 01:18:31.459 WARN Timed out to a peer's request. Likely insufficient resources, reduce peer count, service: libp2p Nov 09 01:18:34.129 DEBG RPC Error direction: Incoming, score: 0, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, client: Prysm: version: a80b1c252a9b4773493b41999769bf3134ac373f, os_version: unknown, err: Stream Timeout, protocol: beacon_blocks_by_range, service: libp2p Nov 09 01:18:34.130 WARN Timed out to a peer's request. Likely insufficient resources, reduce peer count, service: libp2p Nov 09 01:18:35.686 DEBG Peer Manager disconnecting peer reason: Too many peers, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, service: libp2p ``` This PR is to investigate and correct the issue. ~~My current thoughts are that for some reason we are not closing the streams correctly, or fast enough, or the executor is not registering the closes and waking up.~~ - Pretty sure this is not the case, see message below for a more accurate reason. ~~I've currently added a timeout to stream closures in an attempt to force streams to close and the future to always complete.~~ I removed this
2021-11-16 03:42:25 +00:00
/// Creates a future that can be polled that will send any queued message to the peer.
///
/// This function returns the given substream, along with whether it has been closed or not. Any
/// error that occurred with sending a message is reported also.
async fn send_message_to_inbound_substream<TSpec: EthSpec>(
mut substream: InboundSubstream<TSpec>,
Investigate and correct RPC Response Timeouts (#2804) RPC Responses are for some reason not removing their timeout when they are completing. As an example: ``` Nov 09 01:18:20.256 DEBG Received BlocksByRange Request step: 1, start_slot: 728465, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:20.263 DEBG Received BlocksByRange Request step: 1, start_slot: 728593, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:20.483 DEBG BlocksByRange Response sent returned: 63, requested: 64, current_slot: 2466389, start_slot: 728465, msg: Failed to return all requested blocks, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:20.500 DEBG BlocksByRange Response sent returned: 64, requested: 64, current_slot: 2466389, start_slot: 728593, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:21.068 DEBG Received BlocksByRange Request step: 1, start_slot: 728529, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:21.272 DEBG BlocksByRange Response sent returned: 63, requested: 64, current_slot: 2466389, start_slot: 728529, msg: Failed to return all requested blocks, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:23.434 DEBG Received BlocksByRange Request step: 1, start_slot: 728657, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:23.665 DEBG BlocksByRange Response sent returned: 64, requested: 64, current_slot: 2466390, start_slot: 728657, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:25.851 DEBG Received BlocksByRange Request step: 1, start_slot: 728337, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:25.851 DEBG Received BlocksByRange Request step: 1, start_slot: 728401, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:26.094 DEBG BlocksByRange Response sent returned: 62, requested: 64, current_slot: 2466390, start_slot: 728401, msg: Failed to return all requested blocks, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:26.100 DEBG BlocksByRange Response sent returned: 63, requested: 64, current_slot: 2466390, start_slot: 728337, msg: Failed to return all requested blocks, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:31.070 DEBG RPC Error direction: Incoming, score: 0, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, client: Prysm: version: a80b1c252a9b4773493b41999769bf3134ac373f, os_version: unknown, err: Stream Timeout, protocol: beacon_blocks_by_range, service: libp2p Nov 09 01:18:31.070 WARN Timed out to a peer's request. Likely insufficient resources, reduce peer count, service: libp2p Nov 09 01:18:31.085 DEBG RPC Error direction: Incoming, score: 0, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, client: Prysm: version: a80b1c252a9b4773493b41999769bf3134ac373f, os_version: unknown, err: Stream Timeout, protocol: beacon_blocks_by_range, service: libp2p Nov 09 01:18:31.085 WARN Timed out to a peer's request. Likely insufficient resources, reduce peer count, service: libp2p Nov 09 01:18:31.459 DEBG RPC Error direction: Incoming, score: 0, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, client: Prysm: version: a80b1c252a9b4773493b41999769bf3134ac373f, os_version: unknown, err: Stream Timeout, protocol: beacon_blocks_by_range, service: libp2p Nov 09 01:18:31.459 WARN Timed out to a peer's request. Likely insufficient resources, reduce peer count, service: libp2p Nov 09 01:18:34.129 DEBG RPC Error direction: Incoming, score: 0, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, client: Prysm: version: a80b1c252a9b4773493b41999769bf3134ac373f, os_version: unknown, err: Stream Timeout, protocol: beacon_blocks_by_range, service: libp2p Nov 09 01:18:34.130 WARN Timed out to a peer's request. Likely insufficient resources, reduce peer count, service: libp2p Nov 09 01:18:35.686 DEBG Peer Manager disconnecting peer reason: Too many peers, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, service: libp2p ``` This PR is to investigate and correct the issue. ~~My current thoughts are that for some reason we are not closing the streams correctly, or fast enough, or the executor is not registering the closes and waking up.~~ - Pretty sure this is not the case, see message below for a more accurate reason. ~~I've currently added a timeout to stream closures in an attempt to force streams to close and the future to always complete.~~ I removed this
2021-11-16 03:42:25 +00:00
message: RPCCodedResponse<TSpec>,
last_chunk: bool,
) -> Result<(InboundSubstream<TSpec>, bool), RPCError> {
if matches!(message, RPCCodedResponse::StreamTermination(_)) {
substream.close().await.map(|_| (substream, true))
} else {
// chunks that are not stream terminations get sent, and the stream is closed if
// the response is an error
let is_error = matches!(message, RPCCodedResponse::Error(..));
let send_result = substream.send(message).await;
// If we need to close the substream, do so and return the result.
if last_chunk || is_error || send_result.is_err() {
let close_result = substream.close().await.map(|_| (substream, true));
// If there was an error in sending, return this error, otherwise, return the
// result of closing the substream.
if let Err(e) = send_result {
return Err(e);
} else {
Investigate and correct RPC Response Timeouts (#2804) RPC Responses are for some reason not removing their timeout when they are completing. As an example: ``` Nov 09 01:18:20.256 DEBG Received BlocksByRange Request step: 1, start_slot: 728465, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:20.263 DEBG Received BlocksByRange Request step: 1, start_slot: 728593, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:20.483 DEBG BlocksByRange Response sent returned: 63, requested: 64, current_slot: 2466389, start_slot: 728465, msg: Failed to return all requested blocks, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:20.500 DEBG BlocksByRange Response sent returned: 64, requested: 64, current_slot: 2466389, start_slot: 728593, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:21.068 DEBG Received BlocksByRange Request step: 1, start_slot: 728529, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:21.272 DEBG BlocksByRange Response sent returned: 63, requested: 64, current_slot: 2466389, start_slot: 728529, msg: Failed to return all requested blocks, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:23.434 DEBG Received BlocksByRange Request step: 1, start_slot: 728657, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:23.665 DEBG BlocksByRange Response sent returned: 64, requested: 64, current_slot: 2466390, start_slot: 728657, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:25.851 DEBG Received BlocksByRange Request step: 1, start_slot: 728337, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:25.851 DEBG Received BlocksByRange Request step: 1, start_slot: 728401, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:26.094 DEBG BlocksByRange Response sent returned: 62, requested: 64, current_slot: 2466390, start_slot: 728401, msg: Failed to return all requested blocks, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:26.100 DEBG BlocksByRange Response sent returned: 63, requested: 64, current_slot: 2466390, start_slot: 728337, msg: Failed to return all requested blocks, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:31.070 DEBG RPC Error direction: Incoming, score: 0, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, client: Prysm: version: a80b1c252a9b4773493b41999769bf3134ac373f, os_version: unknown, err: Stream Timeout, protocol: beacon_blocks_by_range, service: libp2p Nov 09 01:18:31.070 WARN Timed out to a peer's request. Likely insufficient resources, reduce peer count, service: libp2p Nov 09 01:18:31.085 DEBG RPC Error direction: Incoming, score: 0, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, client: Prysm: version: a80b1c252a9b4773493b41999769bf3134ac373f, os_version: unknown, err: Stream Timeout, protocol: beacon_blocks_by_range, service: libp2p Nov 09 01:18:31.085 WARN Timed out to a peer's request. Likely insufficient resources, reduce peer count, service: libp2p Nov 09 01:18:31.459 DEBG RPC Error direction: Incoming, score: 0, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, client: Prysm: version: a80b1c252a9b4773493b41999769bf3134ac373f, os_version: unknown, err: Stream Timeout, protocol: beacon_blocks_by_range, service: libp2p Nov 09 01:18:31.459 WARN Timed out to a peer's request. Likely insufficient resources, reduce peer count, service: libp2p Nov 09 01:18:34.129 DEBG RPC Error direction: Incoming, score: 0, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, client: Prysm: version: a80b1c252a9b4773493b41999769bf3134ac373f, os_version: unknown, err: Stream Timeout, protocol: beacon_blocks_by_range, service: libp2p Nov 09 01:18:34.130 WARN Timed out to a peer's request. Likely insufficient resources, reduce peer count, service: libp2p Nov 09 01:18:35.686 DEBG Peer Manager disconnecting peer reason: Too many peers, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, service: libp2p ``` This PR is to investigate and correct the issue. ~~My current thoughts are that for some reason we are not closing the streams correctly, or fast enough, or the executor is not registering the closes and waking up.~~ - Pretty sure this is not the case, see message below for a more accurate reason. ~~I've currently added a timeout to stream closures in an attempt to force streams to close and the future to always complete.~~ I removed this
2021-11-16 03:42:25 +00:00
return close_result;
}
}
Investigate and correct RPC Response Timeouts (#2804) RPC Responses are for some reason not removing their timeout when they are completing. As an example: ``` Nov 09 01:18:20.256 DEBG Received BlocksByRange Request step: 1, start_slot: 728465, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:20.263 DEBG Received BlocksByRange Request step: 1, start_slot: 728593, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:20.483 DEBG BlocksByRange Response sent returned: 63, requested: 64, current_slot: 2466389, start_slot: 728465, msg: Failed to return all requested blocks, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:20.500 DEBG BlocksByRange Response sent returned: 64, requested: 64, current_slot: 2466389, start_slot: 728593, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:21.068 DEBG Received BlocksByRange Request step: 1, start_slot: 728529, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:21.272 DEBG BlocksByRange Response sent returned: 63, requested: 64, current_slot: 2466389, start_slot: 728529, msg: Failed to return all requested blocks, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:23.434 DEBG Received BlocksByRange Request step: 1, start_slot: 728657, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:23.665 DEBG BlocksByRange Response sent returned: 64, requested: 64, current_slot: 2466390, start_slot: 728657, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:25.851 DEBG Received BlocksByRange Request step: 1, start_slot: 728337, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:25.851 DEBG Received BlocksByRange Request step: 1, start_slot: 728401, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:26.094 DEBG BlocksByRange Response sent returned: 62, requested: 64, current_slot: 2466390, start_slot: 728401, msg: Failed to return all requested blocks, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:26.100 DEBG BlocksByRange Response sent returned: 63, requested: 64, current_slot: 2466390, start_slot: 728337, msg: Failed to return all requested blocks, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:31.070 DEBG RPC Error direction: Incoming, score: 0, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, client: Prysm: version: a80b1c252a9b4773493b41999769bf3134ac373f, os_version: unknown, err: Stream Timeout, protocol: beacon_blocks_by_range, service: libp2p Nov 09 01:18:31.070 WARN Timed out to a peer's request. Likely insufficient resources, reduce peer count, service: libp2p Nov 09 01:18:31.085 DEBG RPC Error direction: Incoming, score: 0, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, client: Prysm: version: a80b1c252a9b4773493b41999769bf3134ac373f, os_version: unknown, err: Stream Timeout, protocol: beacon_blocks_by_range, service: libp2p Nov 09 01:18:31.085 WARN Timed out to a peer's request. Likely insufficient resources, reduce peer count, service: libp2p Nov 09 01:18:31.459 DEBG RPC Error direction: Incoming, score: 0, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, client: Prysm: version: a80b1c252a9b4773493b41999769bf3134ac373f, os_version: unknown, err: Stream Timeout, protocol: beacon_blocks_by_range, service: libp2p Nov 09 01:18:31.459 WARN Timed out to a peer's request. Likely insufficient resources, reduce peer count, service: libp2p Nov 09 01:18:34.129 DEBG RPC Error direction: Incoming, score: 0, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, client: Prysm: version: a80b1c252a9b4773493b41999769bf3134ac373f, os_version: unknown, err: Stream Timeout, protocol: beacon_blocks_by_range, service: libp2p Nov 09 01:18:34.130 WARN Timed out to a peer's request. Likely insufficient resources, reduce peer count, service: libp2p Nov 09 01:18:35.686 DEBG Peer Manager disconnecting peer reason: Too many peers, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, service: libp2p ``` This PR is to investigate and correct the issue. ~~My current thoughts are that for some reason we are not closing the streams correctly, or fast enough, or the executor is not registering the closes and waking up.~~ - Pretty sure this is not the case, see message below for a more accurate reason. ~~I've currently added a timeout to stream closures in an attempt to force streams to close and the future to always complete.~~ I removed this
2021-11-16 03:42:25 +00:00
// Everything worked as expected return the result.
send_result.map(|_| (substream, false))
}
}