lighthouse

Author	SHA1	Message	Date
Michael Sproul	98c8ac1a87	Fix typo in peer state transition log (#3224 ) ## Issue Addressed We were logging `out_finalized_epoch` instead of `our_finalized_epoch`. I noticed this ages ago but only just got around to fixing it. ## Additional Info I also reformatted the log line to respect the line length limit (`rustfmt` won't do it because it gets confused by the `;` in slog's log macros).	2022-05-31 06:09:10 +00:00
Divma	788b6af3c4	Remove sync await points (#3036 ) ## Issue Addressed Removes the await points in sync waiting for a processor response for rpc block processing. Built on top of #3029 This also handles a couple of bugs in the previous code and adds a relatively comprehensive test suite.	2022-03-23 01:09:39 +00:00
Paul Hauner	267d8babc8	Prepare proposer (#3043 ) ## Issue Addressed Resolves #2936 ## Proposed Changes Adds functionality for calling [`validator/prepare_beacon_proposer`](https://ethereum.github.io/beacon-APIs/?urls.primaryName=dev#/Validator/prepareBeaconProposer) in advance. There is a `BeaconChain::prepare_beacon_proposer` method which, when called, computes the proposer for the next slot. If that proposer has been registered via the `validator/prepare_beacon_proposer` API method, then the `beacon_chain.execution_layer` will be provided the `PayloadAttributes` for us in all future forkchoiceUpdated calls. An artificial forkchoiceUpdated call will be created 4s before each slot, when the head updates and when a validator updates their information. Additionally, I added strict ordering for calls from the `BeaconChain` to the `ExecutionLayer`. I'm not certain the `ExecutionLayer` will always maintain this ordering, but it's a good start to have consistency from the `BeaconChain`. There are some deadlock opportunities introduced, they are documented in the code. ## Additional Info - ~~Blocked on #2837~~ Co-authored-by: realbigsean <seananderson33@GMAIL.com>	2022-03-09 00:42:05 +00:00
Divma	4bf1af4e85	Custom RPC request management for sync (#3029 ) ## Proposed Changes Make `lighthouse_network` generic over request ids, now usable by sync	2022-03-02 22:07:17 +00:00
Divma	9964f5afe5	Document why we hash downloaded blocks for both sync algs (#2927 ) ## Proposed Changes Initially the idea was to remove hashing of blocks in backfill sync. After considering it more, we conclude that we need to do it in both (forward and backfill) anyway. But since we forgot why we were doing it in the first place, this PR documents this logic. Future us should find it useful Co-authored-by: Divma <26765164+divagant-martian@users.noreply.github.com>	2022-01-26 23:14:22 +00:00
Paul Hauner	aaa5344eab	Add peer score adjustment msgs (#2901 ) ## Issue Addressed N/A ## Proposed Changes This PR adds the `msg` field to `Peer score adjusted` log messages. These `msg` fields help identify why a peer was banned. Example: ``` Jan 11 04:18:48.096 DEBG Peer score adjusted score: -100.00, peer_id: 16Uiu2HAmQskxKWWGYfginwZ51n5uDbhvjHYnvASK7PZ5gBdLmzWj, msg: attn_unknown_head, service: libp2p Jan 11 04:18:48.096 DEBG Peer score adjusted score: -27.86, peer_id: 16Uiu2HAmA7cCb3MemVDbK3MHZoSb7VN3cFUG3vuSZgnGesuVhPDE, msg: sync_past_slot, service: libp2p Jan 11 04:18:48.096 DEBG Peer score adjusted score: -100.00, peer_id: 16Uiu2HAmQskxKWWGYfginwZ51n5uDbhvjHYnvASK7PZ5gBdLmzWj, msg: attn_unknown_head, service: libp2p Jan 11 04:18:48.096 DEBG Peer score adjusted score: -28.86, peer_id: 16Uiu2HAmA7cCb3MemVDbK3MHZoSb7VN3cFUG3vuSZgnGesuVhPDE, msg: sync_past_slot, service: libp2p Jan 11 04:18:48.096 DEBG Peer score adjusted score: -29.86, peer_id: 16Uiu2HAmA7cCb3MemVDbK3MHZoSb7VN3cFUG3vuSZgnGesuVhPDE, msg: sync_past_slot, service: libp2p ``` There is also a `libp2p_report_peer_msgs_total` metric which allows us to see the count of reports per `msg` tag. ## Additional Info NA	2022-01-12 05:32:14 +00:00
Paul Hauner	4848e53155	Avoid peer penalties on internal errors for batch block import (#2898 ) ## Issue Addressed NA ## Proposed Changes I've observed some Prater nodes (and potentially some mainnet nodes) banning peers due to validator pubkey cache lock timeouts. For the `BeaconChainError`-type of errors, they're caused by internal faults and we can't necessarily tell if the peer is bad or not. I think this is causing us to ban peers unnecessarily when running on under-resourced machines. ## Additional Info NA	2022-01-11 05:33:28 +00:00
Michael Sproul	2c07a72980	Revert peer DB changes from #2724 (#2828 ) ## Proposed Changes This reverts commit `53562010ec` from PR #2724 Hopefully this will restore the reliability of the sync simulator.	2021-11-25 03:45:52 +00:00
Divma	53562010ec	Move peer db writes to eth2 libp2p (#2724 ) ## Issue Addressed Part of a bigger effort to make the network globals read only. This moves all writes to the `PeerDB` to the `eth2_libp2p` crate. Limiting writes to the peer manager is a slightly more complicated issue for a next PR, to keep things reviewable. ## Proposed Changes - Make the peers field in the globals a private field. - Allow mutable access to the peers field to `eth2_libp2p` for now. - Add a new network message to update the sync state. Co-authored-by: Age Manning <Age@AgeManning.com>	2021-11-19 04:42:31 +00:00
Age Manning	df40700ddd	Rename eth2_libp2p to lighthouse_network (#2702 ) ## Description The `eth2_libp2p` crate was originally named and designed to incorporate a simple libp2p integration into lighthouse. Since its origins the crate's purpose has expanded dramatically. It now houses a lot more sophistication that is specific to lighthouse and no longer just a libp2p integration. As of this writing it currently houses the following high-level lighthouse-specific logic: - Lighthouse's implementation of the eth2 RPC protocol and specific encodings/decodings - Integration and handling of ENRs with respect to libp2p and eth2 - Lighthouse's discovery logic, its integration with discv5 and logic about searching and handling peers. - Lighthouse's peer manager - This is a large module handling various aspects of Lighthouse's network, such as peer scoring, handling pings and metadata, connection maintenance and recording, etc. - Lighthouse's peer database - This is a collection of information stored for each individual peer which is specific to lighthouse. We store connection state, sync state, last seen ips and scores etc. The data stored for each peer is designed for various elements of the lighthouse code base such as syncing and the http api. - Gossipsub scoring - This stores a collection of gossipsub 1.1 scoring mechanisms that are continuously analysed and updated based on the ethereum 2 networks and how Lighthouse performs on these networks. - Lighthouse specific types for managing gossipsub topics, sync status and ENR fields - Lighthouse's network HTTP API metrics - A collection of metrics for lighthouse network monitoring - Lighthouse's custom configuration of all networking protocols, RPC, gossipsub, discovery, identify and libp2p. Therefore it makes sense to rename the crate to be more akin to its current purposes, simply that it manages the majority of Lighthouse's network stack. 
This PR renames this crate to `lighthouse_network` Co-authored-by: Paul Hauner <paul@paulhauner.com>	2021-10-19 00:30:39 +00:00
Age Manning	0aee7ec873	Refactor Peerdb and PeerManager (#2660 ) ## Proposed Changes This is a refactor of the PeerDB and PeerManager. A number of bugs have been surfacing around the connection state of peers and their interaction with the score state. This refactor tightens the mutability properties of peers such that only specific modules are able to modify the state of peer information, preventing inadvertent state changes that can lead to our local peer manager db being out of sync with libp2p. Further, the logic around connection and scoring was quite convoluted and the distinction between the PeerManager and Peerdb was not well defined. Although these issues are not fully resolved, this PR is a step towards cleaning up this logic. The peerdb solely manages most mutability operations of peers, leaving high-order logic to the peer manager. A single `update_connection_state()` function has been added to the peer-db, making it solely responsible for modifying the peer's connection state. The ways the peer's scores can be modified have been reduced to three simple functions (`update_scores()`, `update_gossipsub_scores()` and `report_peer()`). This prevents any ad-hoc modifications of scores; only natural processes of score modification are allowed, which simplifies the reasoning about score and state changes.	2021-10-11 02:45:06 +00:00
Age Manning	29a8865d07	Consistent tracking of disconnected peers (#2650 ) ## Issue Addressed N/A ## Proposed Changes When peers switch to a disconnecting state, decrement the disconnected peers counter. This also downgrades some crit logs to errors. I've also added a re-sync point: when peers get unbanned, the disconnected peer count is matched back to the number of disconnected peers if it has previously gone out of sync.	2021-09-30 04:31:43 +00:00
Mac L	4c510f8f6b	Add `BlockTimesCache` to allow additional block delay metrics (#2546 ) ## Issue Addressed Closes #2528 ## Proposed Changes - Add `BlockTimesCache` to provide block timing information to `BeaconChain`. This allows additional metrics to be calculated for blocks that are set as head too late. - Thread the `seen_timestamp` of blocks received from RPC responses (except blocks from syncing) through to the sync manager, similar to what is done for blocks from gossip. ## Additional Info This provides the following additional metrics: - `BEACON_BLOCK_OBSERVED_SLOT_START_DELAY_TIME` - The delay between the start of the slot and when the block was first observed. - `BEACON_BLOCK_IMPORTED_OBSERVED_DELAY_TIME` - The delay between when the block was first observed and when the block was imported. - `BEACON_BLOCK_HEAD_IMPORTED_DELAY_TIME` - The delay between when the block was imported and when the block was set as head. The metric `BEACON_BLOCK_IMPORTED_SLOT_START_DELAY_TIME` was removed. A log is produced when a block is set as head too late, e.g.: ``` Aug 27 03:46:39.006 DEBG Delayed head block set_as_head_delay: Some(21.731066ms), imported_delay: Some(119.929934ms), observed_delay: Some(3.864596988s), block_delay: 4.006257988s, slot: 1931331, proposer_index: 24294, block_root: 0x937602c89d3143afa89088a44bdf4b4d0d760dad082abacb229495c048648a9e, service: beacon ```	2021-09-30 04:31:41 +00:00
Michael Sproul	9667dc2f03	Implement checkpoint sync (#2244 ) ## Issue Addressed Closes #1891 Closes #1784 ## Proposed Changes Implement checkpoint sync for Lighthouse, enabling it to start from a weak subjectivity checkpoint. ## Additional Info - [x] Return unavailable status for out-of-range blocks requested by peers (#2561) - [x] Implement sync daemon for fetching historical blocks (#2561) - [x] Verify chain hashes (either in `historical_blocks.rs` or the calling module) - [x] Consistency check for initial block + state - [x] Fetch the initial state and block from a beacon node HTTP endpoint - [x] Don't crash fetching beacon states by slot from the API - [x] Background service for state reconstruction, triggered by CLI flag or API call. Considered out of scope for this PR: - Drop the requirement to provide the `--checkpoint-block` (this would require some pretty heavy refactoring of block verification) Co-authored-by: Diva M <divma@protonmail.com>	2021-09-22 00:37:28 +00:00
Michael Sproul	b4689e20c6	Altair consensus changes and refactors (#2279 ) ## Proposed Changes Implement the consensus changes necessary for the upcoming Altair hard fork. ## Additional Info This is quite a heavy refactor, with pivotal types like the `BeaconState` and `BeaconBlock` changing from structs to enums. This ripples through the whole codebase with field accesses changing to methods, e.g. `state.slot` => `state.slot()`. Co-authored-by: realbigsean <seananderson33@gmail.com>	2021-07-09 06:15:32 +00:00
Paul Hauner	93100f221f	Make less logs for attn with unknown head (#2395 ) ## Issue Addressed NA ## Proposed Changes I am starting to see a lot of slog-async overflows (i.e., too many logs) on Prater whenever we see attestations for an unknown block. Since these logs are identical (except for peer id) and we expose volume/count of these errors via `metrics::GOSSIP_ATTESTATION_ERRORS_PER_TYPE`, I took the following actions to remove them from `DEBUG` logs: - Push the "Attestation for unknown block" log to trace. - Add a debug log in `search_for_block`. In effect, this should serve as a de-duped version of the previous, downgraded log. ## Additional Info TBC	2021-06-07 02:34:09 +00:00
Paul Hauner	a764c3b247	Handle early blocks (#2155 ) ## Issue Addressed NA ## Problem this PR addresses There's an issue where Lighthouse is banning a lot of peers due to the following sequence of events: 1. Gossip block 0xabc arrives ~200ms early - It is propagated across the network, with respect to [`MAXIMUM_GOSSIP_CLOCK_DISPARITY`](https://github.com/ethereum/eth2.0-specs/blob/v1.0.0/specs/phase0/p2p-interface.md#why-is-there-maximum_gossip_clock_disparity-when-validating-slot-ranges-of-messages-in-gossip-subnets). - However, it is not imported to our database since the block is early. 2. Attestations for 0xabc arrive, but the block was not imported. - The peer that sent the attestation is down-voted. - Each unknown-block attestation causes a score loss of 1, the peer is banned at -100. - When the peer is on an attestation subnet there can be hundreds of attestations, so the peer is banned quickly (before the missed block can be obtained via rpc). ## Potential solutions I can think of three solutions to this: 1. Wait for attestation-queuing (#635) to arrive and solve this. - Easy - Not immediate fix. - Whilst this would work, I don't think it's a perfect solution for this particular issue, rather (3) is better. 1. Allow importing blocks with a tolerance of `MAXIMUM_GOSSIP_CLOCK_DISPARITY`. - Easy - ~~I have implemented this, for now.~~ 1. If a block is verified for gossip propagation (i.e., signature verified) and it's within `MAXIMUM_GOSSIP_CLOCK_DISPARITY`, then queue it to be processed at the start of the appropriate slot. - More difficult - Feels like the best solution, I will try to implement this. This PR takes approach (3). ## Changes included - Implement the `block_delay_queue`, based upon a [`DelayQueue`](https://docs.rs/tokio-util/0.6.3/tokio_util/time/delay_queue/struct.DelayQueue.html) which can store blocks until it's time to import them. - Add a new `DelayedImportBlock` variant to the `beacon_processor::WorkEvent` enum to handle this new event. 
- In the `BeaconProcessor`, refactor a `tokio::select!` to a struct with an explicit `Stream` implementation. I experienced some issues with `tokio::select!` in the block delay queue and I also found it hard to debug. I think this explicit implementation is nicer and functionally equivalent (apart from the fact that `tokio::select!` randomly chooses futures to poll, whereas now we're deterministic). - Add a testing framework to the `beacon_processor` module that tests this new block delay logic. I also tested a handful of other operations in the beacon processor (attns, slashings, exits) since it was super easy to copy-pasta the code from the `http_api` tester. - To implement these tests I added the concept of an optional `work_journal_tx` to the `BeaconProcessor` which will spit out a log of events. I used this in the tests to ensure that things were happening as I expect. - The tests are a little racy, but it's hard to avoid that when testing timing-based code. If we see CI failures I can revise. I haven't observed any failures due to races on my machine or on CI yet. - To assist with testing I allowed for directly setting the time on the `ManualSlotClock`. - I gave the `beacon_processor::Worker` a `Toolbox` for two reasons; (a) it avoids changing tons of function sigs when you want to pass a new object to the worker and (b) it seemed cute.	2021-02-24 03:08:52 +00:00
Age Manning	2931b05582	Update libp2p (#2101 ) This is a little bit of a tip-of-the-iceberg PR. It houses a lot of code changes in the libp2p dependency. This needs a bit of thorough testing before merging. The primary code changes are: - General libp2p dependency update - Gossipsub refactor to shift compression into gossipsub providing performance improvements and improved API for handling compression Co-authored-by: Paul Hauner <paul@paulhauner.com>	2020-12-23 07:53:36 +00:00
divma	8fcd22992c	No string in slog (#2017 ) ## Issue Addressed Following slog's documentation, this should help a bit with string allocations. I left it run for two days and mem usage is lower. This is of course anecdotal, but shouldn't harm anyway ## Proposed Changes remove `String` creation in logs when possible	2020-11-30 10:33:00 +00:00
divma	3b4afc27bf	Status race condition (#1967 ) ## Issue Addressed Sync stalls due to race conditions between dc notifications and status processing	2020-11-25 02:15:38 +00:00
divma	d727e55abe	Move some rpc processing to the beacon_processor (#1936 ) ## Issue Addressed `BlocksByRange` requests were the main culprit of a series of timeouts to peer's requests in general because they produce build up in the router's processor. Those were moved to the blocking executor but a task is being spawned for each; also not ideal since the amount of resources we give to those is not controlled ## Proposed Changes - Move `BlocksByRange` and `BlocksByRoots` to the `beacon_processor`. The processor crafts the responses and sends them. - Move too the processing of `StatusMessage`s from other peers. This is a fast operation but it can also build up and won't scale if we keep it in the router (processing one at the time). These don't need to send an answer, so there is no harm in processing them "later" if that were to happen. Sending responses to status requests is still in the router, so we answer as soon as we see them. - Some "extras" that are basically clean up: - Split the `Worker` logic in sync methods (chain processing and rpc blocks), gossip methods (the majority of methods) and rpc methods (the new ones) - Move the `status_message` function previously provided by the router's processor to a more central place since it is used by the router, sync, network_context and beacon_processor - Some spelling ## Additional Info What's left to decide/test more thoroughly is the length of the queues and the priority rules. @paulhauner suggested at some point to put status above attestations, and @AgeManning had described an importance of "protecting gossipsub" so my solution is leaving status requests in the router and RPC methods below attestations. Slashings and Exits are at the end.	2020-11-19 23:33:44 +00:00
divma	8a16548715	Misc Peer sync info adjustments (#1896 ) ## Issue Addressed #1856 ## Proposed Changes - For clarity, the router's processor now only decides if a peer is compatible and it disconnects it or sends it to sync accordingly. No logic here regarding how useful the peer is. - Update peer_sync_info's rules - Add an `IrrelevantPeer` sync status to account for incompatible peers (maybe this should be "IncompatiblePeer" now that I think about it?) this state is updated upon receiving an internal goodbye in the peer manager - Misc code cleanups - Reduce the need to create `StatusMessage`s (and thus, `Arc` accesses ) - Add missing calls to update the global sync state The overall effect should be: - More peers recognized as Behind, and fewer as Unknown - Peers identified as incompatible	2020-11-13 09:00:10 +00:00
divma	6c0c050fbb	Tweak head syncing (#1845 ) ## Issue Addressed Fixes head syncing ## Proposed Changes - Get back to statusing peers after removing chain segments and making the peer manager deal with status according to the Sync status, preventing an old known deadlock - Also a bug where a chain would get removed if the optimistic batch succeeds being empty ## Additional Info Tested on Medalla and looking good	2020-11-01 23:37:39 +00:00
divma	668513b67e	Sync state adjustments (#1804 ) check for advanced peers and the state of the chain wrt the clock slot to decide if a chain is or not synced /transitioning to a head sync. Also a fix that prevented getting the right state while syncing heads	2020-10-22 00:26:06 +00:00
divma	2acf75785c	More sync updates (#1791 ) ## Issue Addressed #1614 and a couple of sync-stalling problems, the most important is a cyclic dependency between the sync manager and the peer manager	2020-10-20 22:34:18 +00:00
Paul Hauner	da44821e39	Clean up obsolete TODOs (#1734 ) Squashed commit of the following: commit f99373cbaec9adb2bdbae3f7e903284327962083 Author: Age Manning <Age@AgeManning.com> Date: Mon Oct 5 18:44:09 2020 +1100 Clean up obsolete TODOs	2020-10-05 21:08:14 +11:00
Age Manning	240181e840	Upgrade discovery and restructure task execution (#1693 ) * Initial rebase * Remove old code * Correct release tests * Rebase commit * Remove eth2-testnet dep on eth2libp2p * Remove crates lost in rebase * Remove unused dep	2020-10-05 18:45:54 +11:00
divma	6997776494	Sync fixes (#1716 ) ## Issue Addressed chain state inconsistencies ## Proposed Changes - a batch can be fake-failed by Range if it needs to move a peer to another chain. The peer will still send blocks/ errors / produce timeouts for those requests, so check when we get a response from the RPC that the request id matches, instead of only the peer, since a re-request can be directed to the same peer. - if an optimistic batch succeeds, store the attempt to avoid trying it again when quickly switching chains. Also, use it only if ahead of our current target, instead of the segment's start epoch	2020-10-05 17:33:36 +11:00
divma	b8013b7b2c	Super Silky Smooth Syncs, like a Sir (#1628 ) ## Issue Addressed In principle.. closes #1551 but in general are improvements for performance, maintainability and readability. The logic for the optimistic sync is actually simple ## Proposed Changes There are miscellaneous things here: - Remove unnecessary `BatchProcessResult::Partial` to simplify the batch validation logic - Make batches a state machine. This is done to ensure batch state transitions respect our logic (this was previously done by moving batches between `Vec`s) and to ease the cognitive load of the `SyncingChain` struct - Move most batch-related logic to the batch - Remove `PendingBatches` in favor of a map of peers to their batches. This is to avoid duplicating peers inside the chain (peer_pool and pending_batches) - Add `must_use` decoration to the `ProcessingResult` so that chains that request to be removed are handled accordingly. This also means that chains are now removed in more places than before to account for unhandled cases - Store batches in a sorted map (`BTreeMap`) access is not O(1) but since the number of _active_ batches is bounded this should be fast, and saves performing hashing ops. Batches are indexed by the epoch they start. Sorted, to easily handle chain advancements (range logic) - Produce the chain Id from the identifying fields: target root and target slot. This, to guarantee there can't be duplicated chains and be able to consistently search chains by either Id or checkpoint - Fix chain_id not being present in all chain loggers - Handle mega-edge case where the processor's work queue is full and the batch can't be sent. In this case the chain would lose the blocks, remain in a "syncing" state and waiting for a result that won't arrive, effectively stalling sync. - When a batch imports blocks or the chain starts syncing with a local finalized epoch greater than the chain's start epoch, the chain is advanced instead of reset. 
This is to avoid losing download progress and validate batches faster. This also means that the old `start_epoch` now means "current first unvalidated batch", so it represents more accurately the progress of the chain. - Batch status peers from the same chain to reduce Arc access. - Handle a couple of cases where the retry counters for a batch were not updated/checked are now handled via the batch state machine. Basically now if we forget to do it, we will know. - Do not send back the blocks from the processor to the batch. Instead register the attempt before sending the blocks (does not count as failed) - When re-requesting a batch, try to avoid not only the last failed peer, but all previous failed peers. - Optimize requesting batches ahead in the buffer by shuffling idle peers just once (this is just addressing a couple of old TODOs in the code) - In chain_collection, store chains by their id in a map - Include a mapping from request_ids to (chain, batch) that requested the batch to avoid the double O(n) search on block responses - Other stuff: - impl `slog::KV` for batches - impl `slog::KV` for syncing chains - PSA: when logging, we can use `%thing` if `thing` implements `Display`. Same for `?` and `Debug` ### Optimistic syncing: Try first the batch that contains the current head, if the batch imports any block, advance the chain. If not, if this optimistic batch is inside the current processing window leave it there for future use, if not drop it. The tolerance for this block is the same for downloading, but just once for processing Co-authored-by: Age Manning <Age@AgeManning.com>	2020-09-23 06:29:55 +00:00
Age Manning	80e52a0263	Subscribe to core topics after sync (#1613 ) ## Issue Addressed N/A ## Proposed Changes Prevent subscribing to core gossipsub topics until after we have achieved a full sync. This prevents us censoring gossipsub channels, getting penalised in gossipsub 1.1 scoring and saves us computation time in attempting to validate gossipsub messages which we will be unable to do with a non-sync'd chain.	2020-09-23 03:26:33 +00:00
Age Manning	3bb30754d9	Keep track of failed head chains and prevent re-lookups (#1534 ) ## Overview There are forked chains which get referenced by blocks and attestations on a network. Typically if these chains are very long, we stop looking up the chain and downvote the peer. In extreme circumstances, many peers are on many chains, the chains can be very deep and become time consuming performing lookups. This PR adds a cache to known failed chain lookups. This prevents us from starting a parent-lookup (or stopping one half way through) if we have attempted the chain lookup in the past.	2020-08-18 03:54:09 +00:00
divma	46dbf027af	Do not reset batch ids & redownload out of range batches (#1528 ) The changes are somewhat simple but should solve two issues: - When quickly changing between chains once and a second time back again, batchIds would collide and cause havoc. - If we got an out of range response from a peer, sync would remain in syncing but without advancing Changes: - remove the batch id. Identify each batch (inside a chain) by its starting epoch. Target epochs for downloading and processing now advance by EPOCHS_PER_BATCH - for the same reason, move the "to_be_downloaded_id" to be an epoch - remove a sneaky line that dropped an out of range batch without downloading it - bonus: put the chain_id in the log given to the chain. This is why explicitly logging the chain_id is removed	2020-08-18 01:29:51 +00:00
Paul Hauner	f85485884f	Process gossip blocks on the GossipProcessor (#1523 ) ## Issue Addressed NA ## Proposed Changes Moves beacon block processing over to the newly-added `GossipProcessor`. This moves the task off the core executor onto the blocking one. ## Additional Info - With this PR, gossip blocks are being ignored during sync.	2020-08-17 09:20:27 +00:00
divma	9ae9df806c	Fix clippy lints rpc (#1401 ) ## Issue Addressed #1388 partially (eth2_libp2p & network) ## Proposed Changes TLDR at the end - Complex types are 3 on the handlers/Behaviours but the types are `Poll<ComplexType>` where `ComplexType` comes from the traits of libp2p. Those, I don't think are worth an alias. A couple more were from using tokio combinators and were removed writing things the async way and using [`BoxFuture`](https://docs.rs/futures/0.3.5/futures/future/type.BoxFuture.html) - The cognitive complexity.. I tried to address those before (they come from the poll functions too) and tbh they are cognitively simpler to understand the way they are now. Moving separate parts to functions doesn't add much since that code is not repeated and they all do early returns. If moved those returns would now need to be wrapped in an Option, probably, and checked to be returned again. I would leave them like that but that's just preference. - Too many arguments: They are not easily put together in a wrapping struct since the parameters don't relate semantically (Ex: fn new with a log, a reference to the chain, a peer, etc) but some may differ. - Needless returns were indeed needless ## Additional Info TLDR: removed needless return, used BoxFuture and async, left the rest untouched since those lgtm	2020-07-28 01:39:42 +00:00
blacktemplar	23a8f31f83	Fix clippy warnings (#1385 ) ## Issue Addressed NA ## Proposed Changes Fixes most clippy warnings and ignores the rest of them, see issue #1388.	2020-07-23 14:18:00 +00:00
Age Manning	5bc8fea2e0	Activate peer scoring (#1284 ) * Initial score structure * Peer manager update * Updates to dialing * Correct tests * Correct typos and remove unused function * Integrate scoring into the network crate * Clean warnings * Formatting * Shift core functionality into the behaviour * Temp commit * Shift disconnections into the behaviour * Temp commit * Update libp2p and gossipsub * Remove gossipsub lru cache * Correct merge conflicts * Modify handler and correct tests * Update enr network globals on socket update * Apply clippy lints * Add new prysm fingerprint * More clippy fixes	2020-07-07 10:13:16 +10:00
Paul Hauner	e429c3eefe	Remove old block processing shim (#1327 ) * Remove old block processing shim * Run rustfmt * Fix log formatting * Swap peer ids over to display	2020-07-06 16:28:00 +10:00
Michael Sproul	7688b5f1dd	Merge remote-tracking branch 'origin/master' into spec-v0.12	2020-06-26 12:57:56 +10:00
pscott	02174e21d8	Fix clippy's performance lints (#1286 ) * Fix clippy perf lints * Cargo fmt * Add and to lint rule in Makefile * Fix some leftover clippy lints	2020-06-26 00:04:08 +10:00
Michael Sproul	e6f97bf466	Merge remote-tracking branch 'origin/master' into spec-v0.12	2020-06-17 12:34:11 +10:00
Paul Hauner	764cb2d32a	v0.12 fork choice update (#1229 ) * Incomplete scraps * Add progress on new fork choice impl * Further progress * First complete compiling version * Remove chain reference * Add new lmd_ghost crate * Start integrating into beacon chain * Update `milagro_bls` to new release (#1183) * Update milagro_bls to new release Signed-off-by: Kirk Baird <baird.k@outlook.com> * Tidy up fake cryptos Signed-off-by: Kirk Baird <baird.k@outlook.com> * move SecretHash to bls and put plaintext back Signed-off-by: Kirk Baird <baird.k@outlook.com> * Update state processing for v0.12 * Fix EF test runners for v0.12 * Fix some tests * Fix broken attestation verification test * More test fixes * Rough beacon chain impl working * Remove fork_choice_2 * Remove checkpoint manager * Half finished ssz impl * Add missed file * Add persistence * Tidy, fix some compile errors * Remove RwLock from ProtoArrayForkChoice * Fix store-based compile errors * Add comments, tidy * Move function out of ForkChoice struct * Start testing * More testing * Fix compile error * Tidy beacon_chain::fork_choice * Queue attestations from the current slot * Allow fork choice to handle prior-to-genesis start * Improve error granularity * Test attestation dequeuing * Process attestations during block * Store target root in fork choice * Move fork choice verification into new crate * Update tests * Consensus updates for v0.12 (#1228) * Update state processing for v0.12 * Fix EF test runners for v0.12 * Fix some tests * Fix broken attestation verification test * More test fixes * Fix typo found in review * Add `Block` struct to ProtoArray * Start fixing get_ancestor * Add rough progress on testing * Get fork choice tests working * Progress with testing * Fix partialeq impl * Move slot clock from fc_store * Improve testing * Add testing for best justified * Add clone back to SystemTimeSlotClock * Add balances test * Start adding balances cache again * Wire-in balances cache * Improve tests * Remove 
commented-out tests * Remove beacon_chain::ForkChoice * Rename crates * Update wider codebase to new fork_choice layout * Move advance_slot in test harness * Tidy ForkChoice::update_time * Fix verification tests * Fix compile error with iter::once * Fix fork choice tests * Ensure block attestations are processed * Fix failing beacon_chain tests * Add first invalid block check * Add finalized block check * Progress with testing, new store builder * Add fixes to get_ancestor * Fix old genesis justification test * Fix remaining fork choice tests * Change root iteration method * Move on_verified_block * Remove unused method * Start adding attestation verification tests * Add invalid ffg target test * Add target epoch test * Add queued attestation test * Remove old fork choice verification tests * Tidy, add test * Move fork choice lock drop * Rename BeaconForkChoiceStore * Add comments, tidy BeaconForkChoiceStore * Update metrics, rename fork_choice_store.rs * Remove genesis_block_root from ForkChoice * Tidy * Update fork_choice comments * Tidy, add comments * Tidy, simplify ForkChoice, fix compile issue * Tidy, removed dead file * Increase http request timeout * Fix failing rest_api test * Set HTTP timeout back to 5s * Apply fix to get_ancestor * Address Michael's comments * Fix typo * Revert "Fix broken attestation verification test" This reverts commit 722cdc903b12611de27916a57eeecfa3224f2279. Co-authored-by: Kirk Baird <baird.k@outlook.com> Co-authored-by: Michael Sproul <michael@sigmaprime.io>	2020-06-17 11:10:22 +10:00
Pawan Dhananjay	bb8b88edcf	Use SSZ types in rpc (#1244 ) * Update `milagro_bls` to new release (#1183) * Update milagro_bls to new release Signed-off-by: Kirk Baird <baird.k@outlook.com> * Tidy up fake cryptos Signed-off-by: Kirk Baird <baird.k@outlook.com> * move SecretHash to bls and put plaintext back Signed-off-by: Kirk Baird <baird.k@outlook.com> * Update v0.12.0 to v0.12.1 * Use ssz types for Request and error types * Fix errors * Constrain BlocksByRangeRequest count to MAX_REQUEST_BLOCKS * Fix issues after rebasing * Address review comments Co-authored-by: Kirk Baird <baird.k@outlook.com> Co-authored-by: Michael Sproul <michael@sigmaprime.io> Co-authored-by: Age Manning <Age@AgeManning.com>	2020-06-12 10:04:50 +10:00
Age Manning	2dfe77a8f9	Handle syncing edge case (#1258 )	2020-06-11 12:06:42 +10:00
divma	0e37a16927	Super tiny RPC refactor (#1187 ) * wip: mwake the request id optional * make the request_id optional * cleanup * address clippy lints inside rpc * WIP: Separate sent RPC events from received ones * WIP: Separate sent RPC events from received ones * cleanup * Separate request ids from substream ids * Make RPC's message handling independent of RequestIds * Change behaviour RPC events to be more outside-crate friendly * Propage changes across the network + router + processor * Propage changes across the network + router + processor * fmt * "tiny" refactor * more tiny refactors * fmt eth2-libp2p * wip: propagating changes * wip: propagating changes * cleaning up * more cleanup * fmt * tests HOT fix Co-authored-by: Age Manning <Age@AgeManning.com>	2020-06-05 13:07:59 +10:00
Pawan Dhananjay	042e80570c	Improve tokio task execution (#1181 ) * Add logging on shutdown * Replace tokio::spawn with handle.spawn * Upgrade tokio * Add a task executor * Beacon chain tasks use task executor * Validator client tasks use task executor * Rename runtime_handle to executor * Add duration histograms; minor fixes * Cleanup * Fix logs * Fix tests * Remove random file * Get enr dependency instead of libp2p * Address some review comments * Libp2p takes a TaskExecutor * Ugly fix libp2p tests * Move TaskExecutor to own file * Upgrade Dockerfile rust version * Minor fixes * Revert "Ugly fix libp2p tests" This reverts commit 58d4bb690f52de28d893943b7504d2d0c6621429. * Pretty fix libp2p tests * Add spawn_without_exit; change Counter to Gauge * Tidy * Move log from RuntimeContext to TaskExecutor * Fix errors * Replace histogram with int_gauge for async tasks * Fix todo * Fix memory leak in test by exiting all spawned tasks at the end	2020-06-04 21:48:05 +10:00
Age Manning	b6408805a2	Stable futures (#879 ) * Port eth1 lib to use stable futures * Port eth1_test_rig to stable futures * Port eth1 tests to stable futures * Port genesis service to stable futures * Port genesis tests to stable futures * Port beacon_chain to stable futures * Port lcli to stable futures * Fix eth1_test_rig (#1014) * Fix lcli * Port timer to stable futures * Fix timer * Port websocket_server to stable futures * Port notifier to stable futures * Add TODOS * Update hashmap hashset to stable futures * Adds panic test to hashset delay * Port remote_beacon_node to stable futures * Fix lcli merge conflicts * Non rpc stuff compiles * protocol.rs compiles * Port websockets, timer and notifier to stable futures (#1035) * Fix lcli * Port timer to stable futures * Fix timer * Port websocket_server to stable futures * Port notifier to stable futures * Add TODOS * Port remote_beacon_node to stable futures * Partial eth2-libp2p stable future upgrade * Finished first round of fighting RPC types * Further progress towards porting eth2-libp2p adds caching to discovery * Update behaviour * RPC handler to stable futures * Update RPC to master libp2p * Network service additions * Fix the fallback transport construction (#1102) * Correct warning * Remove hashmap delay * Compiling version of eth2-libp2p * Update all crates versions * Fix conversion function and add tests (#1113) * Port validator_client to stable futures (#1114) * Add PH & MS slot clock changes * Account for genesis time * Add progress on duties refactor * Add simple is_aggregator bool to val subscription * Start work on attestation_verification.rs * Add progress on ObservedAttestations * Progress with ObservedAttestations * Fix tests * Add observed attestations to the beacon chain * Add attestation observation to processing code * Add progress on attestation verification * Add first draft of ObservedAttesters * Add more tests * Add observed attesters to beacon chain * Add observers to attestation 
processing * Add more attestation verification * Create ObservedAggregators map * Remove commented-out code * Add observed aggregators into chain * Add progress * Finish adding features to attestation verification * Ensure beacon chain compiles * Link attn verification into chain * Integrate new attn verification in chain * Remove old attestation processing code * Start trying to fix beacon_chain tests * Split adding into pools into two functions * Add aggregation to harness * Get test harness working again * Adjust the number of aggregators for test harness * Fix edge-case in harness * Integrate new attn processing in network * Fix compile bug in validator_client * Update validator API endpoints * Fix aggreagation in test harness * Fix enum thing * Fix attestation observation bug: * Patch failing API tests * Start adding comments to attestation verification * Remove unused attestation field * Unify "is block known" logic * Update comments * Supress fork choice errors for network processing * Add todos * Tidy * Add gossip attn tests * Disallow test harness to produce old attns * Comment out in-progress tests * Partially address pruning tests * Fix failing store test * Add aggregate tests * Add comments about which spec conditions we check * Dont re-aggregate * Split apart test harness attn production * Fix compile error in network * Make progress on commented-out test * Fix skipping attestation test * Add fork choice verification tests * Tidy attn tests, remove dead code * Remove some accidentally added code * Fix clippy lint * Rename test file * Add block tests, add cheap block proposer check * Rename block testing file * Add observed_block_producers * Tidy * Switch around block signature verification * Finish block testing * Remove gossip from signature tests * First pass of self review * Fix deviation in spec * Update test spec tags * Start moving over to hashset * Finish moving observed attesters to hashmap * Move aggregation pool over to hashmap * Make fc attn 
borrow again * Fix rest_api compile error * Fix missing comments * Fix monster test * Uncomment increasing slots test * Address remaining comments * Remove unsafe, use cfg test * Remove cfg test flag * Fix dodgy comment * Revert "Update hashmap hashset to stable futures" This reverts commit d432378a3cc5cd67fc29c0b15b96b886c1323554. * Revert "Adds panic test to hashset delay" This reverts commit 281502396fc5b90d9c421a309c2c056982c9525b. * Ported attestation_service * Ported duties_service * Ported fork_service * More ports * Port block_service * Minor fixes * VC compiles * Update TODOS * Borrow self where possible * Ignore aggregates that are already known. * Unify aggregator modulo logic * Fix typo in logs * Refactor validator subscription logic * Avoid reproducing selection proof * Skip HTTP call if no subscriptions * Rename DutyAndState -> DutyAndProof * Tidy logs * Print root as dbg * Fix compile errors in tests * Fix compile error in test * Re-Fix attestation and duties service * Minor fixes Co-authored-by: Paul Hauner <paul@paulhauner.com> * Network crate update to stable futures * Port account_manager to stable futures (#1121) * Port account_manager to stable futures * Run async fns in tokio environment * Port rest_api crate to stable futures (#1118) * Port rest_api lib to stable futures * Reduce tokio features * Update notifier to stable futures * Builder update * Further updates * Convert self referential async functions * stable futures fixes (#1124) * Fix eth1 update functions * Fix genesis and client * Fix beacon node lib * Return appropriate runtimes from environment * Fix test rig * Refactor eth1 service update * Upgrade simulator to stable futures * Lighthouse compiles on stable futures * Remove println debugging statement * Update libp2p service, start rpc test upgrade * Update network crate for new libp2p * Update tokio::codec to futures_codec (#1128) * Further work towards RPC corrections * Correct http timeout and network service select * Use 
tokio runtime for libp2p * Revert "Update tokio::codec to futures_codec (#1128)" This reverts commit e57aea924acf5cbabdcea18895ac07e38a425ed7. * Upgrade RPC libp2p tests * Upgrade secio fallback test * Upgrade gossipsub examples * Clean up RPC protocol * Test fixes (#1133) * Correct websocket timeout and run on os thread * Fix network test * Clean up PR * Correct tokio tcp move attestation service tests * Upgrade attestation service tests * Correct network test * Correct genesis test * Test corrections * Log info when block is received * Modify logs and update attester service events * Stable futures: fixes to vc, eth1 and account manager (#1142) * Add local testnet scripts * Remove whiteblock script * Rename local testnet script * Move spawns onto handle * Fix VC panic * Initial fix to block production issue * Tidy block producer fix * Tidy further * Add local testnet clean script * Run cargo fmt * Tidy duties service * Tidy fork service * Tidy ForkService * Tidy AttestationService * Tidy notifier * Ensure await is not suppressed in eth1 * Ensure await is not suppressed in account_manager * Use .ok() instead of .unwrap_or(()) * RPC decoding test for proto * Update discv5 and eth2-libp2p deps * Fix lcli double runtime issue (#1144) * Handle stream termination and dialing peer errors * Correct peer_info variant types * Remove unnecessary warnings * Handle subnet unsubscription removal and improve logigng * Add logs around ping * Upgrade discv5 and improve logging * Handle peer connection status for multiple connections * Improve network service logging * Improve logging around peer manager * Upgrade swarm poll centralise peer management * Identify clients on error * Fix `remove_peer` in sync (#1150) * remove_peer removes from all chains * Remove logs * Fix early return from loop * Improved logging, fix panic * Partially correct tests * Stable futures: Vc sync (#1149) * Improve syncing heuristic * Add comments * Use safer method for tolerance * Fix tests * Stable 
futures: Fix VC bug, update agg pool, add more metrics (#1151) * Expose epoch processing summary * Expose participation metrics to prometheus * Switch to f64 * Reduce precision * Change precision * Expose observed attesters metrics * Add metrics for agg/unagg attn counts * Add metrics for gossip rx * Add metrics for gossip tx * Adds ignored attns to prom * Add attestation timing * Add timer for aggregation pool sig agg * Add write lock timer for agg pool * Add more metrics to agg pool * Change map lock code * Add extra metric to agg pool * Change lock handling in agg pool * Change .write() to .read() * Add another agg pool timer * Fix for is_aggregator * Fix pruning bug Co-authored-by: pawan <pawandhananjay@gmail.com> Co-authored-by: Paul Hauner <paul@paulhauner.com>	2020-05-17 11:16:48 +00:00
Age Manning	500f6b53d1	Testnet corrections (#1050 ) * Correct RPC ping request * Add attestation verification * Add discv5 bug fixes * Reduce gossipsub heartbeat and update metadata * Handle known chain of advanced peer	2020-04-27 14:18:30 +10:00
Age Manning	79cc9473c1	Sync and multi-client updates (#1044 ) * Update finalized/head sync logic * Correct sync logging * Handle status during sync gracefully	2020-04-23 19:01:29 +10:00
Age Manning	0b82e9f8a9	Update Syncing logic (#1042 ) * Prevent duplicate parent block lookups * Updates logic for handling re-status'd peers * Allow block lookup if the block is close to head * Correct ordering of sync logs * Remove comments in block processer, clean up sim	2020-04-22 23:58:10 +10:00
divma	2469bde6b1	Add chain_id in range syncing to avoid wrong dispatching of batch results (#1037 )	2020-04-22 21:17:56 +10:00

80 Commits