lighthouse

Author	SHA1	Message	Date
realbigsean	45897ad4e1	remove blob wrapper	2022-11-19 15:18:42 -05:00
realbigsean	7162e5e23b	add a bunch of blob coupling boiler plate, add a blobs by root request	2022-11-15 16:43:56 -05:00
realbigsean	fe04d945cc	make signed block + sidecar consensus spec	2022-11-10 14:22:30 -05:00
Pawan Dhananjay	29f2ec46d3	Couple blocks and blobs in gossip (#3670 ) * Revert "Add more gossip verification conditions" This reverts commit `1430b561c3`. * Revert "Add todos" This reverts commit `91efb9d4c7`. * Revert "Reprocess blob sidecar messages" This reverts commit `21bf3d37cd`. * Add the coupled topic * Decode SignedBeaconBlockAndBlobsSidecar correctly * Process Block and Blobs in beacon processor * Remove extra blob publishing logic from vc * Remove blob signing in vc * Ugly hack to compile	2022-11-01 10:28:21 -04:00
realbigsean	c0dc42ea07	cargo fmt	2022-10-04 08:21:46 -04:00
realbigsean	8d45e48775	cargo fix	2022-10-03 21:52:16 -04:00
realbigsean	e81dbbfea4	compile	2022-10-03 21:48:02 -04:00
realbigsean	88006735c4	compile	2022-10-03 10:06:04 -04:00
realbigsean	fe6fc55449	fix compilation errors, rename capella -> shanghai, cleanup some rebase issues	2022-09-29 12:43:13 -04:00
realbigsean	3f1e5cee78	Some gossip work	2022-09-29 12:35:53 -04:00
realbigsean	ebc0ccd02a	some more sync boilerplate	2022-09-29 12:34:09 -04:00
realbigsean	4008da6c60	sync tx blobs	2022-09-29 12:32:55 -04:00
Marius van der Wijden	f9209e2d08	more network stuff	2022-09-17 16:39:40 +02:00
Marius van der Wijden	aeb52ff186	network stuff	2022-09-17 16:10:42 +02:00
Daniel Knopik	292a16a6eb	gossip boilerplate	2022-09-17 14:58:27 +02:00
Akihito Nakano	1cc8a97d4e	Remove unused method in HandlerNetworkContext (#3299 ) ## Issue Addressed N/A ## Proposed Changes Removed unused method in `HandlerNetworkContext`.	2022-07-04 02:56:14 +00:00
Paul Hauner	be4e261e74	Use async code when interacting with EL (#3244 ) ## Overview This rather extensive PR achieves two primary goals: 1. Uses the finalized/justified checkpoints of fork choice (FC), rather than that of the head state. 2. Refactors fork choice, block production and block processing to `async` functions. Additionally, it achieves: - Concurrent forkchoice updates to the EL and cache pruning after a new head is selected. - Concurrent "block packing" (attestations, etc) and execution payload retrieval during block production. - Concurrent per-block-processing and execution payload verification during block processing. - The `Arc`-ification of `SignedBeaconBlock` during block processing (it's never mutated, so why not?): - I had to do this to deal with sending blocks into spawned tasks. - Previously we were cloning the beacon block at least 2 times during each block processing, these clones are either removed or turned into cheaper `Arc` clones. - We were also `Box`-ing and un-`Box`-ing beacon blocks as they moved throughout the networking crate. This is not a big deal, but it's nice to avoid shifting things between the stack and heap. - Avoids cloning all the blocks in every chain segment during sync. - It also has the potential to clean up our code where we need to pass an owned block around so we can send it back in the case of an error (I didn't do much of this, my PR is already big enough 😅) - The `BeaconChain::HeadSafetyStatus` struct was removed. It was an old relic from prior merge specs. For motivation for this change, see https://github.com/sigp/lighthouse/pull/3244#issuecomment-1160963273 ## Changes to `canonical_head` and `fork_choice` Previously, the `BeaconChain` had two separate fields: ``` canonical_head: RwLock<Snapshot>, fork_choice: RwLock<BeaconForkChoice> ``` Now, we have grouped these values under a single struct: ``` canonical_head: CanonicalHead { cached_head: RwLock<Arc<Snapshot>>, fork_choice: RwLock<BeaconForkChoice> } ``` Apart from ergonomics, the only actual change here is wrapping the canonical head snapshot in an `Arc`. This means that we no longer need to hold the `cached_head` (`canonical_head`, in old terms) lock when we want to pull some values from it. This was done to avoid deadlock risks by preventing functions from acquiring (and holding) the `cached_head` and `fork_choice` locks simultaneously. ## Breaking Changes ### The `state` (root) field in the `finalized_checkpoint` SSE event Consider the scenario where epoch `n` is just finalized, but `start_slot(n)` is skipped. There are two state roots we might in the `finalized_checkpoint` SSE event: 1. The state root of the finalized block, which is `get_block(finalized_checkpoint.root).state_root`. 4. The state root at slot of `start_slot(n)`, which would be the state from (1), but "skipped forward" through any skip slots. Previously, Lighthouse would choose (2). However, we can see that when [Teku generates that event](`de2b2801c8/data/beaconrestapi/src/main/java/tech/pegasys/teku/beaconrestapi/handlers/v1/events/EventSubscriptionManager.java (L171-L182)`) it uses [`getStateRootFromBlockRoot`](`de2b2801c8/data/provider/src/main/java/tech/pegasys/teku/api/ChainDataProvider.java (L336-L341)`) which uses (1). I have switched Lighthouse from (2) to (1). I think it's a somewhat arbitrary choice between the two, where (1) is easier to compute and is consistent with Teku. ## Notes for Reviewers I've renamed `BeaconChain::fork_choice` to `BeaconChain::recompute_head`. Doing this helped ensure I broke all previous uses of fork choice and I also find it more descriptive. It describes an action and can't be confused with trying to get a reference to the `ForkChoice` struct. I've changed the ordering of SSE events when a block is received. It used to be `[block, finalized, head]` and now it's `[block, head, finalized]`. It was easier this way and I don't think we were making any promises about SSE event ordering so it's not "breaking". I've made it so fork choice will run when it's first constructed. I did this because I wanted to have a cached version of the last call to `get_head`. Ensuring `get_head` has been run at least once means that the cached values doesn't need to wrapped in an `Option`. This was fairly simple, it just involved passing a `slot` to the constructor so it knows when it's being run. When loading a fork choice from the store and a slot clock isn't handy I've just used the `slot` that was saved in the `fork_choice_store`. That seems like it would be a faithful representation of the slot when we saved it. I added the `genesis_time: u64` to the `BeaconChain`. It's small, constant and nice to have around. Since we're using FC for the fin/just checkpoints, we no longer get the `0x00..00` roots at genesis. You can see I had to remove a work-around in `ef-tests` here: b56be3bc2. I can't find any reason why this would be an issue, if anything I think it'll be better since the genesis-alias has caught us out a few times (0x00..00 isn't actually a real root). Edit: I did find a case where the `network` expected the 0x00..00 alias and patched it here: 3f26ac3e2. You'll notice a lot of changes in tests. Generally, tests should be functionally equivalent. Here are the things creating the most diff-noise in tests: - Changing tests to be `tokio::async` tests. - Adding `.await` to fork choice, block processing and block production functions. - Refactor of the `canonical_head` "API" provided by the `BeaconChain`. E.g., `chain.canonical_head.cached_head()` instead of `chain.canonical_head.read()`. - Wrapping `SignedBeaconBlock` in an `Arc`. - In the `beacon_chain/tests/block_verification`, we can't use the `lazy_static` `CHAIN_SEGMENT` variable anymore since it's generated with an async function. We just generate it in each test, not so efficient but hopefully insignificant. I had to disable `rayon` concurrent tests in the `fork_choice` tests. This is because the use of `rayon` and `block_on` was causing a panic. Co-authored-by: Mac L <mjladson@pm.me>	2022-07-03 05:36:50 +00:00
Divma	4bf1af4e85	Custom RPC request management for sync (#3029 ) ## Proposed Changes Make `lighthouse_network` generic over request ids, now usable by sync	2022-03-02 22:07:17 +00:00
Pawan Dhananjay	88063398f6	Prevent double import of blocks (#2647 ) ## Issue Addressed Resolves #2611 ## Proposed Changes Adds a duplicate block root cache to the `BeaconProcessor`. Adds the block root to the cache before calling `process_gossip_block` and `process_rpc_block`. Since `process_rpc_block` is called only for single block lookups, we don't have to worry about batched block imports. The block is imported from the source(gossip/rpc) that arrives first. The block that arrives second is not imported to avoid the db access issue. There are 2 cases: 1. Block that arrives second is from rpc: In this case, we return an optimistic `BlockError::BlockIsAlreadyKnown` to sync. 2. Block that arrives second is from gossip: In this case, we only do gossip verification and forwarding but don't import the block into the the beacon chain. ## Additional info Splits up `process_gossip_block` function to `process_gossip_unverified_block` and `process_gossip_verified_block`.	2021-10-28 03:36:14 +00:00
Age Manning	df40700ddd	Rename eth2_libp2p to lighthouse_network (#2702 ) ## Description The `eth2_libp2p` crate was originally named and designed to incorporate a simple libp2p integration into lighthouse. Since its origins the crates purpose has expanded dramatically. It now houses a lot more sophistication that is specific to lighthouse and no longer just a libp2p integration. As of this writing it currently houses the following high-level lighthouse-specific logic: - Lighthouse's implementation of the eth2 RPC protocol and specific encodings/decodings - Integration and handling of ENRs with respect to libp2p and eth2 - Lighthouse's discovery logic, its integration with discv5 and logic about searching and handling peers. - Lighthouse's peer manager - This is a large module handling various aspects of Lighthouse's network, such as peer scoring, handling pings and metadata, connection maintenance and recording, etc. - Lighthouse's peer database - This is a collection of information stored for each individual peer which is specific to lighthouse. We store connection state, sync state, last seen ips and scores etc. The data stored for each peer is designed for various elements of the lighthouse code base such as syncing and the http api. - Gossipsub scoring - This stores a collection of gossipsub 1.1 scoring mechanisms that are continuously analyssed and updated based on the ethereum 2 networks and how Lighthouse performs on these networks. - Lighthouse specific types for managing gossipsub topics, sync status and ENR fields - Lighthouse's network HTTP API metrics - A collection of metrics for lighthouse network monitoring - Lighthouse's custom configuration of all networking protocols, RPC, gossipsub, discovery, identify and libp2p. Therefore it makes sense to rename the crate to be more akin to its current purposes, simply that it manages the majority of Lighthouse's network stack. This PR renames this crate to `lighthouse_network` Co-authored-by: Paul Hauner <paul@paulhauner.com>	2021-10-19 00:30:39 +00:00
Mac L	4c510f8f6b	Add `BlockTimesCache` to allow additional block delay metrics (#2546 ) ## Issue Addressed Closes #2528 ## Proposed Changes - Add `BlockTimesCache` to provide block timing information to `BeaconChain`. This allows additional metrics to be calculated for blocks that are set as head too late. - Thread the `seen_timestamp` of blocks received from RPC responses (except blocks from syncing) through to the sync manager, similar to what is done for blocks from gossip. ## Additional Info This provides the following additional metrics: - `BEACON_BLOCK_OBSERVED_SLOT_START_DELAY_TIME` - The delay between the start of the slot and when the block was first observed. - `BEACON_BLOCK_IMPORTED_OBSERVED_DELAY_TIME` - The delay between when the block was first observed and when the block was imported. - `BEACON_BLOCK_HEAD_IMPORTED_DELAY_TIME` - The delay between when the block was imported and when the block was set as head. The metric `BEACON_BLOCK_IMPORTED_SLOT_START_DELAY_TIME` was removed. A log is produced when a block is set as head too late, e.g.: ``` Aug 27 03:46:39.006 DEBG Delayed head block set_as_head_delay: Some(21.731066ms), imported_delay: Some(119.929934ms), observed_delay: Some(3.864596988s), block_delay: 4.006257988s, slot: 1931331, proposer_index: 24294, block_root: 0x937602c89d3143afa89088a44bdf4b4d0d760dad082abacb229495c048648a9e, service: beacon ```	2021-09-30 04:31:41 +00:00
Michael Sproul	9667dc2f03	Implement checkpoint sync (#2244 ) ## Issue Addressed Closes #1891 Closes #1784 ## Proposed Changes Implement checkpoint sync for Lighthouse, enabling it to start from a weak subjectivity checkpoint. ## Additional Info - [x] Return unavailable status for out-of-range blocks requested by peers (#2561) - [x] Implement sync daemon for fetching historical blocks (#2561) - [x] Verify chain hashes (either in `historical_blocks.rs` or the calling module) - [x] Consistency check for initial block + state - [x] Fetch the initial state and block from a beacon node HTTP endpoint - [x] Don't crash fetching beacon states by slot from the API - [x] Background service for state reconstruction, triggered by CLI flag or API call. Considered out of scope for this PR: - Drop the requirement to provide the `--checkpoint-block` (this would require some pretty heavy refactoring of block verification) Co-authored-by: Diva M <divma@protonmail.com>	2021-09-22 00:37:28 +00:00
Pawan Dhananjay	e8c0d1f19b	Altair networking (#2300 ) ## Issue Addressed Resolves #2278 ## Proposed Changes Implements the networking components for the Altair hard fork https://github.com/ethereum/eth2.0-specs/blob/dev/specs/altair/p2p-interface.md ## Additional Info This PR acts as the base branch for networking changes and tracks https://github.com/sigp/lighthouse/pull/2279 . Changes to gossip, rpc and discovery can be separate PRs to be merged here for ease of review. Co-authored-by: realbigsean <seananderson33@gmail.com>	2021-08-04 01:44:57 +00:00
Paul Hauner	a764c3b247	Handle early blocks (#2155 ) ## Issue Addressed NA ## Problem this PR addresses There's an issue where Lighthouse is banning a lot of peers due to the following sequence of events: 1. Gossip block 0xabc arrives ~200ms early - It is propagated across the network, with respect to [`MAXIMUM_GOSSIP_CLOCK_DISPARITY`](https://github.com/ethereum/eth2.0-specs/blob/v1.0.0/specs/phase0/p2p-interface.md#why-is-there-maximum_gossip_clock_disparity-when-validating-slot-ranges-of-messages-in-gossip-subnets). - However, it is not imported to our database since the block is early. 2. Attestations for 0xabc arrive, but the block was not imported. - The peer that sent the attestation is down-voted. - Each unknown-block attestation causes a score loss of 1, the peer is banned at -100. - When the peer is on an attestation subnet there can be hundreds of attestations, so the peer is banned quickly (before the missed block can be obtained via rpc). ## Potential solutions I can think of three solutions to this: 1. Wait for attestation-queuing (#635) to arrive and solve this. - Easy - Not immediate fix. - Whilst this would work, I don't think it's a perfect solution for this particular issue, rather (3) is better. 1. Allow importing blocks with a tolerance of `MAXIMUM_GOSSIP_CLOCK_DISPARITY`. - Easy - ~~I have implemented this, for now.~~ 1. If a block is verified for gossip propagation (i.e., signature verified) and it's within `MAXIMUM_GOSSIP_CLOCK_DISPARITY`, then queue it to be processed at the start of the appropriate slot. - More difficult - Feels like the best solution, I will try to implement this. This PR takes approach (3). ## Changes included - Implement the `block_delay_queue`, based upon a [`DelayQueue`](https://docs.rs/tokio-util/0.6.3/tokio_util/time/delay_queue/struct.DelayQueue.html) which can store blocks until it's time to import them. - Add a new `DelayedImportBlock` variant to the `beacon_processor::WorkEvent` enum to handle this new event. - In the `BeaconProcessor`, refactor a `tokio::select!` to a struct with an explicit `Stream` implementation. I experienced some issues with `tokio::select!` in the block delay queue and I also found it hard to debug. I think this explicit implementation is nicer and functionally equivalent (apart from the fact that `tokio::select!` randomly chooses futures to poll, whereas now we're deterministic). - Add a testing framework to the `beacon_processor` module that tests this new block delay logic. I also tested a handful of other operations in the beacon processor (attns, slashings, exits) since it was super easy to copy-pasta the code from the `http_api` tester. - To implement these tests I added the concept of an optional `work_journal_tx` to the `BeaconProcessor` which will spit out a log of events. I used this in the tests to ensure that things were happening as I expect. - The tests are a little racey, but it's hard to avoid that when testing timing-based code. If we see CI failures I can revise. I haven't observed any failures due to races on my machine or on CI yet. - To assist with testing I allowed for directly setting the time on the `ManualSlotClock`. - I gave the `beacon_processor::Worker` a `Toolbox` for two reasons; (a) it avoids changing tons of function sigs when you want to pass a new object to the worker and (b) it seemed cute.	2021-02-24 03:08:52 +00:00
Paul Hauner	2b2a358522	Detailed validator monitoring (#2151 ) ## Issue Addressed - Resolves #2064 ## Proposed Changes Adds a `ValidatorMonitor` struct which provides additional logging and Grafana metrics for specific validators. Use `lighthouse bn --validator-monitor` to automatically enable monitoring for any validator that hits the [subnet subscription](https://ethereum.github.io/eth2.0-APIs/#/Validator/prepareBeaconCommitteeSubnet) HTTP API endpoint. Also, use `lighthouse bn --validator-monitor-pubkeys` to supply a list of validators which will always be monitored. See the new docs included in this PR for more info. ## TODO - [x] Track validator balance, `slashed` status, etc. - [x] ~~Register slashings in current epoch, not offense epoch~~ - [ ] Publish Grafana dashboard, update TODO link in docs - [x] ~~#2130 is merged into this branch, resolve that~~	2021-01-20 19:19:38 +00:00
Age Manning	2931b05582	Update libp2p (#2101 ) This is a little bit of a tip-of-the-iceberg PR. It houses a lot of code changes in the libp2p dependency. This needs a bit of thorough testing before merging. The primary code changes are: - General libp2p dependency update - Gossipsub refactor to shift compression into gossipsub providing performance improvements and improved API for handling compression Co-authored-by: Paul Hauner <paul@paulhauner.com>	2020-12-23 07:53:36 +00:00
divma	f3200784b4	More metrics + RPC tweaks (#2041 ) ## Issue Addressed NA ## Proposed Changes This was mostly done to find the reason why LH was dropping peers from Nimbus. It proved to be useful so I think it's worth it. But there is also some functional stuff here - Add metrics for rpc errors per client, error type and direction - Add metrics for downscoring events per source type, client and penalty type - Add metrics for gossip validation results per client for non-accepted messages - Make the RPC handler return errors and requests/responses in the order we see them - Allow a small burst for the Ping rate limit, from 1 every 5 seconds to 2 every 10 seconds - Send rate limiting errors with a particular code and use that same code to identify them. I picked something different to 128 since that is most likely what other clients are using for their own errors - Remove some unused code in the `PeerAction` and the rpc handler - Remove the unused variant `RateLimited`. tTis was never produced directly, since the only way to get the request's protocol is via de handler. The handler upon receiving from LH a response with an error (rate limited in this case) emits this event with the missing info (It was always like this, just pointing out that we do downscore rate limiting errors regardless of the change) Metrics for Nimbus looked like this: Downscoring events: `increase(libp2p_peer_actions_per_client{client="Nimbus"}[5m])` ![image](https://user-images.githubusercontent.com/26765164/101210880-862bf280-3676-11eb-94c0-399f0bf5aa2e.png) RPC Errors: `increase(libp2p_rpc_errors_per_client{client="Nimbus"}[5m])` ![image](https://user-images.githubusercontent.com/26765164/101210997-ba071800-3676-11eb-847a-f32405ede002.png) Unaccepted gossip message: `increase(gossipsub_unaccepted_messages_per_client{client="Nimbus"}[5m])` ![image](https://user-images.githubusercontent.com/26765164/101211124-f470b500-3676-11eb-9459-132ecff058ec.png)	2020-12-08 03:55:50 +00:00
divma	8fcd22992c	No string in slog (#2017 ) ## Issue Addressed Following slog's documentation, this should help a bit with string allocations. I left it run for two days and mem usage is lower. This is of course anecdotal, but shouldn't harm anyway ## Proposed Changes remove `String` creation in logs when possible	2020-11-30 10:33:00 +00:00
divma	d727e55abe	Move some rpc processing to the beacon_processor (#1936 ) ## Issue Addressed `BlocksByRange` requests were the main culprit of a series of timeouts to peer's requests in general because they produce build up in the router's processor. Those were moved to the blocking executor but a task is being spawned for each; also not ideal since the amount of resources we give to those is not controlled ## Proposed Changes - Move `BlocksByRange` and `BlocksByRoots` to the `beacon_processor`. The processor crafts the responses and sends them. - Move too the processing of `StatusMessage`s from other peers. This is a fast operation but it can also build up and won't scale if we keep it in the router (processing one at the time). These don't need to send an answer, so there is no harm in processing them "later" if that were to happen. Sending responses to status requests is still in the router, so we answer as soon as we see them. - Some "extras" that are basically clean up: - Split the `Worker` logic in sync methods (chain processing and rpc blocks), gossip methods (the majority of methods) and rpc methods (the new ones) - Move the `status_message` function previously provided by the router's processor to a more central place since it is used by the router, sync, network_context and beacon_processor - Some spelling ## Additional Info What's left to decide/test more thoroughly is the length of the queues and the priority rules. @paulhauner suggested at some point to put status above attestations, and @AgeManning had described an importance of "protecting gossipsub" so my solution is leaving status requests in the router and RPC methods below attestations. Slashings and Exits are at the end.	2020-11-19 23:33:44 +00:00
Age Manning	49c4630045	Performance improvement for db reads (#1909 ) This PR adds a number of improvements: - Downgrade a warning log when we ignore blocks for gossipsub processing - Revert a a correction to improve logging of peer score changes - Shift syncing DB reads off the core-executor allowing parallel processing of large sync messages - Correct the timeout logic of RPC chunk sends, giving more time before timing out RPC outbound messages.	2020-11-16 07:28:30 +00:00
divma	eb56140582	Update logs + do not downscore peers if WE time out (#1901 ) ## Issue Addressed - RPC Errors were being logged twice: first in the peer manager and then again in the router, so leave just the peer manager's one - The "reduce peer count" warn message gets thrown to the user for every missed chunk, so instead print it when the request times out and also do not include there info that is not relevant to the user - The processor didn't have the service tag so add it - Impl `KV` for status message - Do not downscore peers if we are the ones that timed out Other small improvements	2020-11-16 04:06:14 +00:00
divma	8a16548715	Misc Peer sync info adjustments (#1896 ) ## Issue Addressed #1856 ## Proposed Changes - For clarity, the router's processor now only decides if a peer is compatible and it disconnects it or sends it to sync accordingly. No logic here regarding how useful is the peer. - Update peer_sync_info's rules - Add an `IrrelevantPeer` sync status to account for incompatible peers (maybe this should be "IncompatiblePeer" now that I think about it?) this state is update upon receiving an internal goodbye in the peer manager - Misc code cleanups - Reduce the need to create `StatusMessage`s (and thus, `Arc` accesses ) - Add missing calls to update the global sync state The overall effect should be: - More peers recognized as Behind, and less as Unknown - Peers identified as incompatible	2020-11-13 09:00:10 +00:00
divma	668513b67e	Sync state adjustments (#1804 ) check for advanced peers and the state of the chain wrt the clock slot to decide if a chain is or not synced /transitioning to a head sync. Also a fix that prevented getting the right state while syncing heads	2020-10-22 00:26:06 +00:00
Age Manning	240181e840	Upgrade discovery and restructure task execution (#1693 ) * Initial rebase * Remove old code * Correct release tests * Rebase commit * Remove eth2-testnet dep on eth2libp2p * Remove crates lost in rebase * Remove unused dep	2020-10-05 18:45:54 +11:00
Age Manning	bcb629564a	Improve error handling in network processing (#1654 ) * Improve error handling in network processing * Cargo fmt * Cargo fmt * Improve error handling for prior genesis * Remove dep	2020-10-05 17:34:56 +11:00
divma	b8013b7b2c	Super Silky Smooth Syncs, like a Sir (#1628 ) ## Issue Addressed In principle.. closes #1551 but in general are improvements for performance, maintainability and readability. The logic for the optimistic sync in actually simple ## Proposed Changes There are miscellaneous things here: - Remove unnecessary `BatchProcessResult::Partial` to simplify the batch validation logic - Make batches a state machine. This is done to ensure batch state transitions respect our logic (this was previously done by moving batches between `Vec`s) and to ease the cognitive load of the `SyncingChain` struct - Move most batch-related logic to the batch - Remove `PendingBatches` in favor of a map of peers to their batches. This is to avoid duplicating peers inside the chain (peer_pool and pending_batches) - Add `must_use` decoration to the `ProcessingResult` so that chains that request to be removed are handled accordingly. This also means that chains are now removed in more places than before to account for unhandled cases - Store batches in a sorted map (`BTreeMap`) access is not O(1) but since the number of _active_ batches is bounded this should be fast, and saves performing hashing ops. Batches are indexed by the epoch they start. Sorted, to easily handle chain advancements (range logic) - Produce the chain Id from the identifying fields: target root and target slot. This, to guarantee there can't be duplicated chains and be able to consistently search chains by either Id or checkpoint - Fix chain_id not being present in all chain loggers - Handle mega-edge case where the processor's work queue is full and the batch can't be sent. In this case the chain would lose the blocks, remain in a "syncing" state and waiting for a result that won't arrive, effectively stalling sync. - When a batch imports blocks or the chain starts syncing with a local finalized epoch greater that the chain's start epoch, the chain is advanced instead of reset. This is to avoid losing download progress and validate batches faster. This also means that the old `start_epoch` now means "current first unvalidated batch", so it represents more accurately the progress of the chain. - Batch status peers from the same chain to reduce Arc access. - Handle a couple of cases where the retry counters for a batch were not updated/checked are now handled via the batch state machine. Basically now if we forget to do it, we will know. - Do not send back the blocks from the processor to the batch. Instead register the attempt before sending the blocks (does not count as failed) - When re-requesting a batch, try to avoid not only the last failed peer, but all previous failed peers. - Optimize requesting batches ahead in the buffer by shuffling idle peers just once (this is just addressing a couple of old TODOs in the code) - In chain_collection, store chains by their id in a map - Include a mapping from request_ids to (chain, batch) that requested the batch to avoid the double O(n) search on block responses - Other stuff: - impl `slog::KV` for batches - impl `slog::KV` for syncing chains - PSA: when logging, we can use `%thing` if `thing` implements `Display`. Same for `?` and `Debug` ### Optimistic syncing: Try first the batch that contains the current head, if the batch imports any block, advance the chain. If not, if this optimistic batch is inside the current processing window leave it there for future use, if not drop it. The tolerance for this block is the same for downloading, but just once for processing Co-authored-by: Age Manning <Age@AgeManning.com>	2020-09-23 06:29:55 +00:00
Age Manning	c6abc56113	Prevent large step-size parameters (#1583 ) ## Issue Addressed Malicious users could request very large block ranges, more than we expect. Although technically legal, we are now quadraticaly weighting large step sizes in the filter. Therefore users may request large skips, but not a large number of blocks, to prevent requests forcing us to do long chain lookups. ## Proposed Changes Weight the step parameter in the RPC filter and prevent any overflows that effect us in the step parameter. ## Additional Info	2020-09-11 02:33:36 +00:00
blacktemplar	c18d37c202	Use Gossipsub 1.1 (#1516 ) ## Issue Addressed #1172 ## Proposed Changes * updates the libp2p dependency * small adaptions based on changes in libp2p * report not just valid messages but also invalid and distinguish between `IGNORE`d messages and `REJECT`ed messages Co-authored-by: Age Manning <Age@AgeManning.com>	2020-08-30 13:06:50 +00:00
Paul Hauner	8e7dd7b2b1	Add remaining network ops to queuing system (#1546 ) ## Issue Addressed NA ## Proposed Changes - Refactors the `BeaconProcessor` to remove some excessive nesting and file bloat - Sorry about the noise from this, it's all contained in 4d3f8c5 though. - Adds exits, proposer slashings, attester slashings to the `BeaconProcessor` so we don't get overwhelmed with large amounts of slashings (which happened a few hours ago). ## Additional Info NA	2020-08-19 05:09:53 +00:00
Paul Hauner	f85485884f	Process gossip blocks on the GossipProcessor (#1523 ) ## Issue Addressed NA ## Proposed Changes Moves beacon block processing over to the newly-added `GossipProcessor`. This moves the task off the core executor onto the blocking one. ## Additional Info - With this PR, gossip blocks are being ignored during sync.	2020-08-17 09:20:27 +00:00
Paul Hauner	b0a3731fff	Introduce a queue for attestations from the network (#1511 ) ## Issue Addressed N/A ## Proposed Changes Introduces the `GossipProcessor`, a multi-threaded (multi-tasked?), non-blocking processor for some messages from the network which require verification and import into the `BeaconChain`. Initial testing indicates that this massively improves system stability by (a) moving block tasks from the normal executor (b) spreading out attestation load. ## Additional Info TBC	2020-08-14 04:38:45 +00:00
divma	138c0cf7f0	Remove block clone (#1448 ) ## Issue Addressed #1028 A bit late, but I think if `BlockError` had a kind (the current `BlockError` minus everything on the variants that comes directly from the block) and the original block, more clones could be removed	2020-08-06 04:29:17 +00:00
Paul Hauner	5629126f45	Add reason to invalid attestation log (#1460 ) ## Issue Addressed NA ## Proposed Changes Adds an extra field to a debug log so we can see why an attestation was invalid. ## Additional Info NA	2020-08-05 01:49:52 +00:00
Age Manning	5bc8fea2e0	Activate peer scoring (#1284 ) * Initial score structure * Peer manager update * Updates to dialing * Correct tests * Correct typos and remove unused function * Integrate scoring into the network crate * Clean warnings * Formatting * Shift core functionality into the behaviour * Temp commit * Shift disconnections into the behaviour * Temp commit * Update libp2p and gossipsub * Remove gossipsub lru cache * Correct merge conflicts * Modify handler and correct tests * Update enr network globals on socket update * Apply clippy lints * Add new prysm fingerprint * More clippy fixes	2020-07-07 10:13:16 +10:00
Paul Hauner	e429c3eefe	Remove old block processing shim (#1327 ) * Remove old block processing shim * Run rustfmt * Fix log formatting * Swap peer ids over to display	2020-07-06 16:28:00 +10:00
Michael Sproul	9450a0f30d	Merge remote-tracking branch 'origin/master' into spec-v0.12	2020-06-18 21:59:59 +10:00
Michael Sproul	bcb6afa0aa	Process exits and slashings off the network (#1253 ) * Process exits and slashings off the network * Fix rest_api tests * Add op verification tests * Add tests for pruning of slashings in the op pool * Address Paul's review comments	2020-06-18 21:06:34 +10:00
Pawan Dhananjay	3199b1a6f2	Use all attestation subnets (#1257 ) * Update `milagro_bls` to new release (#1183) * Update milagro_bls to new release Signed-off-by: Kirk Baird <baird.k@outlook.com> * Tidy up fake cryptos Signed-off-by: Kirk Baird <baird.k@outlook.com> * move SecretHash to bls and put plaintext back Signed-off-by: Kirk Baird <baird.k@outlook.com> * Update v0.12.0 to v0.12.1 * Add compute_subnet_for_attestation * Replace CommitteeIndex topic with Attestation * Fix warnings * Fix attestation service tests * fmt * Appease clippy * return error from validator_subscriptions * move state out of loop * Fix early break on error * Get state from slot clock * Fix beacon state in attestation tests * Add failing test for lookahead > 1 * Minor change * Address some review comments * Add subnet verification to beacon chain * Move subnet verification to processor * Pass committee_count_at_slot to ValidatorDuty and ValidatorSubscription * Pass subnet id for publishing attestations * Fix attestation service tests * Fix more tests * Fix fork choice test * Remove unused code * Remove more unused and expensive code Co-authored-by: Kirk Baird <baird.k@outlook.com> Co-authored-by: Michael Sproul <michael@sigmaprime.io> Co-authored-by: Age Manning <Age@AgeManning.com> Co-authored-by: Paul Hauner <paul@paulhauner.com>	2020-06-18 19:11:03 +10:00
divma	065251b701	Add DC/Shutdown capabilities to the behaviour handler (#1233 ) * Remove ban event from the PM * Fix dispatching of responses to peer's requests * Disconnection logic	2020-06-18 11:53:08 +10:00
Michael Sproul	e6f97bf466	Merge remote-tracking branch 'origin/master' into spec-v0.12	2020-06-17 12:34:11 +10:00

1 2

74 Commits