lighthouse

Author	SHA1	Message	Date
Age Manning	c49dd94e20	Update to latest libp2p (#1810 ) ## Description Updates to the latest libp2p and includes gossipsub updates. Of particular note is the limitation of a single topic per gossipsub message. Co-authored-by: blacktemplar <blacktemplar@a1.net>	2020-10-23 03:01:31 +00:00
divma	668513b67e	Sync state adjustments (#1804 ) check for advanced peers and the state of the chain wrt the clock slot to decide if a chain is or not synced /transitioning to a head sync. Also a fix that prevented getting the right state while syncing heads	2020-10-22 00:26:06 +00:00
divma	2acf75785c	More sync updates (#1791 ) ## Issue Addressed #1614 and a couple of sync-stalling problems, the most important is a cyclic dependency between the sync manager and the peer manager	2020-10-20 22:34:18 +00:00
Michael Sproul	703c33bdc7	Fix head tracker concurrency bugs (#1771 ) ## Issue Addressed Closes #1557 ## Proposed Changes Modify the pruning algorithm so that it mutates the head-tracker _before_ committing the database transaction to disk, and _only if_ all the heads to be removed are still present in the head-tracker (i.e. no concurrent mutations). In the process of writing and testing this I also had to make a few other changes: * Use internal mutability for all `BeaconChainHarness` functions (namely the RNG and the graffiti), in order to enable parallel calls (see testing section below). * Disable logging in harness tests unless the `test_logger` feature is turned on And chose to make some clean-ups: * Delete the `NullMigrator` * Remove type-based configuration for the migrator in favour of runtime config (simpler, less duplicated code) * Use the non-blocking migrator unless the blocking migrator is required. In the store tests we need the blocking migrator because some tests make asserts about the state of the DB after the migration has run. * Rename `validators_keypairs` -> `validator_keypairs` in the `BeaconChainHarness` ## Testing To confirm that the fix worked, I wrote a test using [Hiatus](https://crates.io/crates/hiatus), which can be found here: https://github.com/michaelsproul/lighthouse/tree/hiatus-issue-1557 That test can't be merged because it inserts random breakpoints everywhere, but if you check out that branch you can run the test with: ``` $ cd beacon_node/beacon_chain $ cargo test --release --test parallel_tests --features test_logger ``` It should pass, and the log output should show: ``` WARN Pruning deferred because of a concurrent mutation, message: this is expected only very rarely! ``` ## Additional Info This is a backwards-compatible change with no impact on consensus.	2020-10-19 05:58:39 +00:00
Pawan Dhananjay	97be2ca295	Simulator and attestation service fixes (#1747 ) ## Issue Addressed #1729 #1730 Which issue # does this PR address? ## Proposed Changes 1. Fixes a bug in the simulator where nodes can't find each other due to 0 udp ports in their enr. 2. Fixes bugs in attestation service where we are unsubscribing from a subnet prematurely. More testing is needed for attestation service fixes.	2020-10-15 07:11:31 +00:00
blacktemplar	8248afa793	Updates the message-id according to the Networking Spec (#1752 ) ## Proposed Changes Implement the new message id function (see https://github.com/ethereum/eth2.0-specs/pull/2089) using an additional fast message id function for better performance + caching decompressed data.	2020-10-14 06:51:58 +00:00
Paul Hauner	0e4cc50262	Remove unused deps	2020-10-09 15:58:20 +11:00
Paul Hauner	db3e0578e9	Merge branch 'v0.3.0-staging' into v3-master	2020-10-09 15:27:08 +11:00
Paul Hauner	da44821e39	Clean up obsolete TODOs (#1734 ) Squashed commit of the following: commit f99373cbaec9adb2bdbae3f7e903284327962083 Author: Age Manning <Age@AgeManning.com> Date: Mon Oct 5 18:44:09 2020 +1100 Clean up obsolete TODOs	2020-10-05 21:08:14 +11:00
Paul Hauner	ee7c8a0b7e	Update external deps (#1711 ) ## Issue Addressed - Resolves #1706 ## Proposed Changes Updates dependencies across the workspace. Any crate that was not able to be brought to the latest version is listed in #1712. ## Additional Info NA	2020-10-05 08:22:19 +00:00
Age Manning	240181e840	Upgrade discovery and restructure task execution (#1693 ) * Initial rebase * Remove old code * Correct release tests * Rebase commit * Remove eth2-testnet dep on eth2libp2p * Remove crates lost in rebase * Remove unused dep	2020-10-05 18:45:54 +11:00
Age Manning	bcb629564a	Improve error handling in network processing (#1654 ) * Improve error handling in network processing * Cargo fmt * Cargo fmt * Improve error handling for prior genesis * Remove dep	2020-10-05 17:34:56 +11:00
divma	113758a4f5	From panic to crit (#1726 ) ## Issue Addressed Downgrade inconsistent chain segment states from `panic` to `crit`. I don't love this solution but since range can always bounce back from any of those, we don't panic. Co-authored-by: Age Manning <Age@AgeManning.com>	2020-10-05 17:34:49 +11:00
divma	6997776494	Sync fixes (#1716 ) ## Issue Addressed chain state inconsistencies ## Proposed Changes - a batch can be fake-failed by Range if it needs to move a peer to another chain. The peer will still send blocks/ errors / produce timeouts for those requests, so check when we get a response from the RPC that the request id matches, instead of only the peer, since a re-request can be directed to the same peer. - if an optimistic batch succeeds, store the attempt to avoid trying it again when quickly switching chains. Also, use it only if ahead of our current target, instead of the segment's start epoch	2020-10-05 17:33:36 +11:00
Paul Hauner	e7eb99cb5e	Use Drop impl to send worker idle message (#1718 ) ## Issue Addressed NA ## Proposed Changes Uses a `Drop` implementation to help ensure that `BeaconProcessor` workers are freed. This will help prevent against regression, if someone happens to add an early return and it will also help in the case of a panic. ## Additional Info NA	2020-10-05 17:33:25 +11:00
Age Manning	fe07a3c21c	Improve error handling in network processing (#1654 ) * Improve error handling in network processing * Cargo fmt * Cargo fmt * Improve error handling for prior genesis * Remove dep	2020-10-05 17:30:43 +11:00
divma	b1c121b880	From panic to crit (#1726 ) ## Issue Addressed Downgrade inconsistent chain segment states from `panic` to `crit`. I don't love this solution but since range can always bounce back from any of those, we don't panic. Co-authored-by: Age Manning <Age@AgeManning.com>	2020-10-05 04:02:09 +00:00
divma	86a18e72c4	Sync fixes (#1716 ) ## Issue Addressed chain state inconsistencies ## Proposed Changes - a batch can be fake-failed by Range if it needs to move a peer to another chain. The peer will still send blocks/ errors / produce timeouts for those requests, so check when we get a response from the RPC that the request id matches, instead of only the peer, since a re-request can be directed to the same peer. - if an optimistic batch succeeds, store the attempt to avoid trying it again when quickly switching chains. Also, use it only if ahead of our current target, instead of the segment's start epoch	2020-10-04 23:49:14 +00:00
Paul Hauner	d72c026d32	Use Drop impl to send worker idle message (#1718 ) ## Issue Addressed NA ## Proposed Changes Uses a `Drop` implementation to help ensure that `BeaconProcessor` workers are freed. This will help prevent against regression, if someone happens to add an early return and it will also help in the case of a panic. ## Additional Info NA	2020-10-04 21:59:20 +00:00
Sean	6af3bc9ce2	Add UPnP support for Lighthouse (#1587 ) This commit was modified by Paul H whilst rebasing master onto v0.3.0-staging Adding UPnP support will help grow the DHT by allowing NAT traversal for peers with UPnP supported routers. Using IGD library: https://docs.rs/igd/0.10.0/igd/ Adding the libp2p tcp port and discovery udp port. If this fails it simply logs the attempt and moves on Co-authored-by: Age Manning <Age@AgeManning.com>	2020-10-03 10:07:47 +10:00
realbigsean	255cc25623	Weak subjectivity start from genesis (#1675 ) This commit was edited by Paul H when rebasing from master to v0.3.0-staging. Solution 2 proposed here: https://github.com/sigp/lighthouse/issues/1435#issuecomment-692317639 - Adds an optional `--wss-checkpoint` flag that takes a string `root:epoch` - Verify that the given checkpoint exists in the chain, or that the chain syncs through this checkpoint. If not, shutdown and prompt the user to purge state before restarting. Co-authored-by: Paul Hauner <paul@paulhauner.com>	2020-10-03 10:00:28 +10:00
Sean	94b17ce02b	Add UPnP support for Lighthouse (#1587 ) Adding UPnP support will help grow the DHT by allowing NAT traversal for peers with UPnP supported routers. ## Issue Addressed #927 ## Proposed Changes Using IGD library: https://docs.rs/igd/0.10.0/igd/ Adding the libp2p tcp port and discovery udp port. If this fails it simply logs the attempt and moves on ## Additional Info Co-authored-by: Age Manning <Age@AgeManning.com>	2020-10-02 08:47:00 +00:00
realbigsean	9d2d6239cd	Weak subjectivity start from genesis (#1675 ) ## Issue Addressed Solution 2 proposed here: https://github.com/sigp/lighthouse/issues/1435#issuecomment-692317639 ## Proposed Changes - Adds an optional `--wss-checkpoint` flag that takes a string `root:epoch` - Verify that the given checkpoint exists in the chain, or that the chain syncs through this checkpoint. If not, shutdown and prompt the user to purge state before restarting. ## Additional Info Co-authored-by: Paul Hauner <paul@paulhauner.com>	2020-10-01 01:41:58 +00:00
Michael Sproul	22aedda1be	Add database schema versioning (#1688 ) ## Issue Addressed Closes #673 ## Proposed Changes Store a schema version in the database so that future releases can check they're running against a compatible database version. This would also enable automatic migration on breaking database changes, but that's left as future work. The database config is also stored in the database so that the `slots_per_restore_point` value can be checked for consistency, which closes #673	2020-10-01 11:12:36 +10:00
Paul Hauner	cdec3cec18	Implement standard eth2.0 API (#1569 ) - Resolves #1550 - Resolves #824 - Resolves #825 - Resolves #1131 - Resolves #1411 - Resolves #1256 - Resolve #1177 - Includes the `ShufflingId` struct initially defined in #1492. That PR is now closed and the changes are included here, with significant bug fixes. - Implement the https://github.com/ethereum/eth2.0-APIs in a new `http_api` crate using `warp`. This replaces the `rest_api` crate. - Add a new `common/eth2` crate which provides a wrapper around `reqwest`, providing the HTTP client that is used by the validator client and for testing. This replaces the `common/remote_beacon_node` crate. - Create a `http_metrics` crate which is a dedicated server for Prometheus metrics (they are no longer served on the same port as the REST API). We now have flags for `--metrics`, `--metrics-address`, etc. - Allow the `subnet_id` to be an optional parameter for `VerifiedUnaggregatedAttestation::verify`. This means it does not need to be provided unnecessarily by the validator client. - Move `fn map_attestation_committee` in `mod beacon_chain::attestation_verification` to a new `fn with_committee_cache` on the `BeaconChain` so the same cache can be used for obtaining validator duties. - Add some other helpers to `BeaconChain` to assist with common API duties (e.g., `block_root_at_slot`, `head_beacon_block_root`). - Change the `NaiveAggregationPool` so it can index attestations by `hash_tree_root(attestation.data)`. This is a requirement of the API. - Add functions to `BeaconChainHarness` to allow it to create slashings and exits. - Allow for `eth1::Eth1NetworkId` to go to/from a `String`. - Add functions to the `OperationPool` to allow getting all objects in the pool. - Add function to `BeaconState` to check if a committee cache is initialized. - Fix bug where `seconds_per_eth1_block` was not transferring over from `YamlConfig` to `ChainSpec`. - Add the `deposit_contract_address` to `YamlConfig` and `ChainSpec`. 
We needed to be able to return it in an API response. - Change some uses of serde `serialize_with` and `deserialize_with` to a single use of `with` (code quality). - Impl `Display` and `FromStr` for several BLS fields. - Check for clock discrepancy when VC polls BN for sync state (with +/- 1 slot tolerance). This is not intended to be comprehensive, it was just easy to do. - See #1434 for a per-endpoint overview. - Seeking clarity here: https://github.com/ethereum/eth2.0-APIs/issues/75 - [x] Add docs for prom port to close #1256 - [x] Follow up on this #1177 - [x] ~~Follow up with #1424~~ Will fix in future PR. - [x] Follow up with #1411 - [x] ~~Follow up with #1260~~ Will fix in future PR. - [x] Add quotes to all integers. - [x] Remove `rest_types` - [x] Address missing beacon block error. (#1629) - [x] ~~Add tests for lighthouse/peers endpoints~~ Wontfix - [x] ~~Follow up with validator status proposal~~ Tracked in #1434 - [x] Unify graffiti structs - [x] ~~Start server when waiting for genesis?~~ Will fix in future PR. - [x] TODO in http_api tests - [x] Move lighthouse endpoints off /eth/v1 - [x] Update docs to link to standard - ~~Blocked on #1586~~ Co-authored-by: Michael Sproul <michael@sigmaprime.io>	2020-10-01 11:12:36 +10:00
blacktemplar	ae28773965	Networking bug fixes (#1684 ) * call correct unsubscribe method for subnets * correctly delegate closed connections in behaviour * correct unsubscribe method name	2020-09-29 18:28:15 +10:00
Paul Hauner	1ef4f0ea12	Add gossip conditions from spec v0.12.3 (#1667 ) ## Issue Addressed NA ## Proposed Changes There are four new conditions introduced in v0.12.3: 1. _[REJECT]_ The attestation's epoch matches its target -- i.e. `attestation.data.target.epoch == compute_epoch_at_slot(attestation.data.slot)` 1. _[REJECT]_ The attestation's target block is an ancestor of the block named in the LMD vote -- i.e. `get_ancestor(store, attestation.data.beacon_block_root, compute_start_slot_at_epoch(attestation.data.target.epoch)) == attestation.data.target.root` 1. _[REJECT]_ The committee index is within the expected range -- i.e. `data.index < get_committee_count_per_slot(state, data.target.epoch)`. 1. _[REJECT]_ The number of aggregation bits matches the committee size -- i.e. `len(attestation.aggregation_bits) == len(get_beacon_committee(state, data.slot, data.index))`. This PR implements new logic to suit (1) and (2). Tests are added for (3) and (4), although they were already implicitly enforced. ## Additional Info - There's a bit of edge-case with target root verification that I raised here: https://github.com/ethereum/eth2.0-specs/pull/2001#issuecomment-699246659 - I've had to add an `--ignore` to `cargo audit` to get CI to pass. See https://github.com/sigp/lighthouse/issues/1669	2020-09-27 20:59:40 +00:00
divma	b8013b7b2c	Super Silky Smooth Syncs, like a Sir (#1628 ) ## Issue Addressed In principle.. closes #1551 but in general are improvements for performance, maintainability and readability. The logic for the optimistic sync is actually simple ## Proposed Changes There are miscellaneous things here: - Remove unnecessary `BatchProcessResult::Partial` to simplify the batch validation logic - Make batches a state machine. This is done to ensure batch state transitions respect our logic (this was previously done by moving batches between `Vec`s) and to ease the cognitive load of the `SyncingChain` struct - Move most batch-related logic to the batch - Remove `PendingBatches` in favor of a map of peers to their batches. This is to avoid duplicating peers inside the chain (peer_pool and pending_batches) - Add `must_use` decoration to the `ProcessingResult` so that chains that request to be removed are handled accordingly. This also means that chains are now removed in more places than before to account for unhandled cases - Store batches in a sorted map (`BTreeMap`). Access is not O(1), but since the number of _active_ batches is bounded this should be fast, and saves performing hashing ops. Batches are indexed by the epoch they start. Sorted, to easily handle chain advancements (range logic) - Produce the chain Id from the identifying fields: target root and target slot. This, to guarantee there can't be duplicated chains and be able to consistently search chains by either Id or checkpoint - Fix chain_id not being present in all chain loggers - Handle mega-edge case where the processor's work queue is full and the batch can't be sent. In this case the chain would lose the blocks, remain in a "syncing" state and waiting for a result that won't arrive, effectively stalling sync. - When a batch imports blocks or the chain starts syncing with a local finalized epoch greater than the chain's start epoch, the chain is advanced instead of reset. 
This is to avoid losing download progress and validate batches faster. This also means that the old `start_epoch` now means "current first unvalidated batch", so it represents more accurately the progress of the chain. - Batch status peers from the same chain to reduce Arc access. - Handle a couple of cases where the retry counters for a batch were not updated/checked are now handled via the batch state machine. Basically now if we forget to do it, we will know. - Do not send back the blocks from the processor to the batch. Instead register the attempt before sending the blocks (does not count as failed) - When re-requesting a batch, try to avoid not only the last failed peer, but all previous failed peers. - Optimize requesting batches ahead in the buffer by shuffling idle peers just once (this is just addressing a couple of old TODOs in the code) - In chain_collection, store chains by their id in a map - Include a mapping from request_ids to (chain, batch) that requested the batch to avoid the double O(n) search on block responses - Other stuff: - impl `slog::KV` for batches - impl `slog::KV` for syncing chains - PSA: when logging, we can use `%thing` if `thing` implements `Display`. Same for `?` and `Debug` ### Optimistic syncing: Try first the batch that contains the current head, if the batch imports any block, advance the chain. If not, if this optimistic batch is inside the current processing window leave it there for future use, if not drop it. The tolerance for this block is the same for downloading, but just once for processing Co-authored-by: Age Manning <Age@AgeManning.com>	2020-09-23 06:29:55 +00:00
Age Manning	80e52a0263	Subscribe to core topics after sync (#1613 ) ## Issue Addressed N/A ## Proposed Changes Prevent subscribing to core gossipsub topics until after we have achieved a full sync. This prevents us censoring gossipsub channels, getting penalised in gossipsub 1.1 scoring and saves us computation time in attempting to validate gossipsub messages which we will be unable to do with a non-sync'd chain.	2020-09-23 03:26:33 +00:00
Pawan Dhananjay	a97ec318c4	Subscribe to subnets an epoch in advance (#1600 ) ## Issue Addressed N/A ## Proposed Changes Subscribe to subnet an epoch in advance of the attestation slot instead of 4 slots in advance.	2020-09-22 07:29:34 +00:00
Pawan Dhananjay	14ff38539c	Add trusted peers (#1640 ) ## Issue Addressed Closes #1581 ## Proposed Changes Adds a new cli option for trusted peers who always have the maximum possible score.	2020-09-22 01:12:36 +00:00
Age Manning	1db8daae0c	Shift metadata to the global network variables (#1631 ) ## Issue Addressed N/A ## Proposed Changes Shifts the local `metadata` to `network_globals` making it accessible to the HTTP API and other areas of lighthouse. ## Additional Info N/A	2020-09-21 02:00:38 +00:00
Age Manning	c9596fcf0e	Temporary Sync Work-Around (#1615 ) ## Issue Addressed #1590 ## Proposed Changes This is a temporary workaround that prevents finalized chain sync from swapping chains. I'm merging this in now until the full solution is ready.	2020-09-13 23:58:49 +00:00
Age Manning	c6abc56113	Prevent large step-size parameters (#1583 ) ## Issue Addressed Malicious users could request very large block ranges, more than we expect. Although technically legal, we are now quadratically weighting large step sizes in the filter. Therefore users may request large skips, but not a large number of blocks, to prevent requests forcing us to do long chain lookups. ## Proposed Changes Weight the step parameter in the RPC filter and prevent any overflows that affect us in the step parameter. ## Additional Info	2020-09-11 02:33:36 +00:00
blacktemplar	7f1b936905	ignore too early / too late attestations instead of penalizing them (#1608 ) ## Issue Addressed NA ## Proposed Changes This ignores attestations that are too early or too late as it is specified in the spec (see https://github.com/ethereum/eth2.0-specs/blob/v0.12.1/specs/phase0/p2p-interface.md#global-topics first subpoint of `beacon_aggregate_and_proof`)	2020-09-11 01:43:15 +00:00
Age Manning	b19cf02d2d	Penalise bad peer behaviour (#1602 ) ## Issue Addressed #1386 ## Proposed Changes Penalises peers in our scoring system that produce invalid attestations or blocks.	2020-09-10 03:51:06 +00:00
Age Manning	fb9d828e5e	Extended Gossipsub metrics (#1577 ) ## Issue Addressed N/A ## Proposed Changes Adds extended metrics to get a better idea of what is happening at the gossipsub layer of lighthouse. This provides information about mesh statistics per topics, subscriptions and peer scores. ## Additional Info	2020-09-01 06:59:14 +00:00
blacktemplar	c18d37c202	Use Gossipsub 1.1 (#1516 ) ## Issue Addressed #1172 ## Proposed Changes * updates the libp2p dependency * small adaptions based on changes in libp2p * report not just valid messages but also invalid and distinguish between `IGNORE`d messages and `REJECT`ed messages Co-authored-by: Age Manning <Age@AgeManning.com>	2020-08-30 13:06:50 +00:00
Adam Szkoda	d9f4819fe0	Alternative (to BeaconChainHarness) BeaconChain testing API (#1380 ) The PR: * Adds the ability to generate a crucial test scenario that isn't possible with `BeaconChainHarness` (i.e. two blocks occupying the same slot; previously forks necessitated skipping slots): ![image](https://user-images.githubusercontent.com/165678/88195404-4bce3580-cc40-11ea-8c08-b48d2e1d5959.png) * New testing API: Instead of repeatedly calling add_block(), you generate a sorted `Vec<Slot>` and leave it up to the framework to generate blocks at those slots. * Jumping backwards to an earlier epoch is a hard error, so that tests necessarily generate blocks in an epoch-by-epoch manner. * Configures the test logger so that output is printed on the console in case a test fails. The logger also plays well with `--nocapture`, contrary to the existing testing framework * Rewrites existing fork pruning tests to use the new API * Adds a test that triggers finalization at a non epoch boundary slot * Renamed `BeaconChainYoke` to `BeaconChainTestingRig` because the former has been too confusing * Fixed multiple tests (e.g. `block_production_different_shuffling_long`, `delete_blocks_and_states`, `shuffling_compatible_simple_fork`) that relied on a weird (and accidental) feature of the old `BeaconChainHarness` that attestations aren't produced for epochs earlier than the current one, thus masking potential bugs in test cases. Co-authored-by: Michael Sproul <michael@sigmaprime.io>	2020-08-26 09:24:55 +00:00
Pawan Dhananjay	bbed42f30c	Refactor attestation service (#1415 ) ## Issue Addressed N/A ## Proposed Changes Refactor attestation service to send out requests to find peers for subnets as soon as we get attestation duties. Earlier, we had much more involved logic to send the discovery requests to the discovery service only 6 slots before the attestation slot. Now that discovery is much smarter with grouped queries, the complexity in attestation service can be reduced considerably. Co-authored-by: Age Manning <Age@AgeManning.com>	2020-08-19 08:46:25 +00:00
divma	fdc6e2aa8e	Shutdown like a Sir (#1545 ) ## Issue Addressed #1494 ## Proposed Changes - Give the TaskExecutor the sender side of a channel that a task can clone to request shutting down - The receiver side of this channel is in environment and now we block until ctrl+c or an internal shutdown signal is received - The swarm now informs when it has reached 0 listeners - The network receives this message and requests the shutdown	2020-08-19 05:51:14 +00:00
Paul Hauner	8e7dd7b2b1	Add remaining network ops to queuing system (#1546 ) ## Issue Addressed NA ## Proposed Changes - Refactors the `BeaconProcessor` to remove some excessive nesting and file bloat - Sorry about the noise from this, it's all contained in 4d3f8c5 though. - Adds exits, proposer slashings, attester slashings to the `BeaconProcessor` so we don't get overwhelmed with large amounts of slashings (which happened a few hours ago). ## Additional Info NA	2020-08-19 05:09:53 +00:00
Age Manning	2d0b214b57	Clean up logs (#1541 ) ## Description This PR improves some logging for the end-user. It downgrades some warning logs and removes the slots per second sync speed if we are syncing and the speed is 0. This is likely because we are syncing from a finalised checkpoint and the head doesn't change.	2020-08-18 08:11:39 +00:00
Age Manning	8311074d68	Purge out-dated head chains on chain completion (#1538 ) ## Description There can be many head chains queued up to complete. Currently we try and process all of these to completion before we consider the node synced. In a chaotic network, there can be many of these and processing them to completion can be very expensive and slow. This PR removes any non-syncing head chains from the queue, and re-status's the peers. If, after we have synced to head on one chain, there is still a valid head chain to download, it will be re-established once the status has been returned. This should assist with getting nodes to sync on medalla faster.	2020-08-18 05:22:34 +00:00
Age Manning	3bb30754d9	Keep track of failed head chains and prevent re-lookups (#1534 ) ## Overview There are forked chains which get referenced by blocks and attestations on a network. Typically if these chains are very long, we stop looking up the chain and downvote the peer. In extreme circumstances, many peers are on many chains, the chains can be very deep and become time consuming performing lookups. This PR adds a cache to known failed chain lookups. This prevents us from starting a parent-lookup (or stopping one half way through) if we have attempted the chain lookup in the past.	2020-08-18 03:54:09 +00:00
Age Manning	cc44a64d15	Limit parallelism of head chain sync (#1527 ) ## Description Currently lighthouse load-balances across peers a single finalized chain. The chain is selected via the most peers. Once synced to the latest finalized epoch Lighthouse creates chains amongst its peers and syncs them all in parallel amongst each peer (grouped by their current head block). This is typically fast and relatively efficient under normal operations. However if the chain has not finalized in a long time, the head chains can grow quite long. Peers' head chains will update every slot as new blocks are added to the head. Syncing all head chains in parallel is a bottleneck, and the highly inefficient block duplication leads to RPC timeouts when attempting to handle all new head chains at once. This PR limits the parallelism of head syncing chains to 2. We now sync at most two head chains at a time. This allows for the possibility of sync progressing alongside a peer being slow and holding up one chain via RPC timeouts.	2020-08-18 02:49:24 +00:00
divma	46dbf027af	Do not reset batch ids & redownload out of range batches (#1528 ) The changes are somewhat simple but should solve two issues: - When quickly changing between chains once and a second time back again, batchIds would collide and cause havoc. - If we got an out of range response from a peer, sync would remain in syncing but without advancing Changes: - remove the batch id. Identify each batch (inside a chain) by its starting epoch. Target epochs for downloading and processing now advance by EPOCHS_PER_BATCH - for the same reason, move the "to_be_downloaded_id" to be an epoch - remove a sneaky line that dropped an out of range batch without downloading it - bonus: put the chain_id in the log given to the chain. This is why explicitly logging the chain_id is removed	2020-08-18 01:29:51 +00:00
Michael Sproul	719a69aee0	Ignore blocks that skip a large distance from their parent (#1530 ) ## Proposed Changes To mitigate the impact of minority forks on RAM and disk usage, this change rejects blocks whose parent lies more than 320 slots (10 epochs, ~1 hour) in the past. The behaviour is configurable via `lighthouse bn --max-skip-slots N`, and can be turned off entirely using `--max-skip-slots none`. Co-authored-by: Paul Hauner <paul@paulhauner.com>	2020-08-17 10:54:58 +00:00
Paul Hauner	f85485884f	Process gossip blocks on the GossipProcessor (#1523 ) ## Issue Addressed NA ## Proposed Changes Moves beacon block processing over to the newly-added `GossipProcessor`. This moves the task off the core executor onto the blocking one. ## Additional Info - With this PR, gossip blocks are being ignored during sync.	2020-08-17 09:20:27 +00:00
Age Manning	afdc4fea1d	Correct logic for peer sync identification (#1525 ) Fix a small sync bug which can mis-classify newly connected peers.	2020-08-17 03:00:10 +00:00


323 Commits