lighthouse

Author	SHA1	Message	Date
Michael Sproul	3b61ac9cbf	Optimise slasher DB layout and switch to MDBX (#2776 ) ## Issue Addressed Closes #2286 Closes #2538 Closes #2342 ## Proposed Changes Part II of major slasher optimisations after #2767 These changes will be backwards-incompatible due to the move to MDBX (and the schema change) 😱 * [x] Shrink attester keys from 16 bytes to 7 bytes. * [x] Shrink attester records from 64 bytes to 6 bytes. * [x] Separate `DiskConfig` from regular `Config`. * [x] Add configuration for the LRU cache size. * [x] Add a "migration" that deletes any legacy LMDB database.	2021-12-21 08:23:17 +00:00
Michael Sproul	a290a3c537	Add configurable block replayer (#2863 ) ## Issue Addressed Successor to #2431 ## Proposed Changes * Add a `BlockReplayer` struct to abstract over the intricacies of calling `per_slot_processing` and `per_block_processing` while avoiding unnecessary tree hashing. * Add a variant of the forwards state root iterator that does not require an `end_state`. * Use the `BlockReplayer` when reconstructing states in the database. Use the efficient forwards iterator for frozen states. * Refactor the iterators to remove `Arc<HotColdDB>` (this seems to be neater than making _everything_ an `Arc<HotColdDB>` as I did in #2431). Supplying the state roots allows us to avoid building a tree hash cache at all when reconstructing historic states, which saves around 1 second flat (regardless of `slots-per-restore-point`). This is a small percentage of worst-case state load times with 200K validators and SPRP=2048 (~15s vs ~16s) but a significant speed-up for more frequent restore points: state loads with SPRP=32 should now be consistently <500ms instead of 1.5s (a ~3x speedup). ## Additional Info Required by https://github.com/sigp/lighthouse/pull/2628	2021-12-21 06:30:52 +00:00
Divma	56d596ee42	Unban peers at the swarm level when purged (#2855 ) ## Issue Addressed #2840	2021-12-20 23:45:21 +00:00
eklm	9be3d4ecac	Downgrade AttestationStateIsFinalized error to debug (#2866 ) ## Issue Addressed #2834 ## Proposed Changes Change log message severity from error to debug in attestation verification when attestation state is finalized.	2021-12-17 07:59:46 +00:00
Divma	eee0260a68	do not count dialing peers in the connection limit (#2856 ) ## Issue Addressed #2841 ## Proposed Changes Not counting dialing peers while deciding if we have reached the target peers in case of outbound peers. ## Additional Info Checked this running in nodes and bandwidth looks normal, peer count looks normal too	2021-12-15 05:48:45 +00:00
Michael Sproul	a43d5e161f	Optimise balances cache in case of skipped slots (#2849 ) ## Proposed Changes Remove the `is_first_block_in_epoch` logic from the balances cache update logic, as it was incorrect in the case of skipped slots. The updated code is simpler because regardless of whether the block is the first in the epoch we can check if an entry for the epoch boundary root already exists in the cache, and update the cache accordingly. Additionally, to assist with flip-flopping justified epochs, move to cloning the balance cache rather than moving it. This should still be very fast in practice because the balances cache is a ~1.6MB `Vec`, and this operation is expected to only occur infrequently.	2021-12-13 23:35:57 +00:00
realbigsean	b22ac95d7f	v1.1.6 Fork Choice changes (#2822 ) ## Issue Addressed Resolves: https://github.com/sigp/lighthouse/issues/2741 Includes: https://github.com/sigp/lighthouse/pull/2853 so that we can get ssz static tests passing here on v1.1.6. If we want to merge that first, we can make this diff slightly smaller ## Proposed Changes - Changes the `justified_epoch` and `finalized_epoch` in the `ProtoArrayNode` each to an `Option<Checkpoint>`. The `Option` is necessary only for the migration, so not ideal. But does allow us to add a default logic to `None` on these fields during the database migration. - Adds a database migration from a legacy fork choice struct to the new one, search for all necessary block roots in fork choice by iterating through blocks in the db. - updates related to https://github.com/ethereum/consensus-specs/pull/2727 - We will have to update the persisted forkchoice to make sure the justified checkpoint stored is correct according to the updated fork choice logic. This boils down to setting the forkchoice store's justified checkpoint to the justified checkpoint of the block that advanced the finalized checkpoint to the current one. - AFAICT there's no migration steps necessary for the update to allow applying attestations from prior blocks, but would appreciate confirmation on that - I updated the consensus spec tests to v1.1.6 here, but they will fail until we also implement the proposer score boost updates. I confirmed that the previously failing scenario `new_finalized_slot_is_justified_checkpoint_ancestor` will now pass after the boost updates, but haven't confirmed _all_ tests will pass because I just quickly stubbed out the proposer boost test scenario formatting. 
- This PR now also includes proposer boosting https://github.com/ethereum/consensus-specs/pull/2730 ## Additional Info I realized checking justified and finalized roots in fork choice makes it more likely that we trigger this bug: https://github.com/ethereum/consensus-specs/pull/2727 It's possible the combination of justified checkpoint and finalized checkpoint in the forkchoice store is different from in any block in fork choice. So when trying to start up, our store's justified checkpoint seems invalid to the rest of fork choice (but it should be valid). When this happens we get an `InvalidBestNode` error and fail to start up. So I'm including that bugfix in this branch. Todo: - [x] Fix fork choice tests - [x] Self review - [x] Add fix for https://github.com/ethereum/consensus-specs/pull/2727 - [x] Rebase onto Kintsugi - [x] Fix `num_active_validators` calculation as @michaelsproul pointed out - [x] Clean up db migrations Co-authored-by: realbigsean <seananderson33@gmail.com>	2021-12-13 20:43:22 +00:00
Pawan Dhananjay	e391b32858	Merge devnet 3 (#2859 ) ## Issue Addressed N/A ## Proposed Changes Changes required for the `merge-devnet-3`. Added some more non substantive renames on top of @realbigsean 's commit. Note: this doesn't include the proposer boosting changes in kintsugi v3. This devnet isn't running with the proposer boosting fork choice changes so if we are looking to merge https://github.com/sigp/lighthouse/pull/2822 into `unstable`, then I think we should just maintain this branch for the devnet temporarily. Co-authored-by: realbigsean <seananderson33@gmail.com> Co-authored-by: Paul Hauner <paul@paulhauner.com>	2021-12-12 09:04:21 +00:00
Lion - dapplion	2984f4b474	Remove wrong duplicated comment (#2751 ) ## Issue Addressed Remove wrong duplicated comment. Comment was copied from ban_peer() but doesn't apply to unban_peer()	2021-12-06 05:34:15 +00:00
Mac L	a7a7edb6cf	Optimise snapshot cache for late blocks (#2832 ) ## Proposed Changes In the event of a late block, keep the block in the snapshot cache by cloning it. This helps us process new blocks quickly in the event the late block was re-org'd. Co-authored-by: Michael Sproul <michael@sigmaprime.io>	2021-12-06 03:41:31 +00:00
realbigsean	b5f2764bae	fix cache miss justified balances calculation (#2852 ) ## Issue Addressed We were calculating justified balances incorrectly on cache misses in `set_justified_checkpoint` ## Proposed Changes Use the `get_effective_balances` method as opposed to `state.balances`, which returns exact balances Co-authored-by: realbigsean <seananderson33@gmail.com>	2021-12-03 16:58:10 +00:00
realbigsean	a80ccc3a33	1.57.0 lints (#2850 ) ## Issue Addressed New rust lints ## Proposed Changes - Boxing some enum variants - removing some unused fields (is the validator lockfile unused? seemed so to me) ## Additional Info - some error fields were marked as dead code but are logged out in areas - left some dead fields in our ef test code because I assume they are useful for debugging? Co-authored-by: realbigsean <seananderson33@gmail.com>	2021-12-03 04:44:30 +00:00
Pawan Dhananjay	f3c237cfa0	Restrict network limits based on merge fork epoch (#2839 )	2021-12-02 14:32:31 +11:00
Paul Hauner	144978f8f8	Remove duplicate slot_clock method (#2842 )	2021-12-02 14:29:59 +11:00
Paul Hauner	94385fe17b	Support legacy data directories (#2846 )	2021-12-02 14:29:59 +11:00
Paul Hauner	ab86b42874	Kintsugi Diva comments (#2836 ) * Remove TODOs * Fix typo	2021-12-02 14:29:59 +11:00
ethDreamer	c2f2813385	Cleanup Comments & Fix get_pow_block_hash_at_ttd() (#2835 )	2021-12-02 14:29:59 +11:00
Paul Hauner	1b56ebf85e	Kintsugi review comments (#2831 ) * Fix makefile * Return on invalid finalized block * Fix todo in gossip scoring * Require --merge for --fee-recipient * Bump eth2_serde_utils * Change schema versions * Swap hash/uint256 test_random impls * Use default for ExecutionPayload::empty * Check for DBs before removing * Remove kintsugi docker image * Fix CLI default value	2021-12-02 14:29:59 +11:00
Paul Hauner	82a81524e3	Bump crate versions (#2829 )	2021-12-02 14:29:57 +11:00
ethDreamer	f6748537db	Removed PowBlock struct that never got used (#2813 )	2021-12-02 14:29:20 +11:00
Paul Hauner	5f0fef2d1e	Kintsugi on_merge_block tests (#2811 ) * Start v1.1.5 updates * Implement new payload creation logic * Tidy, add comments * Remove unused error enums * Add validate payload for gossip * Refactor validate_merge_block * Split payload verification in per block processing * Add execute_payload * Tidy * Tidy * Start working on new fork choice tests * Fix failing merge block test * Skip block_lookup_failed test * Fix failing terminal block test * Fixes from self-review * Address review comments	2021-12-02 14:29:20 +11:00
pawan	44a7b37ce3	Increase network limits (#2796 ) Fix max packet sizes Fix max_payload_size function Add merge block test Fix max size calculation; fix up test Clear comments Add a payload_size_function Use safe arith for payload calculation Return an error if block too big in block production Separate test to check if block is over limit	2021-12-02 14:29:20 +11:00
Paul Hauner	afe59afacd	Ensure difficulty/hash/epoch overrides change the `ChainSpec` (#2798 ) * Unify loading of eth2_network_config * Apply overrides at lighthouse binary level * Remove duplicate override values * Add merge values to existing net configs * Make override flags global * Add merge fields to testing config * Add one to TTD * Fix failing engine tests * Fix test compile error * Remove TTD flags * Move get_eth2_network_config * Fix warn * Address review comments	2021-12-02 14:29:18 +11:00
Paul Hauner	47db682d7e	Implement engine API v1.0.0-alpha.4 (#2810 ) * Added ForkchoiceUpdatedV1 & GetPayloadV1 * Added ExecutePayloadV1 * Added new geth test vectors * Separated Json Object/Serialization Code into file * Deleted code/tests for Requests Removed from spec * Finally fixed serialization of null '0x' * Made Naming of JSON Structs Consistent * Fix clippy lints * Remove u64 payload id * Remove unused serde impls * Swap to [u8; 8] for payload id * Tidy * Adjust some block gen return vals * Tidy * Add fallback when payload id is unknown * Remove comment Co-authored-by: Mark Mackey <mark@sigmaprime.io>	2021-12-02 14:26:55 +11:00
Paul Hauner	cdfd1304a5	Skip memory intensive engine test (#2809 ) * Allocate less memory (3GB) in engine tests * Run cargo format * Remove tx too large test Co-authored-by: Michael Sproul <michael@sigmaprime.io>	2021-12-02 14:26:55 +11:00
Paul Hauner	cbd2201164	Fixes after rebasing Kintsugi onto unstable (#2799 ) * Fix fork choice after rebase * Remove paulhauner warp dep * Fix fork choice test compile errors * Assume fork choice payloads are valid * Add comment * Ignore new tests * Fix error in test skipping	2021-12-02 14:26:55 +11:00
Pawan Dhananjay	24966c059d	Fix Uint256 deserialization (#2786 ) * Change base_fee_per_gas to Uint256 * Add custom (de)serialization to ExecutionPayload * Fix errors * Add a quoted_u256 module * Remove unused function * lint * Add test * Remove extra line Co-authored-by: Paul Hauner <paul@paulhauner.com>	2021-12-02 14:26:55 +11:00
realbigsean	de49c7ddaa	1.1.5 merge spec tests (#2781 ) * Fix arbitrary check kintsugi * Add merge chain spec fields, and a function to determine which constant to use based on the state variant * increment spec test version * Remove `Transaction` enum wrapper * Remove Transaction new-type * Remove gas validations * Add `--terminal-block-hash-epoch-override` flag * Increment spec tests version to 1.1.5 * Remove extraneous gossip verification https://github.com/ethereum/consensus-specs/pull/2687 * - Remove unused Error variants - Require both "terminal-block-hash-epoch-override" and "terminal-block-hash-override" when either flag is used * - Remove a couple more unused Error variants Co-authored-by: Paul Hauner <paul@paulhauner.com>	2021-12-02 14:26:55 +11:00
Paul Hauner	86e0c56a38	Kintsugi rebase patches (#2769 ) * Freshen Cargo.lock * Fix gossip worker * Update map_fork_name_with	2021-12-02 14:26:54 +11:00
Paul Hauner	6b4cc63b57	Accept TTD override as decimal (#2676 )	2021-12-02 14:26:54 +11:00
realbigsean	d8eec16c5e	v1.1.1 spec updates (#2684 ) * update initializing from eth1 for merge genesis * read execution payload header from file lcli * add `create-payload-header` command to `lcli` * fix base fee parsing * Apply suggestions from code review * default `execution_payload_header` bool to false when deserializing `meta.yml` in EF tests Co-authored-by: Paul Hauner <paul@paulhauner.com>	2021-12-02 14:26:54 +11:00
Paul Hauner	6dde12f311	[Merge] Optimistic Sync: Stage 1 (#2686 ) * Add payload verification status to fork choice * Pass payload verification status to import_block * Add valid back-propagation * Add head safety status latch to API * Remove ExecutionLayerStatus * Add execution info to client notifier * Update notifier logs * Change use of "hash" to refer to beacon block * Shutdown on invalid finalized block * Tidy, add comments * Fix failing FC tests * Allow blocks with unsafe head * Fix forkchoiceUpdate call on startup	2021-12-02 14:26:54 +11:00
Pawan Dhananjay	aa1d57aa55	Fix db paths when datadir is relative (#2682 )	2021-12-02 14:26:53 +11:00
Paul Hauner	67a6f91df6	[Merge] Optimistic EL verification (#2683 ) * Ignore payload errors * Only return payload handle on valid response * Push some engine logs down to debug * Push ee fork choice log to debug * Push engine call failure to debug * Push some more errors to debug * Fix panic at startup	2021-12-02 14:26:53 +11:00
Paul Hauner	35350dff75	[Merge] Block validator duties when EL is not ready (#2672 ) * Reject some HTTP endpoints when EL is not ready * Restrict more endpoints * Add watchdog task * Change scheduling * Update to new schedule * Add "syncing" concept * Remove RequireSynced * Add is_merge_complete to head_info * Cache latest_head in Engines * Call consensus_forkchoiceUpdate on startup	2021-12-02 14:26:53 +11:00
Paul Hauner	d6fda44620	Disable notifier logging from dummy eth1 backend (#2680 )	2021-12-02 14:26:53 +11:00
ethDreamer	52e5083502	Fixed bugs for m3 readiness (#2669 ) * Fixed bugs for m3 readiness * woops * cargo fmt..	2021-12-02 14:26:53 +11:00
Paul Hauner	b162b067de	Misc changes for merge testnets (#2667 ) * Thread eth1_block_hash into interop genesis state * Add merge-fork-epoch flag * Build LH with minimal spec by default * Add verbose logs to execution_layer * Add --http-allow-sync-stalled flag * Update lcli new-testnet to create genesis state * Fix http test * Fix compile errors in tests	2021-12-02 14:26:52 +11:00
Paul Hauner	a1033a9247	Add `BeaconChainHarness` tests for The Merge (#2661 ) * Start adding merge tests * Expose MockExecutionLayer * Add mock_execution_layer to BeaconChainHarness * Progress with merge test * Return more detailed errors with gas limit issues * Use a better gas limit in block gen * Ensure TTD is met in block gen * Fix basic_merge tests * Start geth testing * Fix conflicts after rebase * Remove geth tests * Improve merge test * Address clippy lints * Make pow block gen a pure function * Add working new test, breaking existing test * Fix test names * Add should_panic * Don't run merge tests in debug * Detect a tokio runtime when starting MockServer * Fix clippy lint, include merge tests	2021-12-02 14:26:52 +11:00
Paul Hauner	801f6f7425	Disable autotests for beacon_chain (#2658 )	2021-12-02 14:26:52 +11:00
Paul Hauner	01031931d9	[Merge] Add execution API test vectors from Geth (#2651 ) * Add geth request vectors * Add geth response vectors * Fix clippy lints	2021-12-02 14:26:52 +11:00
Paul Hauner	20ca7a56ed	[Merge] Add serde impls for `Transactions` type (#2649 ) * Start implemented serde for transactions * Revise serde impl * Add tests for transaction decoding	2021-12-02 14:26:51 +11:00
Paul Hauner	d8623cfc4f	[Merge] Implement `execution_layer` (#2635 ) * Checkout serde_utils from rayonism * Make eth1::http functions pub * Add bones of execution_layer * Modify decoding * Expose Transaction, cargo fmt * Add executePayload * Add all minimal spec endpoints * Start adding json rpc wrapper * Finish custom JSON response handler * Switch to new rpc sending method * Add first test * Fix camelCase * Finish adding tests * Begin threading execution layer into BeaconChain * Fix clippy lints * Fix clippy lints * Thread execution layer into ClientBuilder * Add CLI flags * Add block processing methods to ExecutionLayer * Add block_on to execution_layer * Integrate execute_payload * Add extra_data field * Begin implementing payload handle * Send consensus valid/invalid messages * Fix minor type in task_executor * Call forkchoiceUpdated * Add search for TTD block * Thread TTD into execution layer * Allow producing block with execution payload * Add LRU cache for execution blocks * Remove duplicate 0x on ssz_types serialization * Add tests for block getter methods * Add basic block generator impl * Add is_valid_terminal_block to EL * Verify merge block in block_verification * Partially implement --terminal-block-hash-override * Add terminal_block_hash to ChainSpec * Remove Option from terminal_block_hash in EL * Revert merge changes to consensus/fork_choice * Remove commented-out code * Add bones for handling RPC methods on test server * Add first ExecutionLayer tests * Add testing for finding terminal block * Prevent infinite loops * Add insert_merge_block to block gen * Add block gen test for pos blocks * Start adding payloads to block gen * Fix clippy lints * Add execution payload to block gen * Add execute_payload to block_gen * Refactor block gen * Add all routes to mock server * Use Uint256 for base_fee_per_gas * Add working execution chain build * Remove unused var * Revert "Use Uint256 for base_fee_per_gas" This reverts commit 
6c88f19ac45db834dd4dbf7a3c6e7242c1c0f735. * Fix base_fee_for_gas Uint256 * Update execute payload handle * Improve testing, fix bugs * Fix default fee-recipient * Fix fee-recipient address (again) * Add check for terminal block, add comments, tidy * Apply suggestions from code review Co-authored-by: realbigsean <seananderson33@GMAIL.com> * Fix is_none on handle Drop * Remove commented-out tests Co-authored-by: realbigsean <seananderson33@GMAIL.com>	2021-12-02 14:26:51 +11:00
ethDreamer	1563bce905	Finished Gossip Block Validation Conditions (#2640 ) * Gossip Block Validation is Much More Efficient Co-authored-by: realbigsean <seananderson33@gmail.com>	2021-12-02 14:26:51 +11:00
realbigsean	aa534f8989	Store execution block hash in fork choice (#2643 ) * - Update the fork choice `ProtoNode` to include `is_merge_complete` - Add database migration for the persisted fork choice * update tests * Small cleanup * lints * store execution block hash in fork choice rather than bool	2021-12-02 14:26:51 +11:00
Paul Hauner	c10e8ce955	Fix clippy lints on merge-f2f (#2626 ) * Remove unchecked arith from ssz_derive * Address clippy lints in block_verfication * Use safe math for is_valid_gas_limit	2021-12-02 14:26:50 +11:00
Mark Mackey	5687c56d51	Initial merge changes Added Execution Payload from Rayonism Fork Updated new Containers to match Merge Spec Updated BeaconBlockBody for Merge Spec Completed updating BeaconState and BeaconBlockBody Modified ExecutionPayload<T> to use Transaction<T> Mostly Finished Changes for beacon-chain.md Added some things for fork-choice.md Update to match new fork-choice.md/fork.md changes ran cargo fmt Added Missing Pieces in eth2_libp2p for Merge fix ef test Various Changes to Conform Closer to Merge Spec	2021-12-02 14:26:50 +11:00
Mac L	fe75a0a9a1	Add background file logging (#2762 ) ## Issue Addressed Closes #1996 ## Proposed Changes Run a second `Logger` via `sloggers` which logs to a file in the background with: - separate `debug-level` for background and terminal logging - the ability to limit log size - rotation through a customizable number of log files - an option to compress old log files (`.gz` format) Add the following new CLI flags: - `--logfile-debug-level`: The debug level of the log files - `--logfile-max-size`: The maximum size of each log file - `--logfile-max-number`: The number of old log files to store - `--logfile-compress`: Whether to compress old log files By default background logging uses the `debug` log level and saves logfiles to: - Beacon Node: `$HOME/.lighthouse/$network/beacon/logs/beacon.log` - Validator Client: `$HOME/.lighthouse/$network/validators/logs/validator.log` Or, when using the `--datadir` flag: `$datadir/beacon/logs/beacon.log` and `$datadir/validators/logs/validator.log` Once rotated, old logs are stored like so: `beacon.log.1`, `beacon.log.2` etc. > Note: `beacon.log.1` is always newer than `beacon.log.2`. ## Additional Info Currently the default value of `--logfile-max-size` is 200 (MB) and `--logfile-max-number` is 5. This means that the maximum storage space that the logs will take up by default is 1.2GB. (200MB x 5 from old log files + <200MB the current logfile being written to) Happy to adjust these default values to whatever people think is appropriate. It's also worth noting that when logging to a file, we lose our custom `slog` formatting. This means the logfile logs look like this: ``` Oct 27 16:02:50.305 INFO Lighthouse started, version: Lighthouse/v2.0.1-8edd9d4+, module: lighthouse:413 Oct 27 16:02:50.305 INFO Configured for network, name: prater, module: lighthouse:414 ```	2021-11-30 03:25:32 +00:00
Age Manning	6625aa4afe	Status'd Peer Not Found (#2761 ) ## Issue Addressed Users are experiencing `Status'd peer not found` errors ## Proposed Changes Although I cannot reproduce this error, there is only one connection state change that is not addressed in the peer manager (that I could see). The error occurs because the number of disconnected peers in the peerdb becomes out of sync with the actual number of disconnected peers. From what I can tell almost all possible connection state changes are handled, except for the case when a disconnected peer changes to be disconnecting. This can potentially happen at the peer connection limit, where a previously connected peer switches to disconnecting. This PR decrements the disconnected counter when this event occurs and from what I can tell, covers all possible disconnection state changes in the peer manager.	2021-11-28 22:46:17 +00:00
Divma	413b0b5b2b	Correctly update range status when outdated chains are removed (#2827 ) We were batch removing chains when purging, and then updating the status of the collection for each of those. This makes the range status be out of sync with the real status. This represented no harm to the global sync status, but I've changed it to comply with a correct debug assertion that I got triggered while doing some testing. Also added tests and improved code quality as per @paulhauner 's suggestions.	2021-11-26 01:13:49 +00:00
Pawan Dhananjay	9eedb6b888	Allow additional subnet peers (#2823 ) ## Issue Addressed N/A ## Proposed Changes 1. Don't disconnect peer from dht on connection limit errors 2. Bump up `PRIORITY_PEER_EXCESS` to allow for dialing up to 60 peers by default. Co-authored-by: Diva M <divma@protonmail.com>	2021-11-25 21:27:08 +00:00
Michael Sproul	2c07a72980	Revert peer DB changes from #2724 (#2828 ) ## Proposed Changes This reverts commit `53562010ec` from PR #2724 Hopefully this will restore the reliability of the sync simulator.	2021-11-25 03:45:52 +00:00
Age Manning	0b319d4926	Inform dialing via the behaviour (#2814 ) I had this change but it seems to have been lost in the chaos of network upgrades. The swarm dialing event seems to miss some cases where we dial via the behaviour. This causes an error to be logged as the peer manager doesn't know about some dialing events. This shifts the logic to the behaviour to inform the peer manager.	2021-11-19 04:42:33 +00:00
Divma	53562010ec	Move peer db writes to eth2 libp2p (#2724 ) ## Issue Addressed Part of a bigger effort to make the network globals read only. This moves all writes to the `PeerDB` to the `eth2_libp2p` crate. Limiting writes to the peer manager is a slightly more complicated issue for a next PR, to keep things reviewable. ## Proposed Changes - Make the peers field in the globals a private field. - Allow mutable access to the peers field to `eth2_libp2p` for now. - Add a new network message to update the sync state. Co-authored-by: Age Manning <Age@AgeManning.com>	2021-11-19 04:42:31 +00:00
Divma	31386277c3	Sync wrong dbg assertion (#2821 ) ## Issue Addressed Running a beacon node I triggered a sync debug panic. And so finally the time to create tests for sync arrived. Fortunately, the bug was not in the sync algorithm itself but a wrong assertion. ## Proposed Changes - Split Range's impl from the BeaconChain via a trait. This is needed for testing. The TestingRig/Harness is way bigger than needed and does not provide the modification functionalities that are needed to test sync. I find this simpler, though some could disagree. - Add a regression test for sync that fails before the changes. - Fix the wrong assertion.	2021-11-19 02:38:25 +00:00
Age Manning	e519af9012	Update Lighthouse Dependencies (#2818 ) ## Issue Addressed Updates lighthouse dependencies to resolve audit issues in outdated deps.	2021-11-18 05:08:42 +00:00
Pawan Dhananjay	e32c09bfda	Fix decoding max length (#2816 ) ## Issue Addressed N/A ## Proposed Changes Fix encoder max length to the correct value (`MAX_RPC_SIZE`).	2021-11-16 22:23:39 +00:00
Age Manning	a43a2448b7	Investigate and correct RPC Response Timeouts (#2804 ) RPC Responses are for some reason not removing their timeout when they are completing. As an example: ``` Nov 09 01:18:20.256 DEBG Received BlocksByRange Request step: 1, start_slot: 728465, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:20.263 DEBG Received BlocksByRange Request step: 1, start_slot: 728593, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:20.483 DEBG BlocksByRange Response sent returned: 63, requested: 64, current_slot: 2466389, start_slot: 728465, msg: Failed to return all requested blocks, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:20.500 DEBG BlocksByRange Response sent returned: 64, requested: 64, current_slot: 2466389, start_slot: 728593, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:21.068 DEBG Received BlocksByRange Request step: 1, start_slot: 728529, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:21.272 DEBG BlocksByRange Response sent returned: 63, requested: 64, current_slot: 2466389, start_slot: 728529, msg: Failed to return all requested blocks, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:23.434 DEBG Received BlocksByRange Request step: 1, start_slot: 728657, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:23.665 DEBG BlocksByRange Response sent returned: 64, requested: 64, current_slot: 2466390, start_slot: 728657, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:25.851 DEBG Received BlocksByRange Request step: 1, start_slot: 728337, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:25.851 DEBG Received BlocksByRange Request step: 1, start_slot: 728401, count: 64, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:26.094 DEBG 
BlocksByRange Response sent returned: 62, requested: 64, current_slot: 2466390, start_slot: 728401, msg: Failed to return all requested blocks, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:26.100 DEBG BlocksByRange Response sent returned: 63, requested: 64, current_slot: 2466390, start_slot: 728337, msg: Failed to return all requested blocks, peer: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw Nov 09 01:18:31.070 DEBG RPC Error direction: Incoming, score: 0, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, client: Prysm: version: a80b1c252a9b4773493b41999769bf3134ac373f, os_version: unknown, err: Stream Timeout, protocol: beacon_blocks_by_range, service: libp2p Nov 09 01:18:31.070 WARN Timed out to a peer's request. Likely insufficient resources, reduce peer count, service: libp2p Nov 09 01:18:31.085 DEBG RPC Error direction: Incoming, score: 0, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, client: Prysm: version: a80b1c252a9b4773493b41999769bf3134ac373f, os_version: unknown, err: Stream Timeout, protocol: beacon_blocks_by_range, service: libp2p Nov 09 01:18:31.085 WARN Timed out to a peer's request. Likely insufficient resources, reduce peer count, service: libp2p Nov 09 01:18:31.459 DEBG RPC Error direction: Incoming, score: 0, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, client: Prysm: version: a80b1c252a9b4773493b41999769bf3134ac373f, os_version: unknown, err: Stream Timeout, protocol: beacon_blocks_by_range, service: libp2p Nov 09 01:18:31.459 WARN Timed out to a peer's request. 
Likely insufficient resources, reduce peer count, service: libp2p Nov 09 01:18:34.129 DEBG RPC Error direction: Incoming, score: 0, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, client: Prysm: version: a80b1c252a9b4773493b41999769bf3134ac373f, os_version: unknown, err: Stream Timeout, protocol: beacon_blocks_by_range, service: libp2p Nov 09 01:18:34.130 WARN Timed out to a peer's request. Likely insufficient resources, reduce peer count, service: libp2p Nov 09 01:18:35.686 DEBG Peer Manager disconnecting peer reason: Too many peers, peer_id: 16Uiu2HAmEmBURejquBUMgKAqxViNoPnSptTWLA2CfgSPnnKENBNw, service: libp2p ``` This PR is to investigate and correct the issue. ~~My current thoughts are that for some reason we are not closing the streams correctly, or fast enough, or the executor is not registering the closes and waking up.~~ - Pretty sure this is not the case, see message below for a more accurate reason. ~~I've currently added a timeout to stream closures in an attempt to force streams to close and the future to always complete.~~ I removed this	2021-11-16 03:42:25 +00:00
Paul Hauner	931daa40d7	Add fork choice EF tests (#2737 ) ## Issue Addressed Resolves #2545 ## Proposed Changes Adds the long-overdue EF tests for fork choice. Although we had pretty good coverage via other implementations that closely followed our approach, it is nonetheless important for us to implement these tests too. During testing I found that we were using a hard-coded `SAFE_SLOTS_TO_UPDATE_JUSTIFIED` value rather than one from the `ChainSpec`. This caused a failure during a minimal preset test. This doesn't represent a risk to mainnet or testnets, since the hard-coded value matched the mainnet preset. ## Failing Cases There is one failing case which is presently marked as `SkippedKnownFailure`: ``` case 4 ("new_finalized_slot_is_justified_checkpoint_ancestor") from /home/paul/development/lighthouse/testing/ef_tests/consensus-spec-tests/tests/minimal/phase0/fork_choice/on_block/pyspec_tests/new_finalized_slot_is_justified_checkpoint_ancestor failed with NotEqual: head check failed: Got Head { slot: Slot(40), root: 0x9183dbaed4191a862bd307d476e687277fc08469fc38618699863333487703e7 } \| Expected Head { slot: Slot(24), root: 0x105b49b51bf7103c182aa58860b039550a89c05a4675992e2af703bd02c84570 } ``` This failure is due to #2741. It's not a particularly high priority issue at the moment, so we will fix it after merging this PR.	2021-11-08 07:29:04 +00:00
Divma	fbafe416d1	Move the peer manager to be a behaviour (#2773 ) This simply moves some functions that were "swarm notifications" to a network behaviour implementation. Notes ------ - We could disconnect from the peer manager but we would lose the rpc shutdown message - We still notify from the swarm since this is the most reliable way to get some events. Ugly but best for now - Events need to be pushed with "add event" to wake the waker Co-authored-by: Divma <26765164+divagant-martian@users.noreply.github.com>	2021-11-08 00:01:10 +00:00
Michael Sproul	df02639b71	De-duplicate attestations in the slasher (#2767 ) ## Issue Addressed Closes https://github.com/sigp/lighthouse/issues/2112 Closes https://github.com/sigp/lighthouse/issues/1861 ## Proposed Changes Collect attestations by validator index in the slasher, and use the magic of reference counting to automatically discard redundant attestations. This results in us storing only 1-2% of the attestations observed when subscribed to all subnets, which carries over to a 50-100x reduction in data stored 🎉 ## Additional Info There's some nuance to the configuration of the `slot-offset`. It has a profound effect on the effectiveness of de-duplication, see the docs added to the book for an explanation: `5442e695e5/book/src/slasher.md (slot-offset)`	2021-11-08 00:01:09 +00:00
Divma	a683e0296a	Peer manager cfg (#2766 ) ## Issue Addressed I've done this change in a couple of WIPs already so I might as well submit it on its own. This changes no functionality but reduces coupling in a 0.0001%. It also helps new people who need to work in the peer manager to better understand what it actually needs from the outside ## Proposed Changes Add a config to the peer manager	2021-11-03 23:44:44 +00:00
Divma	7502970a7d	Do not compute metrics in the network service if the cli flag is not set (#2765 ) ## Issue Addressed The computation of metrics in the network service can be expensive. This disables the computation unless the cli flag `metrics` is set. ## Additional Info Metrics in other parts of the network are still updated, since most are simple metrics and checking if metrics are enabled each time each metric is updated doesn't seem like a gain.	2021-11-03 00:06:03 +00:00
realbigsean	c4ad0e3fb3	Ensure dependent root consistency in head events (#2753 ) ## Issue Addressed @paulhauner noticed that when we send head events, we use the block root from `new_head` in `fork_choice_internal`, but calculate `dependent_root` and `previous_dependent_root` using the `canonical_head`. This is normally fine because `new_head` updates the `canonical_head` in `fork_choice_internal`, but it's possible we have a reorg updating `canonical_head` before our head events are sent. So this PR ensures `dependent_root` and `previous_dependent_root` are always derived from the state associated with `new_head`. Co-authored-by: realbigsean <seananderson33@gmail.com>	2021-11-02 02:26:32 +00:00
Pawan Dhananjay	4499adc7fd	Check proposer index during block production (#2740 ) ## Issue Addressed Resolves #2612 ## Proposed Changes Implements both the checks mentioned in the original issue. 1. Verifies the `randao_reveal` in the beacon node 2. Cross checks the proposer index after getting back the block from the beacon node. ## Additional info The block production time increases by ~10x because of the signature verification on the beacon node (based on the `beacon_block_production_process_seconds` metric) when running on a local testnet.	2021-11-01 07:44:40 +00:00
Michael Sproul	ffb04e1a9e	Add op pool metrics for attestations (#2758 ) ## Proposed Changes Add several metrics for the number of attestations in the op pool. These give us a way to observe the number of valid, non-trivial attestations during block packing rather than just the size of the entire op pool.	2021-11-01 05:52:31 +00:00
Divma	e2c0650d16	Relax late sync committee penalty (#2752 ) ## Issue Addressed Getting too many peers kicked due to slightly late sync committee messages as tested on.. under-performant hardware. ## Proposed Changes Only penalize if the message is more than one slot late. Still ignore the message. Co-authored-by: Divma <26765164+divagant-martian@users.noreply.github.com>	2021-10-31 22:30:19 +00:00
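The rule described in #2752 can be sketched as a small classification function. This is an illustrative sketch only: `LateMessageAction` and `classify_sync_message` are hypothetical names, and lateness is measured here as a plain slot difference, which is an assumption and not necessarily how Lighthouse measures it.

```rust
/// Hypothetical outcome of checking a sync committee message's timing.
#[derive(Debug, PartialEq, Eq)]
pub enum LateMessageAction {
    /// The message is for the current slot (or newer) and is processed.
    Accept,
    /// The message is late and dropped, but the sender is not penalized.
    Ignore,
    /// The message is more than one slot late: dropped, and the sender is penalized.
    IgnoreAndPenalize,
}

/// Classify a message by how many whole slots it is behind `current_slot`.
/// Assumption: lateness is simply `current_slot - message_slot`.
pub fn classify_sync_message(message_slot: u64, current_slot: u64) -> LateMessageAction {
    let slots_late = current_slot.saturating_sub(message_slot);
    match slots_late {
        0 => LateMessageAction::Accept,
        1 => LateMessageAction::Ignore,
        _ => LateMessageAction::IgnoreAndPenalize,
    }
}
```

A message exactly one slot late is still ignored, but the peer's score is left alone; only larger delays trigger a penalty.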
Age Manning	1790010260	Upgrade to latest libp2p (#2605 ) This is a pre-cursor to the next libp2p upgrade. It is currently being used for staging a number of PR upgrades which are contingent on the latest libp2p.	2021-10-29 01:59:29 +00:00
ethDreamer	2c4413454a	Fixed Gossip Topics on Fork Boundary (#2619 ) ## Issue Addressed The [p2p-interface section of the `altair` spec](https://github.com/ethereum/consensus-specs/blob/dev/specs/altair/p2p-interface.md#transitioning-the-gossip) says you should subscribe to the topics for a fork "In advance of the fork" and unsubscribe from old topics `2 Epochs` after the new fork is activated. We've chosen to subscribe to new fork topics `2 slots` before the fork is initiated. This function is supposed to return the required fork digests at any given time but as it was currently written, it doesn't return the fork digest for a previous fork if you've switched to the current fork less than 2 epochs ago. Also this function required modification for every new fork we add. ## Proposed Changes Make this function fork-agnostic and correctly handle the previous fork topic digests when you've only just switched to the new fork.	2021-10-29 00:05:27 +00:00
Pawan Dhananjay	88063398f6	Prevent double import of blocks (#2647 ) ## Issue Addressed Resolves #2611 ## Proposed Changes Adds a duplicate block root cache to the `BeaconProcessor`. Adds the block root to the cache before calling `process_gossip_block` and `process_rpc_block`. Since `process_rpc_block` is called only for single block lookups, we don't have to worry about batched block imports. The block is imported from the source (gossip/rpc) that arrives first. The block that arrives second is not imported to avoid the db access issue. There are 2 cases: 1. Block that arrives second is from rpc: In this case, we return an optimistic `BlockError::BlockIsAlreadyKnown` to sync. 2. Block that arrives second is from gossip: In this case, we only do gossip verification and forwarding but don't import the block into the beacon chain. ## Additional info Splits up `process_gossip_block` function to `process_gossip_unverified_block` and `process_gossip_verified_block`.	2021-10-28 03:36:14 +00:00
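The two cases in #2647 can be illustrated with a minimal set-based cache. This is a sketch under stated assumptions, not the `BeaconProcessor` code: `DuplicateCache`, `Source`, `Outcome` and `on_block` are hypothetical names, and a 32-byte array stands in for the block root type.

```rust
use std::collections::HashSet;

/// Stand-in for a block root (assumption: 32 raw bytes).
pub type BlockRoot = [u8; 32];

/// Hypothetical cache of block roots that are currently being processed.
#[derive(Default)]
pub struct DuplicateCache {
    inner: HashSet<BlockRoot>,
}

impl DuplicateCache {
    /// Returns `true` if `root` was not present (this caller arrived first).
    pub fn check_and_insert(&mut self, root: BlockRoot) -> bool {
        self.inner.insert(root)
    }

    /// Forget `root` once processing has finished.
    pub fn remove(&mut self, root: &BlockRoot) -> bool {
        self.inner.remove(root)
    }
}

#[derive(Debug, PartialEq, Eq)]
pub enum Source {
    Gossip,
    Rpc,
}

#[derive(Debug, PartialEq, Eq)]
pub enum Outcome {
    /// First arrival: import the block.
    Import,
    /// Second arrival via RPC: report the block as already known to sync.
    AlreadyKnown,
    /// Second arrival via gossip: verify and forward, but do not import.
    VerifyAndForwardOnly,
}

/// Decide what to do with a block based on arrival order and source.
pub fn on_block(cache: &mut DuplicateCache, root: BlockRoot, source: Source) -> Outcome {
    if cache.check_and_insert(root) {
        Outcome::Import
    } else {
        match source {
            Source::Rpc => Outcome::AlreadyKnown,
            Source::Gossip => Outcome::VerifyAndForwardOnly,
        }
    }
}
```

The root is inserted before any processing starts, so whichever source reaches the cache first wins the import.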
Michael Sproul	2dc6163043	Add API version headers and `map_fork_name!` (#2745 ) ## Proposed Changes * Add the `Eth-Consensus-Version` header to the HTTP API for the block and state endpoints. This is part of the v2.1.0 API that was recently released: https://github.com/ethereum/beacon-APIs/pull/170 * Add tests for the above. I refactored the `eth2` crate's helper functions to make this more straight-forward, and introduced some new mixin traits that I think greatly improve readability and flexibility. * Add a new `map_with_fork!` macro which is useful for decoding a superstruct type without naming all its variants. It is now used for SSZ-decoding `BeaconBlock` and `BeaconState`, and for JSON-decoding `SignedBeaconBlock` in the API. ## Additional Info The `map_with_fork!` changes will conflict with the Merge changes, but when resolving the conflict the changes from this branch should be preferred (it is no longer necessary to enumerate every fork). The merge fork _will_ need to be added to `map_fork_name_with`.	2021-10-28 01:18:04 +00:00
Mac L	8edd9d45ab	Fix purge-db edge case (#2747 ) ## Issue Addressed Currently, if you launch the beacon node with the `--purge-db` flag and the `beacon` directory exists, but one (or both) of the `chain_db` or `freezer_db` directories are missing, it will error unnecessarily with: ``` Failed to remove chain_db: No such file or directory (os error 2) ``` This is an edge case which can occur in cases of manual intervention (a user deleted the directory) or if you had previously run with the `--purge-db` flag and Lighthouse errored before it could initialize the db directories. ## Proposed Changes Check if the `chain_db`/`freezer_db` exists before attempting to remove them. This prevents unnecessary errors.	2021-10-25 22:11:28 +00:00
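The fix in #2747 amounts to an existence check before removal. A minimal sketch using only the standard library follows; `remove_dir_if_exists` is a hypothetical helper name, not a function from the Lighthouse codebase.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Remove `path` recursively if it exists.
/// Returns `Ok(true)` if something was removed and `Ok(false)` if the
/// directory was already missing, so a missing directory is not an error.
pub fn remove_dir_if_exists(path: &Path) -> io::Result<bool> {
    if path.exists() {
        fs::remove_dir_all(path)?;
        Ok(true)
    } else {
        Ok(false)
    }
}
```

Calling this once for `chain_db` and once for `freezer_db` would avoid the "No such file or directory" error when either directory is absent.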
Divma	d4819bfd42	Add a waker to the RPC handler (#2721 ) ## Issue Addressed Attempts to fix #2701 but I doubt this is the reason behind that. ## Proposed Changes maintain a waker in the rpc handler and call it if an event is received	2021-10-21 06:14:36 +00:00
Pawan Dhananjay	de34001e78	Update `next_fork_subscriptions` correctly (#2688 ) ## Issue Addressed N/A ## Proposed Changes Update the `next_fork_subscriptions` timer only after a fork happens.	2021-10-21 04:38:44 +00:00
divma	99f7a7db58	remove double backfill sync state (#2733 ) ## Issue Addressed In the backfill sync the state was maintained twice, once locally and also in the globals. This makes it so that it's maintained only once. The only behavioral change is that when backfill sync is paused, the global backfill state is updated. I asked @AgeManning about this and he deemed it a bug, so this solves it.	2021-10-19 22:32:25 +00:00
Michael Sproul	aad397f00a	Resolve Rust 1.56 lints and warnings (#2728 ) ## Issue Addressed When compiling with Rust 1.56.0 the compiler generates 3 instances of this warning: ``` warning: trailing semicolon in macro used in expression position --> common/eth2_network_config/src/lib.rs:181:24 \| 181 \| })?; \| ^ ... 195 \| let deposit_contract_deploy_block = load_from_file!(DEPLOY_BLOCK_FILE); \| ---------------------------------- in this macro invocation \| = note: `#[warn(semicolon_in_expressions_from_macros)]` on by default = warning: this was previously accepted by the compiler but is being phased out; it will become a hard error in a future release! = note: for more information, see issue #79813 <https://github.com/rust-lang/rust/issues/79813> = note: this warning originates in the macro `load_from_file` (in Nightly builds, run with -Z macro-backtrace for more info) ``` This warning is completely harmless, but will be visible to users compiling Lighthouse v2.0.1 (or earlier) with Rust 1.56.0 (to be released October 21st). It is completely safe to ignore this warning, it's just a superficial change to Rust's syntax. ## Proposed Changes This PR removes the semi-colon as recommended, and fixes the new Clippy lints from 1.56.0	2021-10-19 00:30:42 +00:00
Akihito Nakano	efec60ee90	Tiny fix: wrong log level (#2720 ) ## Proposed Changes If the `RemoveChain` is critical log level should be crit. 🙂	2021-10-19 00:30:41 +00:00
Michael Sproul	d2e3d4c6f1	Add flag to disable lock timeouts (#2714 ) ## Issue Addressed Mitigates #1096 ## Proposed Changes Add a flag to the beacon node called `--disable-lock-timeouts` which allows opting out of lock timeouts. The lock timeouts serve a dual purpose: 1. They prevent any single operation from hogging the lock for too long. When a timeout occurs it logs a nasty error which indicates that there's suboptimal lock use occurring, which we can then act on. 2. They allow deadlock detection. We're fairly sure there are no deadlocks left in Lighthouse anymore but the timeout locks offer a safeguard against that. However, timeouts on locks are not without downsides: They allow for the possibility of livelock, particularly on slower hardware. If lock timeouts keep failing spuriously the node can be prevented from making any progress, even if it would be able to make progress slowly without the timeout. One particularly concerning scenario which could occur would be if a DoS attack succeeded in slowing block signature verification times across the network, and all Lighthouse nodes got livelocked because they timed out repeatedly. This could also occur on just a subset of nodes (e.g. dual core VPSs or Raspberry Pis). By making the behaviour runtime configurable this PR allows us to choose the behaviour we want depending on circumstance. I suspect that long term we could make the timeout-free approach the default (#2381 moves in this direction) and just enable the timeouts on our testnet nodes for debugging purposes. This PR conservatively leaves the default as-is so we can gain some more experience before switching the default.	2021-10-19 00:30:40 +00:00
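The trade-off described in #2714 can be sketched with a lock helper that either gives up after a deadline or blocks indefinitely. This is a std-only approximation that polls `try_lock`; it is an assumption made for illustration and not how Lighthouse implements its timed locks. `lock_with_optional_timeout` and its `enable_timeouts` parameter are hypothetical names.

```rust
use std::sync::{Mutex, MutexGuard, TryLockError};
use std::thread;
use std::time::{Duration, Instant};

/// Acquire `mutex`, optionally giving up after `timeout`.
/// With `enable_timeouts == false` this blocks until the lock is free,
/// which rules out spurious timeout failures on slow hardware.
/// Returns `None` only when timeouts are enabled and the deadline passes.
pub fn lock_with_optional_timeout<'a, T>(
    mutex: &'a Mutex<T>,
    timeout: Duration,
    enable_timeouts: bool,
) -> Option<MutexGuard<'a, T>> {
    if !enable_timeouts {
        // Blocking acquisition; a poisoned lock is recovered rather than failing.
        return Some(mutex.lock().unwrap_or_else(|e| e.into_inner()));
    }
    let deadline = Instant::now() + timeout;
    loop {
        match mutex.try_lock() {
            Ok(guard) => return Some(guard),
            Err(TryLockError::Poisoned(e)) => return Some(e.into_inner()),
            Err(TryLockError::WouldBlock) => {
                if Instant::now() >= deadline {
                    // Timed out: the caller can log an error and move on.
                    return None;
                }
                // Back off briefly before polling again.
                thread::sleep(Duration::from_micros(100));
            }
        }
    }
}
```

With timeouts enabled, a `None` result is the signal that something held the lock for too long; with them disabled, the caller trades that diagnostic for guaranteed (if slow) progress.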
Age Manning	df40700ddd	Rename eth2_libp2p to lighthouse_network (#2702 ) ## Description The `eth2_libp2p` crate was originally named and designed to incorporate a simple libp2p integration into lighthouse. Since its origins the crate's purpose has expanded dramatically. It now houses a lot more sophistication that is specific to lighthouse and no longer just a libp2p integration. As of this writing it currently houses the following high-level lighthouse-specific logic: - Lighthouse's implementation of the eth2 RPC protocol and specific encodings/decodings - Integration and handling of ENRs with respect to libp2p and eth2 - Lighthouse's discovery logic, its integration with discv5 and logic about searching and handling peers. - Lighthouse's peer manager - This is a large module handling various aspects of Lighthouse's network, such as peer scoring, handling pings and metadata, connection maintenance and recording, etc. - Lighthouse's peer database - This is a collection of information stored for each individual peer which is specific to lighthouse. We store connection state, sync state, last seen ips and scores etc. The data stored for each peer is designed for various elements of the lighthouse code base such as syncing and the http api. - Gossipsub scoring - This stores a collection of gossipsub 1.1 scoring mechanisms that are continuously analysed and updated based on the ethereum 2 networks and how Lighthouse performs on these networks. - Lighthouse specific types for managing gossipsub topics, sync status and ENR fields - Lighthouse's network HTTP API metrics - A collection of metrics for lighthouse network monitoring - Lighthouse's custom configuration of all networking protocols, RPC, gossipsub, discovery, identify and libp2p. Therefore it makes sense to rename the crate to be more akin to its current purposes, simply that it manages the majority of Lighthouse's network stack. 
This PR renames this crate to `lighthouse_network` Co-authored-by: Paul Hauner <paul@paulhauner.com>	2021-10-19 00:30:39 +00:00
Paul Hauner	fff01b24dd	Release v2.0.1 (#2726 ) ## Issue Addressed NA ## Proposed Changes - Update versions to `v2.0.1` in anticipation for a release early next week. - Add `--ignore` to `cargo audit`. See #2727. ## Additional Info NA	2021-10-18 03:08:32 +00:00
Age Manning	180c90bf6d	Correct peer connection transition logic (#2725 ) ## Description This PR updates the peer connection transition logic. It is acceptable for a peer to immediately transition from a disconnected state to a disconnecting state. This can occur when we are at our peer limit and a new peer dials us.	2021-10-17 04:04:36 +00:00
Paul Hauner	a7b675460d	Add Altair tests to op pool (#2723 ) ## Issue Addressed NA ## Proposed Changes Adds some more testing for Altair to the op pool. Credits to @michaelsproul for some appropriated efforts here. ## Additional Info NA Co-authored-by: Michael Sproul <michael@sigmaprime.io>	2021-10-16 05:07:23 +00:00
Michael Sproul	5cde3fc4da	Reduce lock contention in backfill sync (#2716 ) ## Proposed Changes Clone the proposer pubkeys during backfill signature verification to reduce the time that the pubkey cache lock is held for. Cloning such a small number of pubkeys has negligible impact on the total running time, but greatly reduces lock contention. On a Ryzen 5950X, the setup step seems to take around 180us regardless of whether the key is cloned or not, while the verification takes 7ms. When Lighthouse is limited to 10% of one core using `sudo cpulimit --pid <pid> --limit 10` the total time jumps up to 800ms, but the setup step remains only 250us. This means that under heavy load this PR could cut the time the lock is held for from 800ms to 250us, which is a huge saving of 99.97%!	2021-10-15 03:28:03 +00:00
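The idea in #2716 is to copy the few keys needed while the lock is held, then release the lock before the expensive verification step. A minimal sketch follows; `PubkeyCache`, `PublicKey` (a 48-byte array standing in for a BLS public key) and `clone_proposer_pubkeys` are hypothetical names chosen for illustration.

```rust
use std::sync::RwLock;

/// Stand-in for a public key (assumption: 48 raw bytes).
pub type PublicKey = [u8; 48];

/// Hypothetical cache mapping validator index to public key.
pub struct PubkeyCache {
    pub pubkeys: Vec<PublicKey>,
}

/// Copy the requested keys out of the cache, holding the read lock only
/// for the duration of the copy. Returns `None` if the lock is poisoned
/// or any index is unknown.
pub fn clone_proposer_pubkeys(
    cache: &RwLock<PubkeyCache>,
    proposer_indices: &[usize],
) -> Option<Vec<PublicKey>> {
    let guard = cache.read().ok()?;
    let keys = proposer_indices
        .iter()
        .map(|&i| guard.pubkeys.get(i).copied())
        .collect::<Option<Vec<_>>>();
    // Release the lock before the caller starts signature verification.
    drop(guard);
    keys
}
```

Signature verification then runs on the returned copies, so writers to the cache are no longer blocked for the full verification time.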
Paul Hauner	9c5a8ab7f2	Change "too many resources" to "insufficient resources" in eth2_libp2p (#2713 ) ## Issue Addressed NA ## Proposed Changes Fixes what I assume is a typo in a log message. See the diff for details. ## Additional Info NA	2021-10-15 00:07:12 +00:00
Age Manning	05040e68ec	Update discovery (#2711 ) ## Issue Addressed #2695 ## Proposed Changes This updates discovery to the latest version which has patched a panic that occurred due to a race condition in the bucket logic.	2021-10-14 22:09:38 +00:00
Paul Hauner	e2d09bb8ac	Add `BeaconChainHarness::builder` (#2707 ) ## Issue Addressed NA ## Proposed Changes This PR is near-identical to https://github.com/sigp/lighthouse/pull/2652, however it is to be merged into `unstable` instead of `merge-f2f`. Please see that PR for reasoning. I'm making this duplicate PR to merge to `unstable` in an effort to shrink the diff between `unstable` and `merge-f2f` by doing smaller, lead-up PRs. ## Additional Info NA	2021-10-14 02:58:10 +00:00
Pawan Dhananjay	34d22b5920	Reduce validator monitor logging verbosity (#2606 ) ## Issue Addressed Resolves #2541 ## Proposed Changes Reduces verbosity of validator monitor per epoch logging by batching info logs for multiple validators. Instead of a log for every validator managed by the validator monitor, we now batch logs for attestation records for previous epoch. Before: ```log Sep 20 06:53:08.239 INFO Previous epoch attestation success validator: 1, epoch: 65875, matched_head: true, matched_target: true, inclusion_lag: 0 slot(s), service: val_mon Sep 20 06:53:08.239 INFO Previous epoch attestation success validator: 2, epoch: 65875, matched_head: true, matched_target: true, inclusion_lag: 0 slot(s), service: val_mon Sep 20 06:53:08.239 INFO Previous epoch attestation success validator: 3, epoch: 65875, matched_head: true, matched_target: true, inclusion_lag: 0 slot(s), service: val_mon Sep 20 06:53:08.239 INFO Previous epoch attestation success validator: 4, epoch: 65875, matched_head: true, matched_target: true, inclusion_lag: 0 slot(s), service: val_mon Sep 20 06:53:08.239 INFO Previous epoch attestation success validator: 5, epoch: 65875, matched_head: false, matched_target: true, inclusion_lag: 0 slot(s), service: val_mon Sep 20 06:53:08.239 WARN Attestation failed to match head validator: 5, epoch: 65875, service: val_mon Sep 20 06:53:08.239 INFO Previous epoch attestation success validator: 6, epoch: 65875, matched_head: false, matched_target: true, inclusion_lag: 0 slot(s), service: val_mon Sep 20 06:53:08.239 WARN Attestation failed to match head validator: 6, epoch: 65875, service: val_mon Sep 20 06:53:08.239 INFO Previous epoch attestation success validator: 7, epoch: 65875, matched_head: true, matched_target: false, inclusion_lag: 1 slot(s), service: val_mon Sep 20 06:53:08.239 WARN Attestation failed to match target validator: 7, epoch: 65875, service: val_mon Sep 20 06:53:08.239 WARN Sub-optimal inclusion delay validator: 7, epoch: 65875, optimal: 
1, delay: 2, service: val_mon Sep 20 06:53:08.239 INFO Previous epoch attestation success validator: 8, epoch: 65875, matched_head: true, matched_target: false, inclusion_lag: 1 slot(s), service: val_mon Sep 20 06:53:08.239 WARN Attestation failed to match target validator: 8, epoch: 65875, service: val_mon Sep 20 06:53:08.239 WARN Sub-optimal inclusion delay validator: 8, epoch: 65875, optimal: 1, delay: 2, service: val_mon Sep 20 06:53:08.239 ERRO Previous epoch attestation missing validator: 9, epoch: 65875, service: val_mon Sep 20 06:53:08.239 ERRO Previous epoch attestation missing validator: 10, epoch: 65875, service: val_mon ``` after ``` Sep 20 06:53:08.239 INFO Previous epoch attestation success validators: [1,2,3,4,5,6,7,8,9] , epoch: 65875, service: val_mon Sep 20 06:53:08.239 WARN Previous epoch attestation failed to match head, validators: [5,6], epoch: 65875, service: val_mon Sep 20 06:53:08.239 WARN Previous epoch attestation failed to match target, validators: [7,8], epoch: 65875, service: val_mon Sep 20 06:53:08.239 WARN Previous epoch attestations had sub-optimal inclusion delay, validators: [7,8], epoch: 65875, service: val_mon Sep 20 06:53:08.239 ERRO Previous epoch attestation missing validators: [9,10], epoch: 65875, service: val_mon ``` The detailed individual logs are downgraded to debug logs.	2021-10-12 05:06:48 +00:00
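The batching described in #2606 can be sketched as grouping validator indices per condition and emitting one line per non-empty group. The names `EpochSummary`, `validators_where` and `batched_log_lines` are hypothetical, and the output is plain strings rather than real log records; the message wording only approximates the examples above.

```rust
/// Hypothetical per-validator summary for the previous epoch.
#[derive(Clone, Copy)]
pub struct EpochSummary {
    pub validator: u64,
    pub attested: bool,
    pub matched_head: bool,
    pub matched_target: bool,
}

/// Collect the indices of all validators whose summary satisfies `pred`.
fn validators_where(
    summaries: &[EpochSummary],
    pred: impl Fn(&EpochSummary) -> bool,
) -> Vec<u64> {
    summaries
        .iter()
        .filter(|s| pred(*s))
        .map(|s| s.validator)
        .collect()
}

/// Produce one log line per condition instead of one per validator.
/// Conditions with no matching validators produce no line.
pub fn batched_log_lines(epoch: u64, summaries: &[EpochSummary]) -> Vec<String> {
    let groups = vec![
        ("INFO", "Previous epoch attestation success",
         validators_where(summaries, |s| s.attested)),
        ("WARN", "Previous epoch attestation failed to match head",
         validators_where(summaries, |s| s.attested && !s.matched_head)),
        ("WARN", "Previous epoch attestation failed to match target",
         validators_where(summaries, |s| s.attested && !s.matched_target)),
        ("ERRO", "Previous epoch attestation missing",
         validators_where(summaries, |s| !s.attested)),
    ];
    groups
        .into_iter()
        .filter(|(_, _, validators)| !validators.is_empty())
        .map(|(level, msg, validators)| {
            format!("{} {}, validators: {:?}, epoch: {}", level, msg, validators, epoch)
        })
        .collect()
}
```

The number of lines now depends on the number of conditions rather than the number of monitored validators.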
Mac L	a73d698e30	Add TLS capability to the beacon node HTTP API (#2668 ) Currently, the beacon node has no ability to serve the HTTP API over TLS. Adding this functionality would be helpful for certain use cases, such as when you need a validator client to connect to a backup beacon node which is outside your local network, and the use of an SSH tunnel or reverse proxy would be inappropriate. ## Proposed Changes - Add three new CLI flags to the beacon node - `--http-enable-tls`: enables TLS - `--http-tls-cert`: to specify the path to the certificate file - `--http-tls-key`: to specify the path to the key file - Update the HTTP API to optionally use `warp`'s [`TlsServer`](https://docs.rs/warp/0.3.1/warp/struct.TlsServer.html) depending on the presence of the `--http-enable-tls` flag - Update tests and docs - Use a custom branch for `warp` to ensure proper error handling ## Additional Info Serving the API over TLS should currently be considered experimental. The reason for this is that it uses code from an [unmerged PR](https://github.com/seanmonstar/warp/pull/717). This commit provides the `try_bind_with_graceful_shutdown` method to `warp`, which is helpful for controlling error flow when the TLS configuration is invalid (cert/key files don't exist, incorrect permissions, etc). I've implemented the same code in my [branch here](https://github.com/macladson/warp/tree/tls). Once the code has been reviewed and merged upstream into `warp`, we can remove the dependency on my branch and the feature can be considered more stable. Currently, the private key file must not be password-protected in order to be read into Lighthouse.	2021-10-12 03:35:49 +00:00
Age Manning	0aee7ec873	Refactor Peerdb and PeerManager (#2660 ) ## Proposed Changes This is a refactor of the PeerDB and PeerManager. A number of bugs have been surfacing around the connection state of peers and their interaction with the score state. This refactor tightens the mutability properties of peers such that only specific modules are able to modify the state of peer information preventing inadvertent state changes that can lead to our local peer manager db being out of sync with libp2p. Further, the logic around connection and scoring was quite convoluted and the distinction between the PeerManager and Peerdb was not well defined. Although these issues are not fully resolved, this PR is a step towards cleaning up this logic. The peerdb solely manages most mutability operations of peers leaving high-order logic to the peer manager. A single `update_connection_state()` function has been added to the peer-db making it solely responsible for modifying the peer's connection state. The way the peer's scores can be modified has been reduced to three simple functions (`update_scores()`, `update_gossipsub_scores()` and `report_peer()`). This prevents any ad-hoc modifications of scores and only natural processes of score modification are allowed, which simplifies the reasoning of score and state changes.	2021-10-11 02:45:06 +00:00
Michael Sproul	708557a473	Fix cargo audit warns for nix, psutil, time (#2699 ) ## Issue Addressed Fix `cargo audit` failures on `unstable` Closes #2698 ## Proposed Changes The main culprit is `nix`, which is vulnerable for versions below v0.23.0. We can't get by with a straight-forward `cargo update` because `psutil` depends on an old version of `nix` (cf. https://github.com/rust-psutil/rust-psutil/pull/93). Hence I've temporarily forked `psutil` under the `sigp` org, where I've included the update to `nix` v0.23.0. Additionally, I took the chance to update the `time` dependency to v0.3, which removed a bunch of stale deps including `stdweb` which is no longer maintained. Lighthouse only uses the `time` crate in the notifier to do some pretty printing, and so wasn't affected by any of the breaking changes in v0.3 ([changelog here](https://github.com/time-rs/time/blob/main/CHANGELOG.md#030-2021-07-30)).	2021-10-11 00:10:35 +00:00
Pawan Dhananjay	7c7ba770de	Update broken api links (#2665 ) ## Issue Addressed Resolves #2563 Replacement for #2653 as I'm not able to reopen that PR after force pushing. ## Proposed Changes Fixes all broken api links. Cherry picked changes in #2590 and updated a few more links. Co-authored-by: Mason Stallmo <masonstallmo@gmail.com>	2021-10-06 00:46:09 +00:00
Pawan Dhananjay	73ec29c267	Don't log errors on resubscription of gossip topics (#2613 ) ## Issue Addressed Resolves #2555 ## Proposed Changes Don't log errors on resubscribing to topics. Also don't log errors if we are setting already set attnet/syncnet bits.	2021-10-06 00:46:08 +00:00
Wink Saville	58870fc6d3	Add test_logger as feature to logging (#2586 ) ## Issue Addressed Fix #2585 ## Proposed Changes Provide a canonical version of test_logger that can be used throughout lighthouse. ## Additional Info This allows tests to conditionally emit logging data by adding test_logger as the default logger. And then when executing `cargo test --features logging/test_logger` log output will be visible: wink@3900x:~/lighthouse/common/logging/tests/test-feature-test_logger (Add-test_logger-as-feature-to-logging) $ cargo test --features logging/test_logger Finished test [unoptimized + debuginfo] target(s) in 0.02s Running unittests (target/debug/deps/test_logger-e20115db6a5e3714) running 1 test Sep 10 12:53:45.212 INFO hi, module: test_logger:8 test tests::test_fn_with_logging ... ok test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s Doc-tests test-logger running 0 tests test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s Or, in normal scenarios where logging isn't needed, executing `cargo test` the log output will not be visible: wink@3900x:~/lighthouse/common/logging/tests/test-feature-test_logger (Add-test_logger-as-feature-to-logging) $ cargo test Finished test [unoptimized + debuginfo] target(s) in 0.02s Running unittests (target/debug/deps/test_logger-02e02f8d41e8cf8a) running 1 test test tests::test_fn_with_logging ... ok test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s Doc-tests test-logger running 0 tests test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s	2021-10-06 00:46:07 +00:00
Michael Sproul	7c88f582d9	Release v2.0.0 (#2673 ) ## Proposed Changes * Bump version to v2.0.0 * Update dependencies (obsoletes #2670). `tokio-macros` v1.4.0 had been yanked due to a bug.	2021-10-05 03:53:18 +00:00
Michael Sproul	ed1fc7cca6	Fix I/O atomicity issues with checkpoint sync (#2671 ) ## Issue Addressed This PR addresses an issue found by @YorickDowne during testing of v2.0.0-rc.0. Due to a lack of atomic database writes on checkpoint sync start-up, it was possible for the database to get into an inconsistent state from which it couldn't recover without `--purge-db`. The core of the issue was that the store's anchor info was being stored _before_ the `PersistedBeaconChain`. If a crash occurred so that anchor info was stored but _not_ the `PersistedBeaconChain`, then on restart Lighthouse would think the database was uninitialized and attempt to compare-and-swap a `None` value, but would actually find the stale info from the previous run. ## Proposed Changes The issue is fixed by writing the anchor info, the split point, and the `PersistedBeaconChain` atomically on start-up. Some type-hinting ugliness was required, which could possibly be cleaned up in future refactors.	2021-10-05 03:53:17 +00:00
Kane Wallmann	28b79084cd	Fix chain_id value in config/deposit_contract RPC method (#2659 ) ## Issue Addressed This PR addresses issue #2657 ## Proposed Changes Changes `/eth/v1/config/deposit_contract` endpoint to return the chain ID from the loaded chain spec instead of eth1::DEFAULT_NETWORK_ID which is the Goerli chain ID of 5. Co-authored-by: Michael Sproul <michael@sigmaprime.io>	2021-10-01 06:32:38 +00:00
Michael Sproul	ea78315749	Release v2.0.0-rc.0 (#2634 ) ## Proposed Changes Cut the first release candidate for v2.0.0, in preparation for testing and release this week ## Additional Info Builds on #2632, which should either be merged first or in the same batch	2021-10-01 01:23:55 +00:00
Age Manning	29a8865d07	Consistent tracking of disconnected peers (#2650 ) ## Issue Addressed N/A ## Proposed Changes When peers switch to a disconnecting state, decrement the disconnected peers counter. This also downgrades some crit logs to errors. I've also added a re-sync point: when peers get unbanned, the disconnected peer count is reset to the actual number of disconnected peers if it has previously gone out of sync.	2021-09-30 04:31:43 +00:00
Squirrel	db4d72c4f1	Remove unused deps (#2592 ) Found some deps you're possibly not using. Please shout if you think they are indeed still needed.	2021-09-30 04:31:42 +00:00
Mac L	4c510f8f6b	Add `BlockTimesCache` to allow additional block delay metrics (#2546 ) ## Issue Addressed Closes #2528 ## Proposed Changes - Add `BlockTimesCache` to provide block timing information to `BeaconChain`. This allows additional metrics to be calculated for blocks that are set as head too late. - Thread the `seen_timestamp` of blocks received from RPC responses (except blocks from syncing) through to the sync manager, similar to what is done for blocks from gossip. ## Additional Info This provides the following additional metrics: - `BEACON_BLOCK_OBSERVED_SLOT_START_DELAY_TIME` - The delay between the start of the slot and when the block was first observed. - `BEACON_BLOCK_IMPORTED_OBSERVED_DELAY_TIME` - The delay between when the block was first observed and when the block was imported. - `BEACON_BLOCK_HEAD_IMPORTED_DELAY_TIME` - The delay between when the block was imported and when the block was set as head. The metric `BEACON_BLOCK_IMPORTED_SLOT_START_DELAY_TIME` was removed. A log is produced when a block is set as head too late, e.g.: ``` Aug 27 03:46:39.006 DEBG Delayed head block set_as_head_delay: Some(21.731066ms), imported_delay: Some(119.929934ms), observed_delay: Some(3.864596988s), block_delay: 4.006257988s, slot: 1931331, proposer_index: 24294, block_root: 0x937602c89d3143afa89088a44bdf4b4d0d760dad082abacb229495c048648a9e, service: beacon ```	2021-09-30 04:31:41 +00:00
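The three delay metrics listed in #2546 are differences between consecutive timestamps in a block's lifecycle. A minimal sketch of that arithmetic follows; `BlockTimestamps` and `BlockDelays` are hypothetical types, and all timestamps are assumed to be durations since a common reference point such as the UNIX epoch.

```rust
use std::time::Duration;

/// Hypothetical record of when each lifecycle event happened for a block.
pub struct BlockTimestamps {
    pub slot_start: Duration,
    pub observed: Option<Duration>,
    pub imported: Option<Duration>,
    pub set_as_head: Option<Duration>,
}

/// Delays between consecutive lifecycle events.
/// A field is `None` when either timestamp is missing or out of order.
#[derive(Debug, PartialEq, Eq)]
pub struct BlockDelays {
    /// Slot start to first observation.
    pub observed: Option<Duration>,
    /// First observation to import.
    pub imported: Option<Duration>,
    /// Import to being set as head.
    pub set_as_head: Option<Duration>,
}

impl BlockDelays {
    pub fn new(times: &BlockTimestamps) -> Self {
        let observed = times
            .observed
            .and_then(|o| o.checked_sub(times.slot_start));
        let imported = times
            .imported
            .and_then(|i| times.observed.and_then(|o| i.checked_sub(o)));
        let set_as_head = times
            .set_as_head
            .and_then(|h| times.imported.and_then(|i| h.checked_sub(i)));
        BlockDelays { observed, imported, set_as_head }
    }
}
```

Using `checked_sub` keeps a missing or out-of-order timestamp from panicking; the corresponding delay is simply reported as absent.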


1794 Commits