lighthouse

Author	SHA1	Message	Date
Paul Hauner	144978f8f8	Remove duplicate slot_clock method (#2842 )	2021-12-02 14:29:59 +11:00
Paul Hauner	1b56ebf85e	Kintsugi review comments (#2831 ) * Fix makefile * Return on invalid finalized block * Fix todo in gossip scoring * Require --merge for --fee-recipient * Bump eth2_serde_utils * Change schema versions * Swap hash/uint256 test_random impls * Use default for ExecutionPayload::empty * Check for DBs before removing * Remove kintsugi docker image * Fix CLI default value	2021-12-02 14:29:59 +11:00
Paul Hauner	82a81524e3	Bump crate versions (#2829 )	2021-12-02 14:29:57 +11:00
ethDreamer	f6748537db	Removed PowBlock struct that never got used (#2813 )	2021-12-02 14:29:20 +11:00
Paul Hauner	5f0fef2d1e	Kintsugi on_merge_block tests (#2811 ) * Start v1.1.5 updates * Implement new payload creation logic * Tidy, add comments * Remove unused error enums * Add validate payload for gossip * Refactor validate_merge_block * Split payload verification in per block processing * Add execute_payload * Tidy * Tidy * Start working on new fork choice tests * Fix failing merge block test * Skip block_lookup_failed test * Fix failing terminal block test * Fixes from self-review * Address review comments	2021-12-02 14:29:20 +11:00
pawan	44a7b37ce3	Increase network limits (#2796 ) Fix max packet sizes Fix max_payload_size function Add merge block test Fix max size calculation; fix up test Clear comments Add a payload_size_function Use safe arith for payload calculation Return an error if block too big in block production Separate test to check if block is over limit	2021-12-02 14:29:20 +11:00
Paul Hauner	afe59afacd	Ensure difficulty/hash/epoch overrides change the `ChainSpec` (#2798 ) * Unify loading of eth2_network_config * Apply overrides at lighthouse binary level * Remove duplicate override values * Add merge values to existing net configs * Make override flags global * Add merge fields to testing config * Add one to TTD * Fix failing engine tests * Fix test compile error * Remove TTD flags * Move get_eth2_network_config * Fix warn * Address review comments	2021-12-02 14:29:18 +11:00
Paul Hauner	47db682d7e	Implement engine API v1.0.0-alpha.4 (#2810 ) * Added ForkchoiceUpdatedV1 & GetPayloadV1 * Added ExecutePayloadV1 * Added new geth test vectors * Separated Json Object/Serialization Code into file * Deleted code/tests for Requests Removed from spec * Finally fixed serialization of null '0x' * Made Naming of JSON Structs Consistent * Fix clippy lints * Remove u64 payload id * Remove unused serde impls * Swap to [u8; 8] for payload id * Tidy * Adjust some block gen return vals * Tidy * Add fallback when payload id is unknown * Remove comment Co-authored-by: Mark Mackey <mark@sigmaprime.io>	2021-12-02 14:26:55 +11:00
Paul Hauner	cbd2201164	Fixes after rebasing Kintsugi onto unstable (#2799 ) * Fix fork choice after rebase * Remove paulhauner warp dep * Fix fork choice test compile errors * Assume fork choice payloads are valid * Add comment * Ignore new tests * Fix error in test skipping	2021-12-02 14:26:55 +11:00
realbigsean	de49c7ddaa	1.1.5 merge spec tests (#2781 ) * Fix arbitrary check kintsugi * Add merge chain spec fields, and a function to determine which constant to use based on the state variant * increment spec test version * Remove `Transaction` enum wrapper * Remove Transaction new-type * Remove gas validations * Add `--terminal-block-hash-epoch-override` flag * Increment spec tests version to 1.1.5 * Remove extraneous gossip verification https://github.com/ethereum/consensus-specs/pull/2687 * - Remove unused Error variants - Require both "terminal-block-hash-epoch-override" and "terminal-block-hash-override" when either flag is used * - Remove a couple more unused Error variants Co-authored-by: Paul Hauner <paul@paulhauner.com>	2021-12-02 14:26:55 +11:00
realbigsean	d8eec16c5e	v1.1.1 spec updates (#2684 ) * update initializing from eth1 for merge genesis * read execution payload header from file lcli * add `create-payload-header` command to `lcli` * fix base fee parsing * Apply suggestions from code review * default `execution_payload_header` bool to false when deserializing `meta.yml` in EF tests Co-authored-by: Paul Hauner <paul@paulhauner.com>	2021-12-02 14:26:54 +11:00
Paul Hauner	6dde12f311	[Merge] Optimistic Sync: Stage 1 (#2686 ) * Add payload verification status to fork choice * Pass payload verification status to import_block * Add valid back-propagation * Add head safety status latch to API * Remove ExecutionLayerStatus * Add execution info to client notifier * Update notifier logs * Change use of "hash" to refer to beacon block * Shutdown on invalid finalized block * Tidy, add comments * Fix failing FC tests * Allow blocks with unsafe head * Fix forkchoiceUpdate call on startup	2021-12-02 14:26:54 +11:00
Paul Hauner	67a6f91df6	[Merge] Optimistic EL verification (#2683 ) * Ignore payload errors * Only return payload handle on valid response * Push some engine logs down to debug * Push ee fork choice log to debug * Push engine call failure to debug * Push some more errors to debug * Fix panic at startup	2021-12-02 14:26:53 +11:00
Paul Hauner	35350dff75	[Merge] Block validator duties when EL is not ready (#2672 ) * Reject some HTTP endpoints when EL is not ready * Restrict more endpoints * Add watchdog task * Change scheduling * Update to new schedule * Add "syncing" concept * Remove RequireSynced * Add is_merge_complete to head_info * Cache latest_head in Engines * Call consensus_forkchoiceUpdate on startup	2021-12-02 14:26:53 +11:00
Paul Hauner	d6fda44620	Disable notifier logging from dummy eth1 backend (#2680 )	2021-12-02 14:26:53 +11:00
ethDreamer	52e5083502	Fixed bugs for m3 readiness (#2669 ) * Fixed bugs for m3 readiness * woops * cargo fmt..	2021-12-02 14:26:53 +11:00
Paul Hauner	b162b067de	Misc changes for merge testnets (#2667 ) * Thread eth1_block_hash into interop genesis state * Add merge-fork-epoch flag * Build LH with minimal spec by default * Add verbose logs to execution_layer * Add --http-allow-sync-stalled flag * Update lcli new-testnet to create genesis state * Fix http test * Fix compile errors in tests	2021-12-02 14:26:52 +11:00
Paul Hauner	a1033a9247	Add `BeaconChainHarness` tests for The Merge (#2661 ) * Start adding merge tests * Expose MockExecutionLayer * Add mock_execution_layer to BeaconChainHarness * Progress with merge test * Return more detailed errors with gas limit issues * Use a better gas limit in block gen * Ensure TTD is met in block gen * Fix basic_merge tests * Start geth testing * Fix conflicts after rebase * Remove geth tests * Improve merge test * Address clippy lints * Make pow block gen a pure function * Add working new test, breaking existing test * Fix test names * Add should_panic * Don't run merge tests in debug * Detect a tokio runtime when starting MockServer * Fix clippy lint, include merge tests	2021-12-02 14:26:52 +11:00
Paul Hauner	801f6f7425	Disable autotests for beacon_chain (#2658 )	2021-12-02 14:26:52 +11:00
Paul Hauner	d8623cfc4f	[Merge] Implement `execution_layer` (#2635 ) * Checkout serde_utils from rayonism * Make eth1::http functions pub * Add bones of execution_layer * Modify decoding * Expose Transaction, cargo fmt * Add executePayload * Add all minimal spec endpoints * Start adding json rpc wrapper * Finish custom JSON response handler * Switch to new rpc sending method * Add first test * Fix camelCase * Finish adding tests * Begin threading execution layer into BeaconChain * Fix clippy lints * Fix clippy lints * Thread execution layer into ClientBuilder * Add CLI flags * Add block processing methods to ExecutionLayer * Add block_on to execution_layer * Integrate execute_payload * Add extra_data field * Begin implementing payload handle * Send consensus valid/invalid messages * Fix minor type in task_executor * Call forkchoiceUpdated * Add search for TTD block * Thread TTD into execution layer * Allow producing block with execution payload * Add LRU cache for execution blocks * Remove duplicate 0x on ssz_types serialization * Add tests for block getter methods * Add basic block generator impl * Add is_valid_terminal_block to EL * Verify merge block in block_verification * Partially implement --terminal-block-hash-override * Add terminal_block_hash to ChainSpec * Remove Option from terminal_block_hash in EL * Revert merge changes to consensus/fork_choice * Remove commented-out code * Add bones for handling RPC methods on test server * Add first ExecutionLayer tests * Add testing for finding terminal block * Prevent infinite loops * Add insert_merge_block to block gen * Add block gen test for pos blocks * Start adding payloads to block gen * Fix clippy lints * Add execution payload to block gen * Add execute_payload to block_gen * Refactor block gen * Add all routes to mock server * Use Uint256 for base_fee_per_gas * Add working execution chain build * Remove unused var * Revert "Use Uint256 for base_fee_per_gas" This reverts commit 6c88f19ac45db834dd4dbf7a3c6e7242c1c0f735. * Fix base_fee_for_gas Uint256 * Update execute payload handle * Improve testing, fix bugs * Fix default fee-recipient * Fix fee-recipient address (again) * Add check for terminal block, add comments, tidy * Apply suggestions from code review Co-authored-by: realbigsean <seananderson33@GMAIL.com> * Fix is_none on handle Drop * Remove commented-out tests Co-authored-by: realbigsean <seananderson33@GMAIL.com>	2021-12-02 14:26:51 +11:00
ethDreamer	1563bce905	Finished Gossip Block Validation Conditions (#2640 ) * Gossip Block Validation is Much More Efficient Co-authored-by: realbigsean <seananderson33@gmail.com>	2021-12-02 14:26:51 +11:00
realbigsean	aa534f8989	Store execution block hash in fork choice (#2643 ) * - Update the fork choice `ProtoNode` to include `is_merge_complete` - Add database migration for the persisted fork choice * update tests * Small cleanup * lints * store execution block hash in fork choice rather than bool	2021-12-02 14:26:51 +11:00
Paul Hauner	c10e8ce955	Fix clippy lints on merge-f2f (#2626 ) * Remove unchecked arith from ssz_derive * Address clippy lints in block_verfication * Use safe math for is_valid_gas_limit	2021-12-02 14:26:50 +11:00
Mark Mackey	5687c56d51	Initial merge changes Added Execution Payload from Rayonism Fork Updated new Containers to match Merge Spec Updated BeaconBlockBody for Merge Spec Completed updating BeaconState and BeaconBlockBody Modified ExecutionPayload<T> to use Transaction<T> Mostly Finished Changes for beacon-chain.md Added some things for fork-choice.md Update to match new fork-choice.md/fork.md changes ran cargo fmt Added Missing Pieces in eth2_libp2p for Merge fix ef test Various Changes to Conform Closer to Merge Spec	2021-12-02 14:26:50 +11:00
Mac L	fe75a0a9a1	Add background file logging (#2762 ) ## Issue Addressed Closes #1996 ## Proposed Changes Run a second `Logger` via `sloggers` which logs to a file in the background with: - separate `debug-level` for background and terminal logging - the ability to limit log size - rotation through a customizable number of log files - an option to compress old log files (`.gz` format) Add the following new CLI flags: - `--logfile-debug-level`: The debug level of the log files - `--logfile-max-size`: The maximum size of each log file - `--logfile-max-number`: The number of old log files to store - `--logfile-compress`: Whether to compress old log files By default background logging uses the `debug` log level and saves logfiles to: - Beacon Node: `$HOME/.lighthouse/$network/beacon/logs/beacon.log` - Validator Client: `$HOME/.lighthouse/$network/validators/logs/validator.log` Or, when using the `--datadir` flag: `$datadir/beacon/logs/beacon.log` and `$datadir/validators/logs/validator.log` Once rotated, old logs are stored like so: `beacon.log.1`, `beacon.log.2` etc. > Note: `beacon.log.1` is always newer than `beacon.log.2`. ## Additional Info Currently the default value of `--logfile-max-size` is 200 (MB) and `--logfile-max-number` is 5. This means that the maximum storage space that the logs will take up by default is 1.2GB. (200MB x 5 from old log files + <200MB the current logfile being written to) Happy to adjust these default values to whatever people think is appropriate. It's also worth noting that when logging to a file, we lose our custom `slog` formatting. This means the logfile logs look like this: ``` Oct 27 16:02:50.305 INFO Lighthouse started, version: Lighthouse/v2.0.1-8edd9d4+, module: lighthouse:413 Oct 27 16:02:50.305 INFO Configured for network, name: prater, module: lighthouse:414 ```	2021-11-30 03:25:32 +00:00
Age Manning	e519af9012	Update Lighthouse Dependencies (#2818 ) ## Issue Addressed Updates lighthouse dependencies to resolve audit issues in out-dated deps.	2021-11-18 05:08:42 +00:00
Paul Hauner	931daa40d7	Add fork choice EF tests (#2737 ) ## Issue Addressed Resolves #2545 ## Proposed Changes Adds the long-overdue EF tests for fork choice. Although we had pretty good coverage via other implementations that closely followed our approach, it is nonetheless important for us to implement these tests too. During testing I found that we were using a hard-coded `SAFE_SLOTS_TO_UPDATE_JUSTIFIED` value rather than one from the `ChainSpec`. This caused a failure during a minimal preset test. This doesn't represent a risk to mainnet or testnets, since the hard-coded value matched the mainnet preset. ## Failing Cases There is one failing case which is presently marked as `SkippedKnownFailure`: ``` case 4 ("new_finalized_slot_is_justified_checkpoint_ancestor") from /home/paul/development/lighthouse/testing/ef_tests/consensus-spec-tests/tests/minimal/phase0/fork_choice/on_block/pyspec_tests/new_finalized_slot_is_justified_checkpoint_ancestor failed with NotEqual: head check failed: Got Head { slot: Slot(40), root: 0x9183dbaed4191a862bd307d476e687277fc08469fc38618699863333487703e7 } \| Expected Head { slot: Slot(24), root: 0x105b49b51bf7103c182aa58860b039550a89c05a4675992e2af703bd02c84570 } ``` This failure is due to #2741. It's not a particularly high priority issue at the moment, so we fix it after merging this PR.	2021-11-08 07:29:04 +00:00
realbigsean	c4ad0e3fb3	Ensure dependent root consistency in head events (#2753 ) ## Issue Addressed @paulhauner noticed that when we send head events, we use the block root from `new_head` in `fork_choice_internal`, but calculate `dependent_root` and `previous_dependent_root` using the `canonical_head`. This is normally fine because `new_head` updates the `canonical_head` in `fork_choice_internal`, but it's possible we have a reorg updating `canonical_head` before our head events are sent. So this PR ensures `dependent_root` and `previous_dependent_root` are always derived from the state associated with `new_head`. Co-authored-by: realbigsean <seananderson33@gmail.com>	2021-11-02 02:26:32 +00:00
Pawan Dhananjay	4499adc7fd	Check proposer index during block production (#2740 ) ## Issue Addressed Resolves #2612 ## Proposed Changes Implements both the checks mentioned in the original issue. 1. Verifies the `randao_reveal` in the beacon node 2. Cross checks the proposer index after getting back the block from the beacon node. ## Additional info The block production time increases by ~10x because of the signature verification on the beacon node (based on the `beacon_block_production_process_seconds` metric) when running on a local testnet.	2021-11-01 07:44:40 +00:00
Michael Sproul	ffb04e1a9e	Add op pool metrics for attestations (#2758 ) ## Proposed Changes Add several metrics for the number of attestations in the op pool. These give us a way to observe the number of valid, non-trivial attestations during block packing rather than just the size of the entire op pool.	2021-11-01 05:52:31 +00:00
Michael Sproul	d2e3d4c6f1	Add flag to disable lock timeouts (#2714 ) ## Issue Addressed Mitigates #1096 ## Proposed Changes Add a flag to the beacon node called `--disable-lock-timeouts` which allows opting out of lock timeouts. The lock timeouts serve a dual purpose: 1. They prevent any single operation from hogging the lock for too long. When a timeout occurs it logs a nasty error which indicates that there's suboptimal lock use occurring, which we can then act on. 2. They allow deadlock detection. We're fairly sure there are no deadlocks left in Lighthouse anymore but the timeout locks offer a safeguard against that. However, timeouts on locks are not without downsides: They allow for the possibility of livelock, particularly on slower hardware. If lock timeouts keep failing spuriously the node can be prevented from making any progress, even if it would be able to make progress slowly without the timeout. One particularly concerning scenario which could occur would be if a DoS attack succeeded in slowing block signature verification times across the network, and all Lighthouse nodes got livelocked because they timed out repeatedly. This could also occur on just a subset of nodes (e.g. dual core VPSs or Raspberri Pis). By making the behaviour runtime configurable this PR allows us to choose the behaviour we want depending on circumstance. I suspect that long term we could make the timeout-free approach the default (#2381 moves in this direction) and just enable the timeouts on our testnet nodes for debugging purposes. This PR conservatively leaves the default as-is so we can gain some more experience before switching the default.	2021-10-19 00:30:40 +00:00
Paul Hauner	a7b675460d	Add Altair tests to op pool (#2723 ) ## Issue Addressed NA ## Proposed Changes Adds some more testing for Altair to the op pool. Credits to @michaelsproul for some appropriated efforts here. ## Additional Info NA Co-authored-by: Michael Sproul <michael@sigmaprime.io>	2021-10-16 05:07:23 +00:00
Michael Sproul	5cde3fc4da	Reduce lock contention in backfill sync (#2716 ) ## Proposed Changes Clone the proposer pubkeys during backfill signature verification to reduce the time that the pubkey cache lock is held for. Cloning such a small number of pubkeys has negligible impact on the total running time, but greatly reduces lock contention. On a Ryzen 5950X, the setup step seems to take around 180us regardless of whether the key is cloned or not, while the verification takes 7ms. When Lighthouse is limited to 10% of one core using `sudo cpulimit --pid <pid> --limit 10` the total time jumps up to 800ms, but the setup step remains only 250us. This means that under heavy load this PR could cut the time the lock is held for from 800ms to 250us, which is a huge saving of 99.97%!	2021-10-15 03:28:03 +00:00
Paul Hauner	e2d09bb8ac	Add `BeaconChainHarness::builder` (#2707 ) ## Issue Addressed NA ## Proposed Changes This PR is near-identical to https://github.com/sigp/lighthouse/pull/2652, however it is to be merged into `unstable` instead of `merge-f2f`. Please see that PR for reasoning. I'm making this duplicate PR to merge to `unstable` in an effort to shrink the diff between `unstable` and `merge-f2f` by doing smaller, lead-up PRs. ## Additional Info NA	2021-10-14 02:58:10 +00:00
Pawan Dhananjay	34d22b5920	Reduce validator monitor logging verbosity (#2606 ) ## Issue Addressed Resolves #2541 ## Proposed Changes Reduces verbosity of validator monitor per epoch logging by batching info logs for multiple validators. Instead of a log for every validator managed by the validator monitor, we now batch logs for attestation records for previous epoch. Before: ```log Sep 20 06:53:08.239 INFO Previous epoch attestation success validator: 1, epoch: 65875, matched_head: true, matched_target: true, inclusion_lag: 0 slot(s), service: val_mon Sep 20 06:53:08.239 INFO Previous epoch attestation success validator: 2, epoch: 65875, matched_head: true, matched_target: true, inclusion_lag: 0 slot(s), service: val_mon Sep 20 06:53:08.239 INFO Previous epoch attestation success validator: 3, epoch: 65875, matched_head: true, matched_target: true, inclusion_lag: 0 slot(s), service: val_mon Sep 20 06:53:08.239 INFO Previous epoch attestation success validator: 4, epoch: 65875, matched_head: true, matched_target: true, inclusion_lag: 0 slot(s), service: val_mon Sep 20 06:53:08.239 INFO Previous epoch attestation success validator: 5, epoch: 65875, matched_head: false, matched_target: true, inclusion_lag: 0 slot(s), service: val_mon Sep 20 06:53:08.239 WARN Attestation failed to match head validator: 5, epoch: 65875, service: val_mon Sep 20 06:53:08.239 INFO Previous epoch attestation success validator: 6, epoch: 65875, matched_head: false, matched_target: true, inclusion_lag: 0 slot(s), service: val_mon Sep 20 06:53:08.239 WARN Attestation failed to match head validator: 6, epoch: 65875, service: val_mon Sep 20 06:53:08.239 INFO Previous epoch attestation success validator: 7, epoch: 65875, matched_head: true, matched_target: false, inclusion_lag: 1 slot(s), service: val_mon Sep 20 06:53:08.239 WARN Attestation failed to match target validator: 7, epoch: 65875, service: val_mon Sep 20 06:53:08.239 WARN Sub-optimal inclusion delay validator: 7, epoch: 65875, optimal: 1, delay: 2, service: val_mon Sep 20 06:53:08.239 INFO Previous epoch attestation success validator: 8, epoch: 65875, matched_head: true, matched_target: false, inclusion_lag: 1 slot(s), service: val_mon Sep 20 06:53:08.239 WARN Attestation failed to match target validator: 8, epoch: 65875, service: val_mon Sep 20 06:53:08.239 WARN Sub-optimal inclusion delay validator: 8, epoch: 65875, optimal: 1, delay: 2, service: val_mon Sep 20 06:53:08.239 ERRO Previous epoch attestation missing validator: 9, epoch: 65875, service: val_mon Sep 20 06:53:08.239 ERRO Previous epoch attestation missing validator: 10, epoch: 65875, service: val_mon ``` after ``` Sep 20 06:53:08.239 INFO Previous epoch attestation success validators: [1,2,3,4,5,6,7,8,9] , epoch: 65875, service: val_mon Sep 20 06:53:08.239 WARN Previous epoch attestation failed to match head, validators: [5,6], epoch: 65875, service: val_mon Sep 20 06:53:08.239 WARN Previous epoch attestation failed to match target, validators: [7,8], epoch: 65875, service: val_mon Sep 20 06:53:08.239 WARN Previous epoch attestations had sub-optimal inclusion delay, validators: [7,8], epoch: 65875, service: val_mon Sep 20 06:53:08.239 ERRO Previous epoch attestation missing validators: [9,10], epoch: 65875, service: val_mon ``` The detailed individual logs are downgraded to debug logs.	2021-10-12 05:06:48 +00:00
Wink Saville	58870fc6d3	Add test_logger as feature to logging (#2586 ) ## Issue Addressed Fix #2585 ## Proposed Changes Provide a canonical version of test_logger that can be used throughout lighthouse. ## Additional Info This allows tests to conditionally emit logging data by adding test_logger as the default logger. And then when executing `cargo test --features logging/test_logger` log output will be visible: wink@3900x:~/lighthouse/common/logging/tests/test-feature-test_logger (Add-test_logger-as-feature-to-logging) $ cargo test --features logging/test_logger Finished test [unoptimized + debuginfo] target(s) in 0.02s Running unittests (target/debug/deps/test_logger-e20115db6a5e3714) running 1 test Sep 10 12:53:45.212 INFO hi, module: test_logger:8 test tests::test_fn_with_logging ... ok test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s Doc-tests test-logger running 0 tests test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s Or, in normal scenarios where logging isn't needed, executing `cargo test` the log output will not be visible: wink@3900x:~/lighthouse/common/logging/tests/test-feature-test_logger (Add-test_logger-as-feature-to-logging) $ cargo test Finished test [unoptimized + debuginfo] target(s) in 0.02s Running unittests (target/debug/deps/test_logger-02e02f8d41e8cf8a) running 1 test test tests::test_fn_with_logging ... ok test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s Doc-tests test-logger running 0 tests test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s	2021-10-06 00:46:07 +00:00
Michael Sproul	ed1fc7cca6	Fix I/O atomicity issues with checkpoint sync (#2671 ) ## Issue Addressed This PR addresses an issue found by @YorickDowne during testing of v2.0.0-rc.0. Due to a lack of atomic database writes on checkpoint sync start-up, it was possible for the database to get into an inconsistent state from which it couldn't recover without `--purge-db`. The core of the issue was that the store's anchor info was being stored _before_ the `PersistedBeaconChain`. If a crash occured so that anchor info was stored but _not_ the `PersistedBeaconChain`, then on restart Lighthouse would think the database was unitialized and attempt to compare-and-swap a `None` value, but would actually find the stale info from the previous run. ## Proposed Changes The issue is fixed by writing the anchor info, the split point, and the `PersistedBeaconChain` atomically on start-up. Some type-hinting ugliness was required, which could possibly be cleaned up in future refactors.	2021-10-05 03:53:17 +00:00
Squirrel	db4d72c4f1	Remove unused deps (#2592 ) Found some deps you're possibly not using. Please shout if you think they are indeed still needed.	2021-09-30 04:31:42 +00:00
Mac L	4c510f8f6b	Add `BlockTimesCache` to allow additional block delay metrics (#2546 ) ## Issue Addressed Closes #2528 ## Proposed Changes - Add `BlockTimesCache` to provide block timing information to `BeaconChain`. This allows additional metrics to be calculated for blocks that are set as head too late. - Thread the `seen_timestamp` of blocks received from RPC responses (except blocks from syncing) through to the sync manager, similar to what is done for blocks from gossip. ## Additional Info This provides the following additional metrics: - `BEACON_BLOCK_OBSERVED_SLOT_START_DELAY_TIME` - The delay between the start of the slot and when the block was first observed. - `BEACON_BLOCK_IMPORTED_OBSERVED_DELAY_TIME` - The delay between when the block was first observed and when the block was imported. - `BEACON_BLOCK_HEAD_IMPORTED_DELAY_TIME` - The delay between when the block was imported and when the block was set as head. The metric `BEACON_BLOCK_IMPORTED_SLOT_START_DELAY_TIME` was removed. A log is produced when a block is set as head too late, e.g.: ``` Aug 27 03:46:39.006 DEBG Delayed head block set_as_head_delay: Some(21.731066ms), imported_delay: Some(119.929934ms), observed_delay: Some(3.864596988s), block_delay: 4.006257988s, slot: 1931331, proposer_index: 24294, block_root: 0x937602c89d3143afa89088a44bdf4b4d0d760dad082abacb229495c048648a9e, service: beacon ```	2021-09-30 04:31:41 +00:00
Pawan Dhananjay	70441aa554	Improve valmon inclusion delay calculation (#2618 ) ## Issue Addressed Resolves #2552 ## Proposed Changes Offers some improvement in inclusion distance calculation in the validator monitor. When registering an attestation from a block, instead of doing `block.slot() - attesstation.data.slot()` to get the inclusion distance, we now pass the parent block slot from the beacon chain and do `parent_slot.saturating_sub(attestation.data.slot())`. This allows us to give best effort inclusion distance in scenarios where the attestation was included right after a skip slot. Note that this does not give accurate results in scenarios where the attestation was included few blocks after the skip slot. In this case, if the attestation slot was `b1` and was included in block `b2` with a skip slot in between, we would get the inclusion delay as 0 (by ignoring the skip slot) which is the best effort inclusion delay. ``` b1 <- missed <- b2 ``` Here, if the attestation slot was `b1` and was included in block `b3` with a skip slot and valid block `b2` in between, then we would get the inclusion delay as 2 instead of 1 (by ignoring the skip slot). ``` b1 <- missed <- b2 <- b3 ``` A solution for the scenario 2 would be to count number of slots between included slot and attestation slot ignoring the skip slots in the beacon chain and pass the value to the validator monitor. But I'm concerned that it could potentially lead to db accesses for older blocks in extreme cases. This PR also uses the validator monitor data for logging per epoch inclusion distance. This is useful as we won't get inclusion data in post-altair summaries. Co-authored-by: Michael Sproul <micsproul@gmail.com>	2021-09-30 01:22:43 +00:00
realbigsean	7d13e57d9f	Add interop metrics (#2645 ) ## Issue Addressed Resolves: #2644 ## Proposed Changes - Adds mandatory metrics mentioned here: https://github.com/ethereum/beacon-metrics/blob/master/metrics.md#interop-metrics ## Additional Info Couldn't figure out how to alias metrics, so I created them all as new gauges/counters. Co-authored-by: realbigsean <seananderson33@gmail.com>	2021-09-29 23:44:24 +00:00
realbigsean	113ef74ef6	Add contribution and proof event (#2527 ) ## Issue Addressed N/A ## Proposed Changes Add the new ContributionAndProof event: https://github.com/ethereum/beacon-APIs/pull/158 ## Additional Info N/A Co-authored-by: realbigsean <seananderson33@gmail.com>	2021-09-25 07:53:58 +00:00
Paul Hauner	fe52322088	Implement SSZ union type (#2579 ) ## Issue Addressed NA ## Proposed Changes Implements the "union" type from the SSZ spec for `ssz`, `ssz_derive`, `tree_hash` and `tree_hash_derive` so it may be derived for `enums`: https://github.com/ethereum/consensus-specs/blob/v1.1.0-beta.3/ssz/simple-serialize.md#union The union type is required for the merge, since the `Transaction` type is defined as a single-variant union `Union[OpaqueTransaction]`. ### Crate Updates This PR will (hopefully) cause CI to publish new versions for the following crates: - `eth2_ssz_derive`: `0.2.1` -> `0.3.0` - `eth2_ssz`: `0.3.0` -> `0.4.0` - `eth2_ssz_types`: `0.2.0` -> `0.2.1` - `tree_hash`: `0.3.0` -> `0.4.0` - `tree_hash_derive`: `0.3.0` -> `0.4.0` These these crates depend on each other, I've had to add a workspace-level `[patch]` for these crates. A follow-up PR will need to remove this patch, ones the new versions are published. ### Union Behaviors We already had SSZ `Encode` and `TreeHash` derive for enums, however it just did a "transparent" pass-through of the inner value. Since the "union" decoding from the spec is in conflict with the transparent method, I've required that all `enum` have exactly one of the following enum-level attributes: #### SSZ - `#[ssz(enum_behaviour = "union")]` - matches the spec used for the merge - `#[ssz(enum_behaviour = "transparent")]` - maintains existing functionality - not supported for `Decode` (never was) #### TreeHash - `#[tree_hash(enum_behaviour = "union")]` - matches the spec used for the merge - `#[tree_hash(enum_behaviour = "transparent")]` - maintains existing functionality This means that we can maintain the existing transparent behaviour, but all existing users will get a compile-time error until they explicitly opt-in to being transparent. ### Legacy Option Encoding Before this PR, we already had a union-esque encoding for `Option<T>`. However, this was with the old SSZ spec where the union selector was 4 bytes. During merge specification, the spec was changed to use 1 byte for the selector. Whilst the 4-byte `Option` encoding was never used in the spec, we used it in our database. Writing a migrate script for all occurrences of `Option` in the database would be painful, especially since it's used in the `CommitteeCache`. To avoid the migrate script, I added a serde-esque `#[ssz(with = "module")]` field-level attribute to `ssz_derive` so that we can opt into the 4-byte encoding on a field-by-field basis. The `ssz::legacy::four_byte_impl!` macro allows a one-liner to define the module required for the `#[ssz(with = "module")]` for some `Option<T> where T: Encode + Decode`. Notably, I have removed `Encode` and `Decode` impls for `Option`. I've done this to force a break on downstream users. Like I mentioned, `Option` isn't used in the spec so I don't think it'll be that annoying. I think it's nicer than quietly having two different union implementations or quietly breaking the existing `Option` impl. ### Crate Publish Ordering I've modified the order in which CI publishes crates to ensure that we don't publish a crate without ensuring we already published a crate that it depends upon. ## TODO - [ ] Queue a follow-up `[patch]`-removing PR.	2021-09-25 05:58:36 +00:00
Paul Hauner	be11437c27	Batch BLS verification for attestations (#2399 ) ## Issue Addressed NA ## Proposed Changes Adds the ability to verify batches of aggregated/unaggregated attestations from the network. When the `BeaconProcessor` finds there are messages in the aggregated or unaggregated attestation queues, it will first check the length of the queue: - `== 1` verify the attestation individually. - `>= 2` take up to 64 of those attestations and verify them in a batch. Notably, we only perform batch verification if the queue has a backlog. We don't apply any artificial delays to attestations to try and force them into batches. ### Batching Details To assist with implementing batches we modify `beacon_chain::attestation_verification` to have two distinct categories for attestations: - Indexed attestations: those which have passed initial validation and were valid enough for us to derive an `IndexedAttestation`. - Verified attestations: those attestations which were indexed and also passed signature verification. These are well-formed, interesting messages which were signed by validators. The batching functions accept `n` attestations and then return `n` attestation verification `Result`s, where those `Result`s can be any combination of `Ok` or `Err`. In other words, we attempt to verify as many attestations as possible and return specific per-attestation results so peer scores can be updated, if required. When we batch verify attestations, we first try to map all those attestations to indexed attestations. If any of those attestations were able to be indexed, we then perform batch BLS verification on those indexed attestations. If the batch verification succeeds, we convert them into verified attestations, disabling individual signature checking. If the batch fails, we convert to verified attestations with individual signature checking enabled. Ultimately, we optimistically try to do a batch verification of attestation signatures and fall-back to individual verification if it fails. This opens an attach vector for "poisoning" the attestations and causing us to waste a batch verification. I argue that peer scoring should do a good-enough job of defending against this and the typical-case gains massively outweigh the worst-case losses. ## Additional Info Before this PR, attestation verification took the attestations by value (instead of by reference). It turns out that this was unnecessary and, in my opinion, resulted in some undesirable ergonomics (e.g., we had to pass the attestation back in the `Err` variant to avoid clones). In this PR I've modified attestation verification so that it now takes a reference. I refactored the `beacon_chain/tests/attestation_verification.rs` tests so they use a builder-esque "tester" struct instead of a weird macro. It made it easier for me to test individual/batch with the same set of tests and I think it was a nice tidy-up. Notably, I did this last to try and make sure my new refactors to actual production code would pass under the existing test suite.	2021-09-22 08:49:41 +00:00
Michael Sproul	9667dc2f03	Implement checkpoint sync (#2244 ) ## Issue Addressed Closes #1891 Closes #1784 ## Proposed Changes Implement checkpoint sync for Lighthouse, enabling it to start from a weak subjectivity checkpoint. ## Additional Info - [x] Return unavailable status for out-of-range blocks requested by peers (#2561) - [x] Implement sync daemon for fetching historical blocks (#2561) - [x] Verify chain hashes (either in `historical_blocks.rs` or the calling module) - [x] Consistency check for initial block + state - [x] Fetch the initial state and block from a beacon node HTTP endpoint - [x] Don't crash fetching beacon states by slot from the API - [x] Background service for state reconstruction, triggered by CLI flag or API call. Considered out of scope for this PR: - Drop the requirement to provide the `--checkpoint-block` (this would require some pretty heavy refactoring of block verification) Co-authored-by: Diva M <divma@protonmail.com>	2021-09-22 00:37:28 +00:00
Wink Saville	4755d4b236	Update sloggers to v2.0.2 (#2588 ) fixes #2584	2021-09-14 06:48:26 +00:00
realbigsean	50321c6671	Updates to make crates publishable (#2472 ) ## Issue Addressed Related to: #2259 Made an attempt at all the necessary updates here to publish the crates to crates.io. I incremented the minor versions on all the crates that have been previously published. We still might run into some issues as we try to publish because I'm not able to test this out but I think it's a good starting point. ## Proposed Changes - Add description and license to `ssz_types` and `serde_util` - rename `serde_util` to `eth2_serde_util` - increment minor versions - remove path dependencies - remove patch dependencies ## Additional Info Crates published: - [x] `tree_hash` -- need to publish `tree_hash_derive` and `eth2_hashing` first - [x] `eth2_ssz_types` -- need to publish `eth2_serde_util` first - [x] `tree_hash_derive` - [x] `eth2_ssz` - [x] `eth2_ssz_derive` - [x] `eth2_serde_util` - [x] `eth2_hashing` Co-authored-by: realbigsean <seananderson33@gmail.com>	2021-09-03 01:10:25 +00:00
Pawan Dhananjay	5a3bcd2904	Validator monitor support for sync committees (#2476 ) ## Issue Addressed N/A ## Proposed Changes Add functionality in the validator monitor to provide sync committee related metrics for monitored validators. Co-authored-by: Michael Sproul <michael@sigmaprime.io>	2021-08-31 23:31:36 +00:00
Paul Hauner	44fa54004c	Persist to DB after setting canonical head (#2547 ) ## Issue Addressed NA ## Proposed Changes Missed head votes on attestations is a well-known issue. The primary cause is a block getting set as the head after the attestation deadline. This PR aims to shorten the overall time between "block received" and "block set as head" by: 1. Persisting the head and fork choice after setting the canonical head - Informal measurements show this takes ~200ms 1. Pruning the op pool after setting the canonical head. 1. No longer persisting the op pool to disk during `BeaconChain::fork_choice` - Informal measurements show this can take up to 1.2s. I also add some metrics to help measure the effect of these changes. Persistence changes like this run the risk of breaking assumptions downstream. However, I have considered these risks and I think we're fine here. I will describe my reasoning for each change. ## Reasoning ### Change 1: Persisting the head and fork choice after setting the canonical head For (1), although the function is called `persist_head_and_fork_choice`, it only persists: - Fork choice - Head tracker - Genesis block root Since `BeaconChain::fork_choice_internal` does not modify these values between the original time we were persisting it and the current time, I assert that the change I've made is non-substantial in terms of what ends up on-disk. There's the possibility that some other thread has modified fork choice in the extra time we've given it, but that's totally fine. Since the only time we read those values from disk is during startup, I assert that this has no impact during runtime. ### Change 2: Pruning the op pool after setting the canonical head Similar to the argument above, we don't modify the op pool during `BeaconChain::fork_choice_internal` so it shouldn't matter when we prune. This change should be non-substantial. ### Change 3: No longer persisting the op pool to disk during `BeaconChain::fork_choice` This change is substantial. With the proposed changes, we'll only be persisting the op pool to disk when we shut down cleanly (i.e., the `BeaconChain` gets dropped). This means we'll save disk IO and time during usual operation, but a `kill -9` or similar "crash" will probably result in an out-of-date op pool when we reboot. An out-of-date op pool can only have an impact when producing blocks or aggregate attestations/sync committees. I think it's pretty reasonable that a crash might result in an out-of-date op pool, since: - Crashes are fairly rare. Practically the only time I see LH suffer a full crash is when the OOM killer shows up, and that's a very serious event. - It's generally quite rare to produce a block/aggregate immediately after a reboot. Just a few slots of runtime is probably enough to have a decent-enough op pool again. ## Additional Info Credits to @macladson for the timings referenced here.	2021-08-31 04:48:21 +00:00
Michael Sproul	10945e0619	Revert bad blocks on missed fork (#2529 ) ## Issue Addressed Closes #2526 ## Proposed Changes If the head block fails to decode on start up, do two things: 1. Revert all blocks between the head and the most recent hard fork (to `fork_slot - 1`). 2. Reset fork choice so that it contains the new head, and all blocks back to the new head's finalized checkpoint. ## Additional Info I tweaked some of the beacon chain test harness stuff in order to make it generic enough to test with a non-zero slot clock on start-up. In the process I consolidated all the various `new_` methods into a single generic one which will hopefully serve all future uses 🤞	2021-08-30 06:41:31 +00:00

1 2 3 4 5 ...

710 Commits