lighthouse

Author	SHA1	Message	Date
realbigsean	cae40731a2	Strict count unrealized (#3522 ) ## Issue Addressed Add a flag that can increase count unrealized strictness, defaults to false Co-authored-by: realbigsean <seananderson33@gmail.com> Co-authored-by: sean <seananderson33@gmail.com>	2022-09-05 04:50:47 +00:00
Paul Hauner	8609cced0e	Reset payload statuses when resuming fork choice (#3498 ) ## Issue Addressed NA ## Proposed Changes This PR is motivated by a recent consensus failure in Geth where it returned `INVALID` for a `VALID` block. Without this PR, the only way to recover is by re-syncing Lighthouse. Whilst ELs "shouldn't have consensus failures", in reality it's something that we can expect from time to time due to the complex nature of Ethereum. Being able to recover easily will help the network recover and EL devs to troubleshoot. The risk introduced with this PR is that genuinely INVALID payloads get a "second chance" at being imported. I believe the DoS risk here is negligible since LH needs to be restarted in order to re-process the payload. Furthermore, there's no reason to think that a well-performing EL will accept a truly invalid payload the second-time-around. ## Additional Info This implementation has the following intricacies: 1. Instead of just resetting invalid payloads to optimistic, we'll also reset valid payloads. This is an artifact of our existing implementation. 1. We will only reset payload statuses when we detect an invalid payload present in `proto_array` - This helps save us from forgetting that all our blocks are valid in the "best case scenario" where there are no invalid blocks. 1. If we fail to revert the payload statuses we'll log a `CRIT` and just continue with a `proto_array` that does not have reverted payload statuses. - The code to revert statuses needs to deal with balances and proposer-boost, so it's a failure point. This is a defensive measure to avoid introducing new show-stopping bugs to LH.	2022-08-29 14:34:41 +00:00
Michael Sproul	66eca1a882	Refactor op pool for speed and correctness (#3312 ) ## Proposed Changes This PR has two aims: to speed up attestation packing in the op pool, and to fix bugs in the verification of attester slashings, proposer slashings and voluntary exits. The changes are bundled into a single database schema upgrade (v12). Attestation packing is sped up by removing several inefficiencies: - No more recalculation of `attesting_indices` during packing. - No (unnecessary) examination of the `ParticipationFlags`: a bitfield suffices. See `RewardCache`. - No re-checking of attestation validity during packing: the `AttestationMap` provides attestations which are "correct by construction" (I have checked this using Hydra). - No SSZ re-serialization for the clunky `AttestationId` type (it can be removed in a future release). So far the speed-up seems to be roughly 2-10x, from 500ms down to 50-100ms. Verification of attester slashings, proposer slashings and voluntary exits is fixed by: - Tracking the `ForkVersion`s that were used to verify each message inside the `SigVerifiedOp`. This allows us to quickly re-verify that they match the head state's opinion of what the `ForkVersion` should be at the epoch(s) relevant to the message. - Storing the `SigVerifiedOp` on disk rather than the raw operation. This allows us to continue tracking the fork versions after a reboot. This is mostly contained in this commit 52bb1840ae5c4356a8fc3a51e5df23ed65ed2c7f. ## Additional Info The schema upgrade uses the justified state to re-verify attestations and compute `attesting_indices` for them. It will drop any attestations that fail to verify, by the logic that attestations are most valuable in the few slots after they're observed, and are probably stale and useless by the time a node restarts. Exits and proposer slashings are similarly re-verified to obtain `SigVerifiedOp`s.
This PR contains a runtime killswitch `--paranoid-block-proposal` which opts out of all the optimisations in favour of closely verifying every included message. Although I'm quite sure that the optimisations are correct this flag could be useful in the event of an unforeseen emergency. Finally, you might notice that the `RewardCache` appears quite useless in its current form because it is only updated on the hot-path immediately before proposal. My hope is that in future we can shift calls to `RewardCache::update` into the background, e.g. while performing the state advance. It is also forward-looking to `tree-states` compatibility, where iterating and indexing `state.{previous,current}_epoch_participation` is expensive and needs to be minimised.	2022-08-29 09:10:26 +00:00
Michael Sproul	aab4a8d2f2	Update docs for mainnet merge release (#3494 ) ## Proposed Changes Update the merge migration docs to encourage updating mainnet configs _now_! The docs are also updated to recommend _against_ `--suggested-fee-recipient` on the beacon node (https://github.com/sigp/lighthouse/issues/3432). Additionally the `--help` for the CLI is updated to match with a few small semantic changes: - `--execution-jwt` is no longer allowed without `--execution-endpoint`. We've ended up without a default for `--execution-endpoint`, so I think that's fine. - The flags related to the JWT are only allowed if `--execution-jwt` is provided.	2022-08-23 03:50:58 +00:00
Michael Sproul	92d597ad23	Modularise slasher backend (#3443 ) ## Proposed Changes Enable multiple database backends for the slasher, either MDBX (default) or LMDB. The backend can be selected using `--slasher-backend={lmdb,mdbx}`. ## Additional Info In order to abstract over the two libraries' different handling of database lifetimes I've used `Box::leak` to give the `Environment` type a `'static` lifetime. This was the only way I could think of using 100% safe code to construct a self-referential struct `SlasherDB`, where the `OpenDatabases` refers to the `Environment`. I think this is OK, as the `Environment` is expected to live for the life of the program, and both database engines leave the database in a consistent state after each write. The memory claimed for memory-mapping will be freed by the OS and appropriately flushed regardless of whether the `Environment` is actually dropped. We are depending on two `sigp` forks of `libmdbx-rs` and `lmdb-rs`, to give us greater control over MDBX OS support and LMDB's version.	2022-08-15 01:30:56 +00:00
Michael Sproul	4e05f19fb5	Serve Bellatrix preset in BN API (#3425 ) ## Issue Addressed Resolves #3388 Resolves #2638 ## Proposed Changes - Return the `BellatrixPreset` on `/eth/v1/config/spec` by default. - Allow users to opt out of this by providing `--http-spec-fork=altair` (unless there's a Bellatrix fork epoch set). - Add the Altair constants from #2638 and make serving the constants non-optional (the `http-disable-legacy-spec` flag is deprecated). - Modify the VC to only read the `Config` and not to log extra fields. This prevents it from having to muck around parsing the `ConfigAndPreset` fields it doesn't need. ## Additional Info This change is backwards-compatible for the VC and the BN, but is marked as a breaking change for the removal of `--http-disable-legacy-spec`. I tried making `Config` a `superstruct` too, but getting the automatic decoding to work was a huge pain and was going to require a lot of hacks, so I gave up in favour of keeping the default-based approach we have now.	2022-08-10 07:52:59 +00:00
Justin Traglia	807bc8b0b3	Fix a few typos in option help strings (#3401 ) ## Proposed Changes Fixes a typo I noticed while looking at options.	2022-08-02 00:58:24 +00:00
Michael Sproul	fdfdb9b57c	Enable `count-unrealized` by default (#3389 ) ## Issue Addressed Enable https://github.com/sigp/lighthouse/pull/3322 by default on all networks. The feature can be opted out of using `--count-unrealized=false` (the CLI flag is updated to take a parameter).	2022-07-30 00:22:41 +00:00
realbigsean	6c2d8b2262	Builder Specs v0.2.0 (#3134 ) ## Issue Addressed https://github.com/sigp/lighthouse/issues/3091 Extends https://github.com/sigp/lighthouse/pull/3062, adding pre-bellatrix block support on blinded endpoints and allowing the normal proposal flow (local payload construction) on blinded endpoints. This resulted in better fallback logic because the VC will not have to switch endpoints on failure in the BN <> Builder API, the BN can just fallback immediately and without repeating block processing that it shouldn't need to. We can also keep VC fallback from the VC<>BN API's blinded endpoint to full endpoint. ## Proposed Changes - Pre-bellatrix blocks on blinded endpoints - Add a new `PayloadCache` to the execution layer - Better fallback-from-builder logic ## Todos - [x] Remove VC transition logic - [x] Add logic to only enable builder flow after Merge transition finalization - [x] Tests - [x] Fix metrics - [x] Rustdocs Co-authored-by: Mac L <mjladson@pm.me> Co-authored-by: realbigsean <sean@sigmaprime.io>	2022-07-30 00:22:37 +00:00
realbigsean	20ebf1f3c1	Realized unrealized experimentation (#3322 ) ## Issue Addressed Add a flag that optionally enables unrealized vote tracking. Would like to test out on testnets and benchmark differences in methods of vote tracking. This PR includes a DB schema upgrade to enable to new vote tracking style. Co-authored-by: realbigsean <sean@sigmaprime.io> Co-authored-by: Paul Hauner <paul@paulhauner.com> Co-authored-by: sean <seananderson33@gmail.com> Co-authored-by: Mac L <mjladson@pm.me>	2022-07-25 23:53:26 +00:00
realbigsean	a7da0677d5	Remove builder redundancy (#3294 ) ## Issue Addressed This PR is a subset of the changes in #3134. Unstable will still not function correctly with the new builder spec once this is merged, #3134 should be used on testnets ## Proposed Changes - Removes redundancy in "builders" (servers implementing the builder spec) - Renames `payload-builder` flag to `builder` - Moves from old builder RPC API to new HTTP API, but does not implement the validator registration API (implemented in https://github.com/sigp/lighthouse/pull/3194) Co-authored-by: sean <seananderson33@gmail.com> Co-authored-by: realbigsean <sean@sigmaprime.io>	2022-07-01 01:15:19 +00:00
Pawan Dhananjay	5de00b7ee8	Unify execution layer endpoints (#3214 ) ## Issue Addressed Resolves #3069 ## Proposed Changes Unify the `eth1-endpoints` and `execution-endpoints` flags in a backwards compatible way as described in https://github.com/sigp/lighthouse/issues/3069#issuecomment-1134219221 Users have 2 options: 1. Use multiple non auth execution endpoints for deposit processing pre-merge 2. Use a single jwt authenticated execution endpoint for both execution layer and deposit processing post merge Related https://github.com/sigp/lighthouse/issues/3118 To enable jwt authenticated deposit processing, this PR removes the calls to `net_version` as the `net` namespace is not exposed in the auth server in execution clients. Moving away from using `networkId` is a good step in my opinion as it doesn't provide us with any added guarantees over `chainId`. See https://github.com/ethereum/consensus-specs/issues/2163 and https://github.com/sigp/lighthouse/issues/2115 Co-authored-by: Paul Hauner <paul@paulhauner.com>	2022-06-29 09:07:09 +00:00
Michael Sproul	47d57a290b	Improve eth1 block cache sync (for Ropsten) (#3234 ) ## Issue Addressed Fix for the eth1 cache sync issue observed on Ropsten. ## Proposed Changes Ropsten blocks are so infrequent that they broke our algorithm for downloading eth1 blocks. We currently try to download forwards from the last block in our cache to the block with block number [`remote_highest_block - FOLLOW_DISTANCE + FOLLOW_DISTANCE / ETH1_BLOCK_TIME_TOLERANCE_FACTOR`](`6f732986f1/beacon_node/eth1/src/service.rs (L489-L492)`). With the tolerance set to 4 this is insufficient because we lag by 1536 blocks, which is more like ~14 hours on Ropsten. This results in us having an incomplete eth1 cache, because we should cache all blocks between -16h and -8h. Even if we were to set the tolerance to 2 for the largest allowance, we would only look back 1024 blocks which is still more than 8 hours. For example consider this block https://ropsten.etherscan.io/block/12321390. The block from 1536 blocks earlier is 14 hours and 20 minutes before it: https://ropsten.etherscan.io/block/12319854. The block from 1024 blocks earlier is https://ropsten.etherscan.io/block/12320366, 8 hours and 48 minutes before. - This PR introduces a new CLI flag called `--eth1-cache-follow-distance` which can be used to set the distance manually. - A new dynamic catchup mechanism is added which detects when the cache is lagging the true eth1 chain and tries to download more blocks within the follow distance in order to catch up.	2022-06-03 06:05:03 +00:00
Michael Sproul	8fa032c8ae	Run fork choice before block proposal (#3168 ) ## Issue Addressed Upcoming spec change https://github.com/ethereum/consensus-specs/pull/2878 ## Proposed Changes 1. Run fork choice at the start of every slot, and wait for this run to complete before proposing a block. 2. As an optimisation, also run fork choice 3/4 of the way through the slot (at 9s), _dequeueing attestations for the next slot_. 3. Remove the fork choice run from the state advance timer that occurred before advancing the state. ## Additional Info ### Block Proposal Accuracy This change makes us more likely to propose on top of the correct head in the presence of re-orgs with proposer boost in play. The main scenario that this change is designed to address is described in the linked spec issue. ### Attestation Accuracy This change _also_ makes us more likely to attest to the correct head. Currently in the case of a skipped slot at `slot` we only run fork choice 9s into `slot - 1`. This means the attestations from `slot - 1` aren't taken into consideration, and any boost applied to the block from `slot - 1` is not removed (it should be). In the language of the linked spec issue, this means we are liable to attest to C, even when the majority voting weight has already caused a re-org to B. ### Why remove the call before the state advance? If we've run fork choice at the start of the slot then it has already dequeued all the attestations from the previous slot, which are the only ones eligible to influence the head in the current slot. Running fork choice again is unnecessary (unless we run it for the next slot and try to pre-empt a re-org, but I don't currently think this is a great idea). ### Performance Based on Prater testing this adds about 5-25ms of runtime to block proposal times, which are 500-1000ms on average (and spike to 5s+ sometimes due to state handling issues 😢 ). 
I believe this is a small enough penalty to enable it by default, with the option to disable it via the new flag `--fork-choice-before-proposal-timeout 0`. Upcoming work on block packing and state representation will also reduce block production times in general, while removing the spikes. ### Implementation Fork choice gets invoked at the start of the slot via the `per_slot_task` function called from the slot timer. It then uses a condition variable to signal to block production that fork choice has been updated. This is a bit funky, but it seems to work. One downside of the timer-based approach is that it doesn't happen automatically in most of the tests. The test added by this PR has to trigger the run manually.	2022-05-20 05:02:11 +00:00
Aren49	5ff4013263	Fix SPRP default value in cli (#3145 ) Changed SPRP to the correct default value of 8192.	2022-04-07 04:04:11 +00:00
realbigsean	ea783360d3	Kiln mev boost (#3062 ) ## Issue Addressed MEV boost compatibility ## Proposed Changes See #2987 ## Additional Info This is blocked on the stabilization of a couple specs, [here](https://github.com/ethereum/beacon-APIs/pull/194) and [here](https://github.com/flashbots/mev-boost/pull/20). Additional TODOs and outstanding questions - [ ] MEV boost JWT Auth - [ ] Will `builder_proposeBlindedBlock` return the revealed payload for the BN to propagate - [ ] Should we remove `private-tx-proposals` flag and communicate BN <> VC with blinded blocks by default once these endpoints enter the beacon-APIs repo? This simplifies merge transition logic. Co-authored-by: realbigsean <seananderson33@gmail.com> Co-authored-by: realbigsean <sean@sigmaprime.io>	2022-03-31 07:52:23 +00:00
Pawan Dhananjay	381d0ece3c	auth for engine api (#3046 ) ## Issue Addressed Resolves #3015 ## Proposed Changes Add JWT token based authentication to engine api requests. The jwt secret key is read from the provided file and is used to sign tokens that are used for authenticated communication with the EL node. - [x] Interop with geth (synced `merge-devnet-4` with the `merge-kiln-v2` branch on geth) - [x] Interop with other EL clients (nethermind on `merge-devnet-4`) - [x] ~Implement `zeroize` for jwt secrets~ - [x] Add auth server tests with `mock_execution_layer` - [x] Get auth working with the `execution_engine_integration` tests Co-authored-by: Paul Hauner <paul@paulhauner.com>	2022-03-08 06:46:24 +00:00
Age Manning	f3c1dde898	Filter non global ips from discovery (#3023 ) ## Issue Addressed #3006 ## Proposed Changes This PR changes the default behaviour of lighthouse to ignore discovered IPs that are not globally routable. It adds a CLI flag, --enable-local-discovery to permit the non-global IPs in discovery. NOTE: We should take care in merging this as it will break current set-ups that rely on local IP discovery. I made this the non-default behaviour because we don't really want to be wasting resources attempting to connect to non-routable addresses and we don't want to propagate these to others (on the chance we can connect to one of these local nodes), improving discovery's efficiency.	2022-03-02 03:14:27 +00:00
Age Manning	e34524be75	Increase default target-peer count to 80 (#3005 ) Increase the default peer count from 50 to 80	2022-03-02 01:05:07 +00:00
Philipp K	5388183884	Allow per validator fee recipient via flag or file in validator client (similar to graffiti / graffiti-file) (#2924 ) ## Issue Addressed #2883 ## Proposed Changes * Added `suggested-fee-recipient` & `suggested-fee-recipient-file` flags to validator client (similar to graffiti / graffiti-file implementation). * Added proposer preparation service to VC, which sends the fee-recipient of all known validators to the BN via [/eth/v1/validator/prepare_beacon_proposer](https://github.com/ethereum/beacon-APIs/pull/178) api once per slot * Added [/eth/v1/validator/prepare_beacon_proposer](https://github.com/ethereum/beacon-APIs/pull/178) api endpoint and preparation data caching * Added cleanup routine to remove cached proposer preparations when not updated for 2 epochs ## Additional Info Changed the Implementation following the discussion in #2883. Co-authored-by: pk910 <philipp@pk910.de> Co-authored-by: Paul Hauner <paul@paulhauner.com> Co-authored-by: Philipp K <philipp@pk910.de>	2022-02-08 19:52:20 +00:00
Age Manning	6f4102aab6	Network performance tuning (#2608 ) There is a pretty significant tradeoff between bandwidth and speed of gossipsub messages. We can reduce our bandwidth usage considerably at the cost of minimally delaying gossipsub messages. The impact of delaying messages has not been analyzed thoroughly yet, however this PR in conjunction with some gossipsub updates shows considerable bandwidth reduction. This PR allows the user to set a CLI value (`network-load`) which is an integer in the range of 1 to 5 depending on their bandwidth appetite. 1 represents the least bandwidth but slowest message receiving and 5 represents the most bandwidth and fastest received message time. For low-bandwidth users it is likely to be more efficient to use a lower value. The default is set to 3, which currently represents a reduced bandwidth usage compared to the previous version of this PR. The previous lighthouse versions are equivalent to setting the `network-load` CLI to 4. This PR is awaiting a few gossipsub updates before we can get it into lighthouse.	2022-01-14 05:42:47 +00:00
Philipp K	668477872e	Allow value for beacon_node fee-recipient argument (#2884 ) ## Issue Addressed The fee-recipient argument of the beacon node does not allow a value to be specified: > $ lighthouse beacon_node --merge --fee-recipient "0x332E43696A505EF45b9319973785F837ce5267b9" > error: Found argument '0x332E43696A505EF45b9319973785F837ce5267b9' which wasn't expected, or isn't valid in this context > > USAGE: > lighthouse beacon_node --fee-recipient --merge > > For more information try --help ## Proposed Changes Allow specifying a value for the fee-recipient argument in beacon_node/src/cli.rs ## Additional Info I've added .takes_value(true) and successfully proposed a block in the kintsugi testnet with my own fee-recipient address instead of the hardcoded default. I think that was just missed as the argument does not make sense without a value :) Co-authored-by: pk910 <philipp@pk910.de> Co-authored-by: Michael Sproul <micsproul@gmail.com> Co-authored-by: Michael Sproul <michael@sigmaprime.io>	2022-01-07 01:21:42 +00:00
Michael Sproul	3b61ac9cbf	Optimise slasher DB layout and switch to MDBX (#2776 ) ## Issue Addressed Closes #2286 Closes #2538 Closes #2342 ## Proposed Changes Part II of major slasher optimisations after #2767 These changes will be backwards-incompatible due to the move to MDBX (and the schema change) 😱 * [x] Shrink attester keys from 16 bytes to 7 bytes. * [x] Shrink attester records from 64 bytes to 6 bytes. * [x] Separate `DiskConfig` from regular `Config`. * [x] Add configuration for the LRU cache size. * [x] Add a "migration" that deletes any legacy LMDB database.	2021-12-21 08:23:17 +00:00
Paul Hauner	1b56ebf85e	Kintsugi review comments (#2831 ) * Fix makefile * Return on invalid finalized block * Fix todo in gossip scoring * Require --merge for --fee-recipient * Bump eth2_serde_utils * Change schema versions * Swap hash/uint256 test_random impls * Use default for ExecutionPayload::empty * Check for DBs before removing * Remove kintsugi docker image * Fix CLI default value	2021-12-02 14:29:59 +11:00
Paul Hauner	afe59afacd	Ensure difficulty/hash/epoch overrides change the `ChainSpec` (#2798 ) * Unify loading of eth2_network_config * Apply overrides at lighthouse binary level * Remove duplicate override values * Add merge values to existing net configs * Make override flags global * Add merge fields to testing config * Add one to TTD * Fix failing engine tests * Fix test compile error * Remove TTD flags * Move get_eth2_network_config * Fix warn * Address review comments	2021-12-02 14:29:18 +11:00
realbigsean	de49c7ddaa	1.1.5 merge spec tests (#2781 ) * Fix arbitrary check kintsugi * Add merge chain spec fields, and a function to determine which constant to use based on the state variant * increment spec test version * Remove `Transaction` enum wrapper * Remove Transaction new-type * Remove gas validations * Add `--terminal-block-hash-epoch-override` flag * Increment spec tests version to 1.1.5 * Remove extraneous gossip verification https://github.com/ethereum/consensus-specs/pull/2687 * - Remove unused Error variants - Require both "terminal-block-hash-epoch-override" and "terminal-block-hash-override" when either flag is used * - Remove a couple more unused Error variants Co-authored-by: Paul Hauner <paul@paulhauner.com>	2021-12-02 14:26:55 +11:00
Paul Hauner	6b4cc63b57	Accept TTD override as decimal (#2676 )	2021-12-02 14:26:54 +11:00
Paul Hauner	b162b067de	Misc changes for merge testnets (#2667 ) * Thread eth1_block_hash into interop genesis state * Add merge-fork-epoch flag * Build LH with minimal spec by default * Add verbose logs to execution_layer * Add --http-allow-sync-stalled flag * Update lcli new-testnet to create genesis state * Fix http test * Fix compile errors in tests	2021-12-02 14:26:52 +11:00
Paul Hauner	d8623cfc4f	[Merge] Implement `execution_layer` (#2635 ) * Checkout serde_utils from rayonism * Make eth1::http functions pub * Add bones of execution_layer * Modify decoding * Expose Transaction, cargo fmt * Add executePayload * Add all minimal spec endpoints * Start adding json rpc wrapper * Finish custom JSON response handler * Switch to new rpc sending method * Add first test * Fix camelCase * Finish adding tests * Begin threading execution layer into BeaconChain * Fix clippy lints * Fix clippy lints * Thread execution layer into ClientBuilder * Add CLI flags * Add block processing methods to ExecutionLayer * Add block_on to execution_layer * Integrate execute_payload * Add extra_data field * Begin implementing payload handle * Send consensus valid/invalid messages * Fix minor type in task_executor * Call forkchoiceUpdated * Add search for TTD block * Thread TTD into execution layer * Allow producing block with execution payload * Add LRU cache for execution blocks * Remove duplicate 0x on ssz_types serialization * Add tests for block getter methods * Add basic block generator impl * Add is_valid_terminal_block to EL * Verify merge block in block_verification * Partially implement --terminal-block-hash-override * Add terminal_block_hash to ChainSpec * Remove Option from terminal_block_hash in EL * Revert merge changes to consensus/fork_choice * Remove commented-out code * Add bones for handling RPC methods on test server * Add first ExecutionLayer tests * Add testing for finding terminal block * Prevent infinite loops * Add insert_merge_block to block gen * Add block gen test for pos blocks * Start adding payloads to block gen * Fix clippy lints * Add execution payload to block gen * Add execute_payload to block_gen * Refactor block gen * Add all routes to mock server * Use Uint256 for base_fee_per_gas * Add working execution chain build * Remove unused var * Revert "Use Uint256 for base_fee_per_gas" This reverts commit 
6c88f19ac45db834dd4dbf7a3c6e7242c1c0f735. * Fix base_fee_for_gas Uint256 * Update execute payload handle * Improve testing, fix bugs * Fix default fee-recipient * Fix fee-recipient address (again) * Add check for terminal block, add comments, tidy * Apply suggestions from code review Co-authored-by: realbigsean <seananderson33@GMAIL.com> * Fix is_none on handle Drop * Remove commented-out tests Co-authored-by: realbigsean <seananderson33@GMAIL.com>	2021-12-02 14:26:51 +11:00
Michael Sproul	df02639b71	De-duplicate attestations in the slasher (#2767 ) ## Issue Addressed Closes https://github.com/sigp/lighthouse/issues/2112 Closes https://github.com/sigp/lighthouse/issues/1861 ## Proposed Changes Collect attestations by validator index in the slasher, and use the magic of reference counting to automatically discard redundant attestations. This results in us storing only 1-2% of the attestations observed when subscribed to all subnets, which carries over to a 50-100x reduction in data stored 🎉 ## Additional Info There's some nuance to the configuration of the `slot-offset`. It has a profound effect on the effectiveness of de-duplication, see the docs added to the book for an explanation: `5442e695e5/book/src/slasher.md (slot-offset)`	2021-11-08 00:01:09 +00:00
Michael Sproul	d2e3d4c6f1	Add flag to disable lock timeouts (#2714 ) ## Issue Addressed Mitigates #1096 ## Proposed Changes Add a flag to the beacon node called `--disable-lock-timeouts` which allows opting out of lock timeouts. The lock timeouts serve a dual purpose: 1. They prevent any single operation from hogging the lock for too long. When a timeout occurs it logs a nasty error which indicates that there's suboptimal lock use occurring, which we can then act on. 2. They allow deadlock detection. We're fairly sure there are no deadlocks left in Lighthouse anymore but the timeout locks offer a safeguard against that. However, timeouts on locks are not without downsides: They allow for the possibility of livelock, particularly on slower hardware. If lock timeouts keep failing spuriously the node can be prevented from making any progress, even if it would be able to make progress slowly without the timeout. One particularly concerning scenario which could occur would be if a DoS attack succeeded in slowing block signature verification times across the network, and all Lighthouse nodes got livelocked because they timed out repeatedly. This could also occur on just a subset of nodes (e.g. dual core VPSs or Raspberry Pis). By making the behaviour runtime configurable this PR allows us to choose the behaviour we want depending on circumstance. I suspect that long term we could make the timeout-free approach the default (#2381 moves in this direction) and just enable the timeouts on our testnet nodes for debugging purposes. This PR conservatively leaves the default as-is so we can gain some more experience before switching the default.	2021-10-19 00:30:40 +00:00
Mac L	a73d698e30	Add TLS capability to the beacon node HTTP API (#2668 ) Currently, the beacon node has no ability to serve the HTTP API over TLS. Adding this functionality would be helpful for certain use cases, such as when you need a validator client to connect to a backup beacon node which is outside your local network, and the use of an SSH tunnel or reverse proxy would be inappropriate. ## Proposed Changes - Add three new CLI flags to the beacon node - `--http-enable-tls`: enables TLS - `--http-tls-cert`: to specify the path to the certificate file - `--http-tls-key`: to specify the path to the key file - Update the HTTP API to optionally use `warp`'s [`TlsServer`](https://docs.rs/warp/0.3.1/warp/struct.TlsServer.html) depending on the presence of the `--http-enable-tls` flag - Update tests and docs - Use a custom branch for `warp` to ensure proper error handling ## Additional Info Serving the API over TLS should currently be considered experimental. The reason for this is that it uses code from an [unmerged PR](https://github.com/seanmonstar/warp/pull/717). This commit provides the `try_bind_with_graceful_shutdown` method to `warp`, which is helpful for controlling error flow when the TLS configuration is invalid (cert/key files don't exist, incorrect permissions, etc). I've implemented the same code in my [branch here](https://github.com/macladson/warp/tree/tls). Once the code has been reviewed and merged upstream into `warp`, we can remove the dependency on my branch and the feature can be considered more stable. Currently, the private key file must not be password-protected in order to be read into Lighthouse.	2021-10-12 03:35:49 +00:00
Michael Sproul	9667dc2f03	Implement checkpoint sync (#2244 ) ## Issue Addressed Closes #1891 Closes #1784 ## Proposed Changes Implement checkpoint sync for Lighthouse, enabling it to start from a weak subjectivity checkpoint. ## Additional Info - [x] Return unavailable status for out-of-range blocks requested by peers (#2561) - [x] Implement sync daemon for fetching historical blocks (#2561) - [x] Verify chain hashes (either in `historical_blocks.rs` or the calling module) - [x] Consistency check for initial block + state - [x] Fetch the initial state and block from a beacon node HTTP endpoint - [x] Don't crash fetching beacon states by slot from the API - [x] Background service for state reconstruction, triggered by CLI flag or API call. Considered out of scope for this PR: - Drop the requirement to provide the `--checkpoint-block` (this would require some pretty heavy refactoring of block verification) Co-authored-by: Diva M <divma@protonmail.com>	2021-09-22 00:37:28 +00:00
Pawan Dhananjay	b4dd98b3c6	Shutdown after sync (#2519 ) ## Issue Addressed Resolves #2033 ## Proposed Changes Adds a flag to enable shutting down the beacon node right after sync is completed. ## Additional Info Will need modification after weak subjectivity sync is enabled to change the definition of a fully synced node.	2021-08-30 13:46:13 +00:00
Pawan Dhananjay	d3b4cbed53	Packet filter cli option (#2523 ) ## Issue Addressed N/A ## Proposed Changes Adds a cli option to disable packet filter in `lighthouse bootnode`. This is useful in running local testnets as the bootnode bans requests from the same ip(localhost) if the packet filter is enabled.	2021-08-26 00:29:39 +00:00
Michael Sproul	b4689e20c6	Altair consensus changes and refactors (#2279 ) ## Proposed Changes Implement the consensus changes necessary for the upcoming Altair hard fork. ## Additional Info This is quite a heavy refactor, with pivotal types like the `BeaconState` and `BeaconBlock` changing from structs to enums. This ripples through the whole codebase with field accesses changing to methods, e.g. `state.slot` => `state.slot()`. Co-authored-by: realbigsean <seananderson33@gmail.com>	2021-07-09 06:15:32 +00:00
Pawan Dhananjay	fdaeec631b	Monitoring service api (#2251 ) ## Issue Addressed N/A ## Proposed Changes Adds a client side api for collecting system and process metrics and pushing it to a monitoring service.	2021-05-26 05:58:41 +00:00
Mac L	f6f64cf0f5	Correcting `disable-enr-auto-update` flag definition (#2303 ) ## Issue Addressed N/A ## Proposed Changes Correct the `disable-enr-auto-update` boolean flag so that it no longer requires a value. Previously it would require a value which was never used. ## Additional Info Flag is read here: https://github.com/sigp/lighthouse/blob/unstable/beacon_node/src/config.rs#L585-L587	2021-04-11 23:52:29 +00:00
Paul Hauner	2b2a358522	Detailed validator monitoring (#2151 ) ## Issue Addressed - Resolves #2064 ## Proposed Changes Adds a `ValidatorMonitor` struct which provides additional logging and Grafana metrics for specific validators. Use `lighthouse bn --validator-monitor` to automatically enable monitoring for any validator that hits the [subnet subscription](https://ethereum.github.io/eth2.0-APIs/#/Validator/prepareBeaconCommitteeSubnet) HTTP API endpoint. Also, use `lighthouse bn --validator-monitor-pubkeys` to supply a list of validators which will always be monitored. See the new docs included in this PR for more info. ## TODO - [x] Track validator balance, `slashed` status, etc. - [x] ~~Register slashings in current epoch, not offense epoch~~ - [ ] Publish Grafana dashboard, update TODO link in docs - [x] ~~#2130 is merged into this branch, resolve that~~	2021-01-20 19:19:38 +00:00
Michael Sproul	0c529b8d52	Add slasher broadcast (#2079 ) ## Issue Addressed Closes #2048 ## Proposed Changes * Broadcast slashings when the `--slasher-broadcast` flag is provided. * In the process of implementing this I refactored the slasher service into its own crate so that it could access the network code without creating a circular dependency. I moved the responsibility for putting slashings into the op pool into the service as well, as it makes sense for it to handle the whole slashing lifecycle.	2020-12-16 03:44:01 +00:00
Pawan Dhananjay	7933596c89	Add a purge-eth1-cache cli option (#2039 ) ## Issue Some eth1 clients are missing deposit logs on mainnet for multiple reasons (not fully synced, eth1 client issues) because of which we are getting `FailedToInsertDeposit` errors. Ideally, LH should pick up where it left off after pointing it to a nice eth1 client endpoint (which has all deposits). However, I have seen instances where LH keeps getting `FailedToInsertDeposit` even after switching to a good endpoint. Only deleting the beacon directory (which also wipes the eth1 cache) and resyncing the eth1 caches seems to be the solution. This wouldn't be great for mainnet if you have to sync your beacon node again as well. ## Proposed Changes Add a `--purge-eth1-db` option which just wipes the eth1 cache and doesn't touch the rest of the beacon db. Still need to investigate if and why LH isn't picking up where it left off for the deposit logs sync, but I think it would be good to have an option to just delete eth1 caches regardless.	2020-12-04 05:03:28 +00:00
realbigsean	fdfb81a74a	Server sent events (#1920 ) ## Issue Addressed Resolves #1434 (this is the last major feature in the standard spec. There are only a couple of places we may be off-spec due to recent spec changes or ongoing discussion) Partly addresses #1669 ## Proposed Changes - remove the websocket server - remove the `TeeEventHandler` and `NullEventHandler` - add server sent events according to the eth2 API spec ## Additional Info This is according to the currently unmerged PR here: https://github.com/ethereum/eth2.0-APIs/pull/117 Co-authored-by: realbigsean <seananderson33@gmail.com>	2020-12-04 00:18:58 +00:00
Age Manning	c718e81eaf	Add privacy option (#2016 ) Adds a `--privacy` CLI flag to the beacon node that users may opt into. This does two things: - Removes client identifying information from the identify libp2p protocol - Changes the default graffiti to "" if no graffiti is set.	2020-11-30 22:55:08 +00:00
blacktemplar	38b15deccb	Fallback nodes for eth1 access (#1918 ) ## Issue Addressed part of #1883 ## Proposed Changes Adds a new cli argument `--eth1-endpoints` that can be used instead of `--eth1-endpoint` to specify a comma-separated list of endpoints. If the first endpoint returns an error for some request the other endpoints are tried in the given order. ## Additional Info Currently if the first endpoint fails the fallbacks are used silently (except for `try_fallback_test_endpoint` that is used in `do_update` which logs a `WARN` for each endpoint that is not reachable). A question is if we should add more logs so that the user gets warned if his main endpoint is for example just slow and sometimes hits timeouts.	2020-11-27 08:37:44 +00:00
Michael Sproul	5828ff1204	Implement slasher (#1567 ) This is an implementation of a slasher that lives inside the BN and can be enabled via `lighthouse bn --slasher`. Features included in this PR: - [x] Detection of attester slashing conditions (double votes, surrounds existing, surrounded by existing) - [x] Integration into Lighthouse's attestation verification flow - [x] Detection of proposer slashing conditions - [x] Extraction of attestations from blocks as they are verified - [x] Compression of chunks - [x] Configurable history length - [x] Pruning of old attestations and blocks - [x] More tests Future work: * Focus on a slice of history separate from the most recent N epochs (e.g. epochs `current - K` to `current - M`) * Run out-of-process * Ingest attestations from the chain without a resync Design notes are here https://hackmd.io/@sproul/HJSEklmPL	2020-11-23 03:43:22 +00:00
Paul Hauner	65b1cf2af1	Add flag to import all attestations (#1941 ) ## Issue Addressed NA ## Proposed Changes Adds the `--import-all-attestations` flag which tells the `network::AttestationService` to import/aggregate all attestations after verification (instead of only ones for subnets that are relevant to local validators). This is useful for testing/debugging and also for creating back-up nodes that should be all cached up and ready for any validator. ## Additional Info NA	2020-11-22 23:58:25 +00:00
Paul Hauner	bcc7f6b143	Add new flag to set blocks per eth1 query (#1931 ) ## Issue Addressed NA ## Proposed Changes Users on Discord (and @protolambda) have experienced this error (or variants of it): ``` Failed to update eth1 cache: GetDepositLogsFailed("Eth1 node returned error: {\"code\":-32005,\"message\":\"query returned more than 10000 results\"}") ``` This PR allows users to reduce the span of blocks searched for deposit logs and therefore reduce the size of the return result. Hopefully experimentation with this flag can lead to finding a better default value. ## Additional Info NA	2020-11-18 22:18:59 +00:00
Michael Sproul	a60ab4eff2	Refine compaction (#1916 ) ## Proposed Changes In an attempt to fix OOM issues and database consistency issues observed by some users after the introduction of compaction in v0.3.4, this PR makes the following changes: * Run compaction less often: roughly every 1024 epochs, including after long periods of non-finality. I think the division check proposed by Paul is pretty solid, and ensures we don't miss any events where we should be compacting. LevelDB lacks an easy way to check the size of the DB, which would be another good trigger. * Make it possible to disable the compaction on finalization using `--auto-compact-db=false` * Make it possible to trigger a manual, single-threaded foreground compaction on start-up using `--compact-db` * Downgrade the pruning log to `DEBUG`, as it's particularly noisy during sync I would like to ship these changes to affected users ASAP, and will document them further in the Advanced Database section of the book if they prove effective.	2020-11-17 09:10:53 +00:00
Age Manning	c00e6c2c6f	Small network adjustments (#1884 ) ## Issue Addressed - Asymmetric pings - Currently with symmetric ping intervals, lighthouse nodes race each other to ping often ending in simultaneous ping connections. This shifts the ping interval to be asymmetric based on inbound/outbound connections - Correct inbound/outbound peer-db registering - It appears we were accounting inbound as outbound and vice versa in the peerdb, this has been corrected - Improved logging There is likely more to come - I'll leave this open as we investigate further testnets	2020-11-13 06:06:33 +00:00
Age Manning	64c5899d25	Adds colour help to bn and vc subcommands (#1811 ) Adds coloured help to the bn and vc subcommands	2020-10-23 04:16:34 +00:00
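
Note on #1918 (fallback nodes for eth1 access): the commit says `--eth1-endpoints` takes a comma-separated list and that, if the first endpoint returns an error, the others are tried in the given order. A minimal Rust sketch of that behaviour follows. It is not Lighthouse's actual code. The helper names `parse_endpoints` and `first_success` are hypothetical, and returning all collected errors on total failure is an assumption made for illustration.

```rust
/// Hypothetical helper: split a comma-separated flag value into endpoints,
/// trimming whitespace and dropping empty entries.
fn parse_endpoints(raw: &str) -> Vec<&str> {
    raw.split(',')
        .map(str::trim)
        .filter(|s| !s.is_empty())
        .collect()
}

/// Hypothetical helper: run `request` against each endpoint in the given
/// order and return the first successful result. If every endpoint fails,
/// return the errors in the order they occurred (an assumption of this sketch).
fn first_success<T, E, F>(endpoints: &[&str], mut request: F) -> Result<T, Vec<E>>
where
    F: FnMut(&str) -> Result<T, E>,
{
    let mut errors = Vec::new();
    for &endpoint in endpoints {
        match request(endpoint) {
            // Stop at the first endpoint that answers successfully.
            Ok(value) => return Ok(value),
            // Otherwise remember the error and fall through to the next one.
            Err(e) => errors.push(e),
        }
    }
    Err(errors)
}
```

A caller would pass the parsed list and a closure that performs one request. Later endpoints are contacted only after every earlier one has failed, which matches the "silent fallback" behaviour the commit message mentions.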


94 Commits