lighthouse

Author	SHA1	Message	Date
realbigsean	81d078bfc7	remove strict fee recipient docs (#3551 ) ## Issue Addressed Related: #3550 Remove references to the `--strict-fee-recipient` flag in docs. The flag will cause missed proposals prior to the merge transition. Co-authored-by: realbigsean <sean@sigmaprime.io>	2022-09-08 00:06:25 +00:00
Alexander Cyon	419c53bf24	Add flag 'log-color' preserving color of log redirected to file. (#3538 ) Add flag 'log-color' which preserves colors of log when stdout is redirected to a file. This is my first lighthouse PR, please let me know if I'm not following contribution guidelines, I welcome meta-feeback (feedback on git commit messages, git branch naming, and the contents of the description of this PR.) ## Issue Addressed Solves https://github.com/sigp/lighthouse/issues/3527 ## Proposed Changes Adding a flag which enables log color preserving when stdout is redirected to a file. ### Usage Below I demonstrate current behaviour (without using the new flag) and the new behaviur (when using new flag). In the screenshot below, I have to panes, one on top running `lighthouse` which redirects to file `~/Desktop/test.log` and one pane in the bottom which runs `tail -f ~/Desktop/test.log`. #### Current behaviour ```sh lighthouse --network prater vc \|& tee -a ~/Desktop/test.log ``` Result is no colors <img width="1624" alt="current" src="https://user-images.githubusercontent.com/864410/188258226-bfcf8271-4c9e-474c-848e-ac92a60df25c.png"> #### New behaviour ```sh lighthouse --network prater vc --log-color \|& tee -a ~/Desktop/test.log ``` Result is colors 🔴🟢🔵🟡 <img width="1624" alt="new" src="https://user-images.githubusercontent.com/864410/188258223-7d9ecf09-92c8-4cba-8f24-bd4d88fc0353.png"> ## Additional Info I chose American spelling of "color" instead of Brittish "colour' since that was aligned with `slog`'s API - method`force_color()`, let me know if you prefer spelling "colour" instead. I also chose to let it be an arg not taking any argument, just like `logfile-compress` flag, rather than having to write `--log-color true`.	2022-09-06 05:58:27 +00:00
ZZ	528e150e53	Update graffiti.md (#3537 ) fix typo	2022-09-05 08:29:02 +00:00
Michael Sproul	9a7f7f1c1e	Configurable monitoring endpoint frequency (#3530 ) ## Issue Addressed Closes #3514 ## Proposed Changes - Change default monitoring endpoint frequency to 120 seconds to fit with 30k requests/month limit. - Allow configuration of the monitoring endpoint frequency using `--monitoring-endpoint-frequency N` where `N` is a value in seconds.	2022-09-05 08:29:00 +00:00
realbigsean	177aef8f1e	Builder profit threshold flag (#3534 ) ## Issue Addressed Resolves https://github.com/sigp/lighthouse/issues/3517 ## Proposed Changes Adds a `--builder-profit-threshold <wei value>` flag to the BN. If an external payload's value field is less than this value, the local payload will be used. The value of the local payload will not be checked (it can't really be checked until the engine API is updated to support this). Co-authored-by: realbigsean <sean@sigmaprime.io>	2022-09-05 04:50:49 +00:00
omahs	95c56630a6	Fixing a few typos / documentation (#3531 ) Fixing a few typos in the documentation	2022-09-05 04:50:48 +00:00
realbigsean	cae40731a2	Strict count unrealized (#3522 ) ## Issue Addressed Add a flag that can increase count unrealized strictness, defaults to false ## Proposed Changes Please list or describe the changes introduced by this PR. ## Additional Info Please provide any additional information. For example, future considerations or information useful for reviewers. Co-authored-by: realbigsean <seananderson33@gmail.com> Co-authored-by: sean <seananderson33@gmail.com>	2022-09-05 04:50:47 +00:00
MaboroshiChan	f13dd04f42	Add timeout for --checkpoint-sync-url (#3521 ) ## Issue Addressed [Have --checkpoint-sync-url timeout](https://github.com/sigp/lighthouse/issues/3478) ## Proposed Changes I added a parameter for `get_bytes_opt_accept_header<U: IntoUrl>` which accept a timeout duration, and modified the body of `get_beacon_blocks_ssz` and `get_debug_beacon_states_ssz` to pass corresponding timeout durations.	2022-09-05 04:50:46 +00:00
Mac L	80359d8ddb	Fix attestation performance API `InvalidValidatorIndex` error (#3503 ) ## Issue Addressed When requesting an index which is not active during `start_epoch`, Lighthouse returns: ``` curl "http://localhost:5052/lighthouse/analysis/attestation_performance/999999999?start_epoch=100000&end_epoch=100000" ``` ```json { "code": 500, "message": "INTERNAL_SERVER_ERROR: ParticipationCache(InvalidValidatorIndex(999999999))", "stacktraces": [] } ``` This error occurs even when the index in question becomes active before `end_epoch` which is undesirable as it can prevent larger queries from completing. ## Proposed Changes In the event the index is out-of-bounds (has not yet been activated), simply return all fields as `false`: ``` -> curl "http://localhost:5052/lighthouse/analysis/attestation_performance/999999999?start_epoch=100000&end_epoch=100000" ``` ```json [ { "index": 999999999, "epochs": { "100000": { "active": false, "head": false, "target": false, "source": false } } } ] ``` By doing this, we cover the case where a validator becomes active sometime between `start_epoch` and `end_epoch`. ## Additional Info Note that this error only occurs for epochs after the Altair hard fork.	2022-09-05 04:50:45 +00:00
Divma	473abc14ca	Subscribe to subnets only when needed (#3419 ) ## Issue Addressed We currently subscribe to attestation subnets as soon as the subscription arrives (one epoch in advance), this makes it so that subscriptions for future slots are scheduled instead of done immediately. ## Proposed Changes - Schedule subscriptions to subnets for future slots. - Finish removing hashmap_delay, in favor of [delay_map](https://github.com/AgeManning/delay_map). This was the only remaining service to do this. - Subscriptions for past slots are rejected, before we would subscribe for one slot. - Add a new test for subscriptions that are not consecutive. ## Additional Info This is also an effort in making the code easier to understand	2022-09-05 00:22:48 +00:00
Paul Hauner	aa022f4685	v3.1.0 (#3525 ) ## Issue Addressed NA ## Proposed Changes - Bump versions ## Additional Info - ~~Blocked on #3508~~ - ~~Blocked on #3526~~ - ~~Requires additional testing.~~ - Expected release date is 2022-09-01	2022-08-31 22:21:55 +00:00
Pawan Dhananjay	c5785887a9	Log fee recipients in VC (#3526 ) ## Issue Addressed Resolves #3524 ## Proposed Changes Log fee recipient in the `Validator exists in beacon chain` log. Logging in the BN already happens here `18c61a5e8b/beacon_node/beacon_chain/src/beacon_chain.rs (L3858-L3865)` I also think it's good practice to encourage users to set the fee recipient in the VC rather than the BN because of issues mentioned here https://github.com/sigp/lighthouse/issues/3432 Some example logs from prater: ``` Aug 30 03:47:09.922 INFO Validator exists in beacon chain fee_recipient: 0xab97_ad88, validator_index: 213615, pubkey: 0xb542b69ba14ddbaf717ca1762ece63a4804c08d38a1aadf156ae718d1545942e86763a1604f5065d4faa550b7259d651, service: duties Aug 30 03:48:05.505 INFO Validator exists in beacon chain fee_recipient: Fee recipient for validator not set in validator_definitions.yml or provided with the `--suggested-fee-recipient flag`, validator_index: 210710, pubkey: 0xad5d67cc7f990590c7b3fa41d593c4cf12d9ead894be2311fbb3e5c733d8c1b909e9d47af60ea3480fb6b37946c35390, service: duties ``` Co-authored-by: Paul Hauner <paul@paulhauner.com>	2022-08-30 05:47:32 +00:00
Paul Hauner	661307dce1	Separate committee subscriptions queue (#3508 ) ## Issue Addressed NA ## Proposed Changes As we've seen on Prater, there seems to be a correlation between these messages ``` WARN Not enough time for a discovery search subnet_id: ExactSubnet { subnet_id: SubnetId(19), slot: Slot(3742336) }, service: attestation_service ``` ... and nodes falling 20-30 slots behind the head for short periods. These nodes are running ~20k Prater validators. After running some metrics, I can see that the `network_recv` channel is processing ~250k `AttestationSubscribe` messages per minute. It occurred to me that perhaps the `AttestationSubscribe` messages are "washing out" the `SendRequest` and `SendResponse` messages. In this PR I separate the `AttestationSubscribe` and `SyncCommitteeSubscribe` messages into their own queue so the `tokio::select!` in the `NetworkService` can still process the other messages in the `network_recv` channel without necessarily having to clear all the subscription messages first. ~~I've also added filter to the HTTP API to prevent duplicate subscriptions going to the network service.~~ ## Additional Info - Currently being tested on Prater	2022-08-30 05:47:31 +00:00
realbigsean	ebd0e0e2d9	Docker builds in GitHub actions (#3523 ) ## Issue Addressed I think the antithesis is failing due to an OOM which may be resolved by updating the ubuntu image it runs on. The lcli build looks like it's failing because the image lacks the `libclang` dependency Co-authored-by: realbigsean <sean@sigmaprime.io>	2022-08-29 18:31:27 +00:00
Michael Sproul	7a50684741	Harden slot notifier against clock drift (#3519 ) ## Issue Addressed Partly resolves #3518 ## Proposed Changes Change the slot notifier to use `duration_to_next_slot` rather than an interval timer. This makes it robust against underlying clock changes.	2022-08-29 14:34:43 +00:00
Paul Hauner	1a833ecc17	Add more logging for invalid payloads (#3515 ) ## Issue Addressed NA ## Proposed Changes Adds more `debug` logging to help troubleshoot invalid execution payload blocks. I was doing some of this recently and found it to be challenging. With this PR we should be able to grep `Invalid execution payload` and get one-liners that will show the block, slot and details about the proposer. I also changed the log in `process_invalid_execution_payload` since it was a little misleading; the `block_root` wasn't necessary the block which had an invalid payload. ## Additional Info NA	2022-08-29 14:34:42 +00:00
Paul Hauner	8609cced0e	Reset payload statuses when resuming fork choice (#3498 ) ## Issue Addressed NA ## Proposed Changes This PR is motivated by a recent consensus failure in Geth where it returned `INVALID` for an `VALID` block. Without this PR, the only way to recover is by re-syncing Lighthouse. Whilst ELs "shouldn't have consensus failures", in reality it's something that we can expect from time to time due to the complex nature of Ethereum. Being able to recover easily will help the network recover and EL devs to troubleshoot. The risk introduced with this PR is that genuinely INVALID payloads get a "second chance" at being imported. I believe the DoS risk here is negligible since LH needs to be restarted in order to re-process the payload. Furthermore, there's no reason to think that a well-performing EL will accept a truly invalid payload the second-time-around. ## Additional Info This implementation has the following intricacies: 1. Instead of just resetting invalid payloads to optimistic, we'll also reset valid payloads. This is an artifact of our existing implementation. 1. We will only reset payload statuses when we detect an invalid payload present in `proto_array` - This helps save us from forgetting that all our blocks are valid in the "best case scenario" where there are no invalid blocks. 1. If we fail to revert the payload statuses we'll log a `CRIT` and just continue with a `proto_array` that does not have reverted payload statuses. - The code to revert statuses needs to deal with balances and proposer-boost, so it's a failure point. This is a defensive measure to avoid introducing new show-stopping bugs to LH.	2022-08-29 14:34:41 +00:00
realbigsean	2ce86a0830	Validator registration request failures do not cause us to mark BNs offline (#3488 ) ## Issue Addressed Relates to https://github.com/sigp/lighthouse/issues/3416 ## Proposed Changes - Add an `OfflineOnFailure` enum to the `first_success` method for querying beacon nodes so that a val registration request failure from the BN -> builder does not result in the BN being marked offline. This seems important because these failures could be coming directly from a connected relay and actually have no bearing on BN health. Other messages that are sent to a relay have a local fallback so shouldn't result in errors - Downgrade the following log to a `WARN` ``` ERRO Unable to publish validator registrations to the builder network, error: All endpoints failed https://BN_B => RequestFailed(ServerMessage(ErrorMessage { code: 500, message: "UNHANDLED_ERROR: BuilderMissing", stacktraces: [] })), https://XXXX/ => Unavailable(Offline), [omitted] ``` ## Additional Info I think this change at least improves the UX of having a VC connected to some builder and some non-builder beacon nodes. I think we need to balance potentially alerting users that there is a BN <> VC misconfiguration and also allowing this type of fallback to work. If we want to fully support this type of configuration we may want to consider adding a flag `--builder-beacon-nodes` and track whether a VC should be making builder queries on a per-beacon node basis. But I think the changes in this PR are independent of that type of extension. PS: Sorry for the big diff here, it's mostly formatting changes after I added a new arg to a bunch of methods calls. Co-authored-by: realbigsean <sean@sigmaprime.io>	2022-08-29 11:35:59 +00:00
Michael Sproul	66eca1a882	Refactor op pool for speed and correctness (#3312 ) ## Proposed Changes This PR has two aims: to speed up attestation packing in the op pool, and to fix bugs in the verification of attester slashings, proposer slashings and voluntary exits. The changes are bundled into a single database schema upgrade (v12). Attestation packing is sped up by removing several inefficiencies: - No more recalculation of `attesting_indices` during packing. - No (unnecessary) examination of the `ParticipationFlags`: a bitfield suffices. See `RewardCache`. - No re-checking of attestation validity during packing: the `AttestationMap` provides attestations which are "correct by construction" (I have checked this using Hydra). - No SSZ re-serialization for the clunky `AttestationId` type (it can be removed in a future release). So far the speed-up seems to be roughly 2-10x, from 500ms down to 50-100ms. Verification of attester slashings, proposer slashings and voluntary exits is fixed by: - Tracking the `ForkVersion`s that were used to verify each message inside the `SigVerifiedOp`. This allows us to quickly re-verify that they match the head state's opinion of what the `ForkVersion` should be at the epoch(s) relevant to the message. - Storing the `SigVerifiedOp` on disk rather than the raw operation. This allows us to continue track the fork versions after a reboot. This is mostly contained in this commit 52bb1840ae5c4356a8fc3a51e5df23ed65ed2c7f. ## Additional Info The schema upgrade uses the justified state to re-verify attestations and compute `attesting_indices` for them. It will drop any attestations that fail to verify, by the logic that attestations are most valuable in the few slots after they're observed, and are probably stale and useless by the time a node restarts. Exits and proposer slashings and similarly re-verified to obtain `SigVerifiedOp`s. This PR contains a runtime killswitch `--paranoid-block-proposal` which opts out of all the optimisations in favour of closely verifying every included message. Although I'm quite sure that the optimisations are correct this flag could be useful in the event of an unforeseen emergency. Finally, you might notice that the `RewardCache` appears quite useless in its current form because it is only updated on the hot-path immediately before proposal. My hope is that in future we can shift calls to `RewardCache::update` into the background, e.g. while performing the state advance. It is also forward-looking to `tree-states` compatibility, where iterating and indexing `state.{previous,current}_epoch_participation` is expensive and needs to be minimised.	2022-08-29 09:10:26 +00:00
Michael Sproul	1c9ec42dcb	More merge doc updates (#3509 ) ## Proposed Changes Address a few shortcomings of the book noticed by users: - Remove description of redundant execution nodes - Use an Infura eth1 node rather than an eth2 node in the merge migration example - Add an example of the fee recipient address format (we support addresses without the 0x prefix, but 0x prefixed feels more canonical). - Clarify that Windows support is no longer beta - Add a link to the MSRV to the build-from-source instructions	2022-08-26 21:47:50 +00:00
Paul Hauner	c64e17bb81	Return `readonly: false` for local keystores (#3490 ) ## Issue Addressed NA ## Proposed Changes Indicate that local keystores are `readonly: Some(false)` rather than `None` via the `/eth/v1/keystores` method on the VC API. I'll mark this as backwards-incompat so we remember to mention it in the release notes. There aren't any type-level incompatibilities here, just a change in how Lighthouse responds to responses. ## Additional Info - Blocked on #3464	2022-08-24 23:35:00 +00:00
Paul Hauner	ebd661783e	Enable `block_lookup_failed` EF test (#3489 ) ## Issue Addressed Resolves #3448 ## Proposed Changes Removes a known failure that wasn't actually a known failure. The tests declare this block invalid and we refuse to import it due to `ExecutionPayloadError(UnverifiedNonOptimisticCandidate)`. This is correct since there is only one "eth1" block included in this test and two are required to trigger the merge (pre- and post-TTD blocks). It is slot 1 (tick = 12s) when this block is imported so the import must be prevented by `SAFE_SLOTS_TO_IMPORT_OPTIMISTICALLY`. I'm not sure where I got the idea in #3448 that this test needed retrospective checking, that seems like a false assumption in hindsight. ## Additional Info - Blocked on #3464	2022-08-24 23:34:59 +00:00
realbigsean	cb132c622d	don't register exited or slashed validators with the builder api (#3473 ) ## Issue Addressed #3465 ## Proposed Changes Filter out any validator registrations for validators that are not `active` or `pending`. I'm adding this filtering the beacon node because all the information is readily available there. In other parts of the VC we are usually sending per-validator requests based on duties from the BN. And duties will only be provided for active validators so we don't have this type of filtering elsewhere in the VC. Co-authored-by: realbigsean <sean@sigmaprime.io>	2022-08-24 23:34:58 +00:00
Divma	8c69d57c2c	Pause sync when EE is offline (#3428 ) ## Issue Addressed #3032 ## Proposed Changes Pause sync when ee is offline. Changes include three main parts: - Online/offline notification system - Pause sync - Resume sync #### Online/offline notification system - The engine state is now guarded behind a new struct `State` that ensures every change is correctly notified. Notifications are only sent if the state changes. The new `State` is behind a `RwLock` (as before) as the synchronization mechanism. - The actual notification channel is a [tokio::sync::watch](https://docs.rs/tokio/latest/tokio/sync/watch/index.html) which ensures only the last value is in the receiver channel. This way we don't need to worry about message order etc. - Sync waits for state changes concurrently with normal messages. #### Pause Sync Sync has four components, pausing is done differently in each: - Block lookups: Disabled while in this state. We drop current requests and don't search for new blocks. Block lookups are infrequent and I don't think it's worth the extra logic of keeping these and delaying processing. If we later see that this is required, we can add it. - Parent lookups: Disabled while in this state. We drop current requests and don't search for new parents. Parent lookups are even less frequent and I don't think it's worth the extra logic of keeping these and delaying processing. If we later see that this is required, we can add it. - Range: Chains don't send batches for processing to the beacon processor. This is easily done by guarding the channel to the beacon processor and giving it access only if the ee is responsive. I find this the simplest and most powerful approach since we don't need to deal with new sync states and chain segments that are added while the ee is offline will follow the same logic without needing to synchronize a shared state among those. Another advantage of passive pause vs active pause is that we can still keep track of active advertised chain segments so that on resume we don't need to re-evaluate all our peers. - Backfill: Not affected by ee states, we don't pause. #### Resume Sync - Block lookups: Enabled again. - Parent lookups: Enabled again. - Range: Active resume. Since the only real pause range does is not sending batches for processing, resume makes all chains that are holding read-for-processing batches send them. - Backfill: Not affected by ee states, no need to resume. ## Additional Info QUESTION: Originally I made this to notify and change on synced state, but @pawanjay176 on talks with @paulhauner concluded we only need to check online/offline states. The upcheck function mentions extra checks to have a very up to date sync status to aid the networking stack. However, the only need the networking stack would have is this one. I added a TODO to review if the extra check can be removed Next gen of #3094 Will work best with #3439 Co-authored-by: Pawan Dhananjay <pawandhananjay@gmail.com>	2022-08-24 23:34:56 +00:00
Michael Sproul	aab4a8d2f2	Update docs for mainnet merge release (#3494 ) ## Proposed Changes Update the merge migration docs to encourage updating mainnet configs _now_! The docs are also updated to recommend _against_ `--suggested-fee-recipient` on the beacon node (https://github.com/sigp/lighthouse/issues/3432). Additionally the `--help` for the CLI is updated to match with a few small semantic changes: - `--execution-jwt` is no longer allowed without `--execution-endpoint`. We've ended up without a default for `--execution-endpoint`, so I think that's fine. - The flags related to the JWT are only allowed if `--execution-jwt` is provided.	2022-08-23 03:50:58 +00:00
Paul Hauner	18c61a5e8b	v3.0.0 (#3464 ) ## Issue Addressed NA ## Proposed Changes Bump versions to v3.0.0 ## Additional Info - ~~Blocked on #3439~~ - ~~Blocked on #3459~~ - ~~Blocked on #3463~~ - ~~Blocked on #3462~~ - ~~Requires further testing~~ Co-authored-by: Michael Sproul <michael@sigmaprime.io>	2022-08-22 03:43:08 +00:00
Paul Hauner	931153885c	Run per-slot fork choice at a further distance from the head (#3487 ) ## Issue Addressed NA ## Proposed Changes Run fork choice when the head is 256 slots from the wall-clock slot, rather than 4. The reason we don't always run FC is so that it doesn't slow us down during sync. As the comments state, setting the value to 256 means that we'd only have one interrupting fork-choice call if we were syncing at 20 slots/sec. ## Additional Info NA	2022-08-19 04:27:24 +00:00
Paul Hauner	df358b864d	Add metrics for EE `PayloadStatus` returns (#3486 ) ## Issue Addressed NA ## Proposed Changes Adds some metrics so we can track payload status responses from the EE. I think this will be useful for troubleshooting and alerting. I also bumped the `BecaonChain::per_slot_task` to `debug` since it doesn't seem too noisy and would have helped us with some things we were debugging in the past. ## Additional Info NA	2022-08-19 04:27:23 +00:00
Paul Hauner	043fa2153e	Revise EE peer penalites (#3485 ) ## Issue Addressed NA ## Proposed Changes Don't penalize peers for errors that might be caused by an honest optimistic node. ## Additional Info NA	2022-08-19 04:27:22 +00:00
Paul Hauner	a0605c4ee6	Bump EF tests to `v1.2.0 rc.3` (#3483 ) ## Issue Addressed NA ## Proposed Changes Bumps test vectors and ignores another weird MacOS file. ## Additional Info NA	2022-08-19 04:27:21 +00:00
Mac L	726d1b0d9b	Unblock CI by updating git submodules directly in execution integration tests (#3479 ) ## Issue Addressed Recent changes to the Nethermind codebase removed the `rocksdb` git submodule in favour of a `nuget` package. This appears to have broken our ability to build the latest release of Nethermind inside our integration tests. ## Proposed Changes ~Temporarily pin the version used for the Nethermind integration tests to `master`. This ensures we use the packaged version of `rocksdb`. This is only necessary until a new release of Nethermind is available.~ Use `git submodule update --init --recursive` to ensure the required submodules are pulled before building. Co-authored-by: Diva M <divma@protonmail.com>	2022-08-19 04:27:20 +00:00
Michael Sproul	c2604c47d6	Optimistic sync: remove justified block check (#3477 ) ## Issue Addressed Implements spec change https://github.com/ethereum/consensus-specs/pull/2881 ## Proposed Changes Remove the justified block check from `is_optimistic_candidate_block`.	2022-08-17 02:36:41 +00:00
Paul Hauner	7664776fc4	Add test for exits spanning epochs (#3476 ) ## Issue Addressed NA ## Proposed Changes Adds a test that was written whilst doing some testing. This PR does not make changes to production code, it just adds a test for already existing functionality. ## Additional Info NA	2022-08-17 02:36:40 +00:00
Michael Sproul	8255c8682e	Align engine API timeouts with spec (#3470 ) ## Proposed Changes Match the timeouts from the `execution-apis` spec. Our existing values were already quite close so I don't imagine this change to be very disruptive. The spec sets the timeout for `engine_getPayloadV1` to only 1 second, but we were already using a longer value of 2 seconds. I've kept the 2 second timeout as I don't think there's any need to fail faster when producing a payload. There's no timeout specified for `eth_syncing` so I've matched it to the shortest timeout from the spec (1 second). I think the previous value of 250ms was likely too low and could have been contributing to spurious timeouts, particularly for remote ELs. ## Additional Info The timeouts are defined on each endpoint in this document: https://github.com/ethereum/execution-apis/blob/main/src/engine/specification.md	2022-08-17 02:36:39 +00:00
Paul Hauner	d9d1288156	Add mainnet merge values 🐼 (#3462 ) ## Issue Addressed NA ## Proposed Changes Adds tentative values for the merge TTD and Bellatrix as per https://github.com/ethereum/consensus-specs/pull/2969 ## Additional Info - ~~Blocked on https://github.com/ethereum/consensus-specs/pull/2969~~	2022-08-17 02:36:38 +00:00
Michael Sproul	e5fc9f26bc	Log if no execution endpoint is configured (#3467 ) ## Issue Addressed Fixes an issue whereby syncing a post-merge network without an execution endpoint would silently stall. Sync swallows the errors from block verification so previously there was no indication in the logs for why the node couldn't sync. ## Proposed Changes Add an error log to the merge-readiness notifier for the case where the merge has already completed but no execution endpoint is configured.	2022-08-15 01:31:02 +00:00
Michael Sproul	25e3dc9300	Fix block verification and checkpoint sync caches (#3466 ) ## Issue Addressed Closes https://github.com/sigp/lighthouse/issues/2962 ## Proposed Changes Build all caches on the checkpoint state before storing it in the database. Additionally, fix a bug in `signature_verify_chain_segment` which prevented block verification from succeeding unless the previous epoch cache was already built. The previous epoch cache is required to verify the signatures of attestations included from previous epochs, even when all the blocks in the segment are from the same epoch. The comments around `signature_verify_chain_segment` have also been updated to reflect the fact that it should only be used on a chain of blocks from a single epoch. I believe this restriction had already been added at some point in the past and that the current comments were just outdated (and I think because the proposer shuffling can change in the next epoch based on the blocks applied in the current epoch that this limitation is essential).	2022-08-15 01:31:00 +00:00
Paul Hauner	f03f9ba680	Increase merge-readiness lookhead (#3463 ) ## Issue Addressed NA ## Proposed Changes Start issuing merge-readiness logs 2 weeks before the Bellatrix fork epoch. Additionally, if the Bellatrix epoch is specified and the use has configured an EL, always log merge readiness logs, this should benefit pro-active users. ### Lookahead Reasoning - Bellatrix fork is: - epoch 144896 - slot 4636672 - Unix timestamp: `1606824023 + (4636672 * 12) = 1662464087` - GMT: Tue Sep 06 2022 11:34:47 GMT+0000 - Warning start time is: - Unix timestamp: `1662464087 - 604800 * 2 = 1661254487` - GMT: Tue Aug 23 2022 11:34:47 GMT+0000 The [current expectation](https://discord.com/channels/595666850260713488/745077610685661265/1007445305198911569) is that EL and CL clients will releases out by Aug 22nd at the latest, then an EF announcement will go out on the 23rd. If all goes well, LH will start alerting users about merge-readiness just after the announcement. ## Additional Info NA	2022-08-15 01:30:59 +00:00
realbigsean	dd93aa8701	Standard gas limit api (#3450 ) ## Issue Addressed Resolves https://github.com/sigp/lighthouse/issues/3403 ## Proposed Changes Implements https://ethereum.github.io/keymanager-APIs/#/Gas%20Limit ## Additional Info N/A Co-authored-by: realbigsean <sean@sigmaprime.io>	2022-08-15 01:30:58 +00:00
Michael Sproul	92d597ad23	Modularise slasher backend (#3443 ) ## Proposed Changes Enable multiple database backends for the slasher, either MDBX (default) or LMDB. The backend can be selected using `--slasher-backend={lmdb,mdbx}`. ## Additional Info In order to abstract over the two library's different handling of database lifetimes I've used `Box::leak` to give the `Environment` type a `'static` lifetime. This was the only way I could think of using 100% safe code to construct a self-referential struct `SlasherDB`, where the `OpenDatabases` refers to the `Environment`. I think this is OK, as the `Environment` is expected to live for the life of the program, and both database engines leave the database in a consistent state after each write. The memory claimed for memory-mapping will be freed by the OS and appropriately flushed regardless of whether the `Environment` is actually dropped. We are depending on two `sigp` forks of `libmdbx-rs` and `lmdb-rs`, to give us greater control over MDBX OS support and LMDB's version.	2022-08-15 01:30:56 +00:00
Pawan Dhananjay	71fd0b42f2	Fix lints for Rust 1.63 (#3459 ) ## Issue Addressed N/A ## Proposed Changes Fix clippy lints for latest rust version 1.63. I have allowed the [derive_partial_eq_without_eq](https://rust-lang.github.io/rust-clippy/master/index.html#derive_partial_eq_without_eq) lint as satisfying this lint would result in more code that we might not want and I feel it's not required. Happy to fix this lint across lighthouse if required though.	2022-08-12 00:56:39 +00:00
Divma	f4ffa9e0b4	Handle processing results of non faulty batches (#3439 ) ## Issue Addressed Solves #3390 So after checking some logs @pawanjay176 got, we conclude that this happened because we blacklisted a chain after trying it "too much". Now here, in all occurrences it seems that "too much" means we got too many download failures. This happened very slowly, exactly because the batch is allowed to stay alive for very long times after not counting penalties when the ee is offline. The error here then was not that the batch failed because of offline ee errors, but that we blacklisted a chain because of download errors, which we can't pin on the chain but on the peer. This PR fixes that. ## Proposed Changes Adds a missing piece of logic so that if a chain fails for errors that can't be attributed to an objectively bad behavior from the peer, it is not blacklisted. The issue at hand occurred when new peers arrived claiming a head that had wrongfully blacklisted, even if the original peers participating in the chain were not penalized. Another notable change is that we need to consider a batch invalid if it processed correctly but its next non empty batch fails processing. Now since a batch can fail processing in non empty ways, there is no need to mark as invalid previous batches. Improves some logging as well. ## Additional Info We should do this regardless of pausing sync on ee offline/unsynced state. This is because I think it's almost impossible to ensure a processing result will reach in a predictable order with a synced notification from the ee. Doing this handles what I think are inevitable data races when we actually pause sync This also fixes a return that reports which batch failed and caused us some confusion checking the logs	2022-08-12 00:56:38 +00:00
realbigsean	a476ae4907	Linkcheck fix (#3452 ) ## Issue Addressed I think we're running into this in our linkcheck, so I'm going to frist verify linkcheck fails on the current version, and then try downgrading it to see if it passes https://github.com/chronotope/chrono/issues/755 Co-authored-by: realbigsean <sean@sigmaprime.io>	2022-08-11 10:08:36 +00:00
Alex Wied	e0f86588e6	lighthouse_version: Fix version string regex (#3451 ) ## Issue Addressed N/A ## Proposed Changes If the build tree is not a git repository, the unit test will fail. This PR fixes the issue. ## Additional Info N/A	2022-08-11 07:50:32 +00:00
Paul Hauner	4fc0cb121c	Remove some "wontfix" TODOs for the merge (#3449 ) ## Issue Addressed NA ## Proposed Changes Removes three types of TODOs: 1. `execution_layer/src/lib.rs`: It was [determined](https://github.com/ethereum/consensus-specs/issues/2636#issuecomment-988688742) that there is no action required here. 2. `beacon_processor/worker/gossip_methods.rs`: Removed TODOs relating to peer scoring that have already been addressed via `epe.penalize_peer()`. - It seems `cargo fmt` wanted to adjust some things here as well 🤷 3. `proto_array_fork_choice.rs`: it would be nice to remove that useless `bool` for cleanliness, but I don't think it's something we need to do and the TODO just makes things look messier IMO. ## Additional Info There should be no functional changes to the code in this PR. There are still some TODOs lingering, those ones require actual changes or more thought.	2022-08-10 13:06:46 +00:00
Michael Sproul	4e05f19fb5	Serve Bellatrix preset in BN API (#3425 ) ## Issue Addressed Resolves #3388 Resolves #2638 ## Proposed Changes - Return the `BellatrixPreset` on `/eth/v1/config/spec` by default. - Allow users to opt out of this by providing `--http-spec-fork=altair` (unless there's a Bellatrix fork epoch set). - Add the Altair constants from #2638 and make serving the constants non-optional (the `http-disable-legacy-spec` flag is deprecated). - Modify the VC to only read the `Config` and not to log extra fields. This prevents it from having to muck around parsing the `ConfigAndPreset` fields it doesn't need. ## Additional Info This change is backwards-compatible for the VC and the BN, but is marked as a breaking change for the removal of `--http-disable-legacy-spec`. I tried making `Config` a `superstruct` too, but getting the automatic decoding to work was a huge pain and was going to require a lot of hacks, so I gave up in favour of keeping the default-based approach we have now.	2022-08-10 07:52:59 +00:00
Pawan Dhananjay	c25934956b	Remove INVALID_TERMINAL_BLOCK (#3385 ) ## Issue Addressed Resolves #3379 ## Proposed Changes Remove instances of `InvalidTerminalBlock` in lighthouse and use `Invalid {latest_valid_hash: "0x0000000000000000000000000000000000000000000000000000000000000000"}` to represent that status.	2022-08-10 07:52:58 +00:00
Paul Hauner	2de26b20f8	Don't return errors on HTTP API for already-known messages (#3341 ) ## Issue Addressed - Resolves #3266 ## Proposed Changes Return 200 OK rather than an error when a block, attestation or sync message is already known. Presently, we will log return an error which causes a BN to go "offline" from the VCs perspective which causes the fallback mechanism to do work to try and avoid and upcheck offline nodes. This can be observed as instability in the `vc_beacon_nodes_available_count` metric. The current behaviour also causes scary logs for the user. There's nothing to actually be concerned about when we see duplicate messages, this can happen on fallback systems (see code comments). ## Additional Info NA	2022-08-10 07:52:57 +00:00
Brendan Timmons	052d5cf31f	fix: incorrectly formatted MEV link in Lighthouse Book (#3434 ) ## Issue Addressed N/A ## Proposed Changes Simply fix the incorrect formatting on markdown link. Co-authored-by: Michael Sproul <micsproul@gmail.com>	2022-08-09 06:05:17 +00:00
realbigsean	6f13727fbe	Don't use the builder network if the head is optimistic (#3412 ) ## Issue Addressed Resolves https://github.com/sigp/lighthouse/issues/3394 Adds a check in `is_healthy` about whether the head is optimistic when choosing whether to use the builder network. Co-authored-by: realbigsean <sean@sigmaprime.io>	2022-08-09 06:05:16 +00:00

1 2 3 4 5 ...

4725 Commits