lighthouse

Author	SHA1	Message	Date
Jimmy Chen	6ec0ce6070	Implement get validator block endpoint for EIP-4844	2023-03-13 16:50:08 +11:00
Diva M	203a9e5f5e	Merge branch 'unstable' into eip4844	2023-03-10 11:19:56 -05:00
Paul Hauner	3ad77fb52c	Add VC metric for primary BN latency (#4051 ) ## Issue Addressed NA ## Proposed Changes In #4024 we added metrics to expose the latency measurements from a VC to each BN. Whilst playing with these new metrics on our infra I realised it would be great to have a single metric to make sure that the primary BN for each VC has a reasonable latency. With the current "metrics for all BNs" it's hard to tell which is the primary. ## Additional Info NA	2023-03-06 04:08:49 +00:00
Paul Hauner	f775404c10	Log a `WARN` in the VC for a mismatched Capella fork epoch (#4050 ) ## Issue Addressed NA ## Proposed Changes - Adds a `WARN` statement for Capella, just like the previous forks. - Adds a hint message to all those WARNs to suggest the user update the BN or VC. ## Additional Info NA	2023-03-06 04:08:48 +00:00
Michael Sproul	57dfcfd83a	Optimise attestation selection proof signing (#4033 ) ## Issue Addressed Closes #3963 (hopefully) ## Proposed Changes Compute attestation selection proofs gradually each slot rather than in a single `join_all` at the start of each epoch. On a machine with 5k validators this replaces 5k tasks signing 5k proofs with 1 task that signs 5k/32 ~= 160 proofs each slot. Based on testing with Goerli validators this seems to reduce the average time to produce a signature by preventing Tokio and the OS from falling over each other trying to run hundreds of threads. My testing so far has been with local keystores, which run on a dynamic pool of up to 512 OS threads because they use [`spawn_blocking`](https://docs.rs/tokio/1.11.0/tokio/task/fn.spawn_blocking.html) (and we haven't changed the default). An earlier version of this PR hyper-optimised the time-per-signature metric to the detriment of the entire system's performance (see the reverted commits). The current PR is conservative in that it avoids touching the attestation service at all. I think there's more optimising to do here, but we can come back for that in a future PR rather than expanding the scope of this one. The new algorithm for attestation selection proofs is: - We sign a small batch of selection proofs each slot, for slots up to 8 slots in the future. On average we'll sign one slot's worth of proofs per slot, with an 8 slot lookahead. - The batch is signed halfway through the slot when there is unlikely to be contention for signature production (blocks are <4s, attestations are ~4-6 seconds, aggregates are 8s+). ## Performance Data _See first comment for updated graphs_. Graph of median signing times before this PR: ![signing_times_median](https://user-images.githubusercontent.com/4452260/221495627-3ab3c105-319f-406e-b99d-b5913e0ded9c.png) Graph of update attesters metric (includes selection proof signing) before this PR: ![update_attesters_store](https://user-images.githubusercontent.com/4452260/221497057-01ba40e4-8148-45f6-9e21-36a9567a631a.png) Median signing time after this PR (prototype from 12:00, updated version from 13:30): ![signing_times_median_updated](https://user-images.githubusercontent.com/4452260/221771578-47a040cc-b832-482f-9a1a-d1bd9854e00e.png) 99th percentile on signing times (bounded attestation signing from 16:55, now removed): ![signing_times_99pc](https://user-images.githubusercontent.com/4452260/221772055-e64081a8-2220-45ba-ba6d-9d7e344a5bde.png) Attester map update timing after this PR: ![update_attesters_store_updated](https://user-images.githubusercontent.com/4452260/221771757-c8558a48-7f4e-4bb5-9929-dee177a66c1e.png) Selection proof signings per second change: ![signing_attempts](https://user-images.githubusercontent.com/4452260/221771855-64f5da22-1655-478d-926b-810be8a3650c.png) ## Link to late blocks I believe this is related to the slow block signings because logs from Stakely in #3963 show these two logs almost 5 seconds apart: > Feb 23 18:56:23.978 INFO Received unsigned block, slot: 5862880, service: block, module: validator_client::block_service:393 > Feb 23 18:56:28.552 INFO Publishing signed block, slot: 5862880, service: block, module: validator_client::block_service:416 The only thing that happens between those two logs is the signing of the block: `0fb58a680d/validator_client/src/block_service.rs (L410-L414)` Helpfully, Stakely noticed this issue without any Lighthouse BNs in the mix, which pointed to a clear issue in the VC. ## TODO - [x] Further testing on testnet infrastructure. - [x] Make the attestation signing parallelism configurable.	2023-03-05 23:43:31 +00:00
Paul Hauner	6e15533b54	Add latency measurement service to VC (#4024 ) ## Issue Addressed NA ## Proposed Changes Adds a service which periodically polls (11s into each mainnet slot) the `node/version` endpoint on each BN and roughly measures the round-trip latency. The latency is exposed as a `DEBG` log and a Prometheus metric. The `--latency-measurement-service` has been added to the VC, with the following options: - `--latency-measurement-service true`: enable the service (default). - `--latency-measurement-service`: (without a value) has the same effect. - `--latency-measurement-service false`: disable the service. ## Additional Info Whilst looking at our staking setup, I think the BN+VC latency is contributing to late blocks. Now that we have to wait for the builders to respond it's nice to try and do everything we can to reduce that latency. Having visibility is the first step.	2023-03-05 23:43:29 +00:00
Diva M	f16e82ab2c	Merge branch 'unstable' into eip4844	2023-03-03 14:14:18 -05:00
Atanas Minkov	2b6348781a	Log a debug message when a request fails for a beacon node candidate (#4036 ) ## Issue Addressed #3985 ## Proposed Changes Log a debug message when a BN candidate returns an error. `Mar 01 16:40:24.011 DEBG Request to beacon node failed error: ServerMessage(ErrorMessage { code: 503, message: "SERVICE_UNAVAILABLE: beacon node is syncing: head slot is 8416, current slot is 5098402", stacktraces: [] }), node: http://localhost:5052/`	2023-03-02 05:26:14 +00:00
Divma	047c7544e3	Clean capella (#4019 ) ## Issue Addressed Cleans up all the remnants of 4844 in capella. This makes sure when 4844 is reviewed there is nothing we are missing because it got included here ## Proposed Changes drop a bomb on every 4844 thing ## Additional Info Merge process I did (locally) is as follows: - squash merge to produce one commit - in new branch off unstable with the squashed commit create a `git revert HEAD` commit - merge that new branch onto 4844 with `--strategy ours` - compare local 4844 to remote 4844 and make sure the diff is empty - enjoy Co-authored-by: Paul Hauner <paul@paulhauner.com>	2023-03-01 03:19:02 +00:00
Alan Höng	cc4fc422b2	Add content-type header to metrics server response (#3970 ) This fixes issues with certain metrics scrapers, which might error if the content-type is not correctly set. ## Issue Addressed Fixes https://github.com/sigp/lighthouse/issues/3437 ## Proposed Changes Simply set header: `Content-Type: text/plain` on metrics server response. Seems like the errored branch does this correctly already. ## Additional Info This is needed also to enable influx-db metric scraping which work very nicely with Geth.	2023-02-28 02:20:50 +00:00
Michael Sproul	066c27750a	Merge remote-tracking branch 'origin/staging' into capella-update	2023-02-17 12:05:36 +11:00
Michael Sproul	2fcfdf1a01	Fix docker and deps (#3978 ) ## Proposed Changes - Fix this cargo-audit failure for `sqlite3-sys`: https://github.com/sigp/lighthouse/actions/runs/4179008889/jobs/7238473962 - Prevent the Docker builds from running out of RAM on CI by removing `gnosis` and LMDB support from the `-dev` images (see: https://github.com/sigp/lighthouse/pull/3959#issuecomment-1430531155, successful run on my fork: https://github.com/michaelsproul/lighthouse/actions/runs/4179162480/jobs/7239537947).	2023-02-15 11:51:46 +00:00
Michael Sproul	18c8cab4da	Merge remote-tracking branch 'origin/unstable' into capella-merge	2023-02-14 12:07:27 +11:00
Pawan Dhananjay	2b735a9e8b	Add attestation duty slot metric (#2704 ) ## Issue Addressed Resolves #2521 ## Proposed Changes Add a metric that indicates the next attestation duty slot for all managed validators in the validator client.	2023-02-09 23:51:17 +00:00
Michael Sproul	bb0e99c097	Merge remote-tracking branch 'origin/unstable' into capella	2023-01-21 10:37:26 +11:00
GeemoCandama	b4d9fc03ee	add logging for starting request and receiving block (#3858 ) ## Issue Addressed #3853 ## Proposed Changes Added `INFO` level logs for requesting and receiving the unsigned block. ## Additional Info Logging for successfully publishing the signed block is already there. And seemingly there is a log for when "We realize we are going to produce a block" in the `start_update_service`: `info!(log, "Block production service started"); `. Is there anywhere else you'd like to see logging around this event? Co-authored-by: GeemoCandama <104614073+GeemoCandama@users.noreply.github.com>	2023-01-17 05:13:48 +00:00
David Theodore	9a970ce3a2	add better err reporting UnableToOpenVotingKeystore (#3781 ) ## Issue Addressed #3780 ## Proposed Changes Add error reporting that notifies the node operator that the `voting_keystore_path` in their `validator_definitions.yml` file may be incorrect. ## Additional Info There is more info in issue #3780 Co-authored-by: Paul Hauner <paul@paulhauner.com>	2023-01-17 05:13:47 +00:00
Michael Sproul	2af8110529	Merge remote-tracking branch 'origin/unstable' into capella Fixing the conflicts involved patching up some of the `block_hash` verification, the rest will be done as part of https://github.com/sigp/lighthouse/issues/3870	2023-01-12 16:22:00 +11:00
realbigsean	c85cd87c10	Web3 signer validator definitions reloading on any request (#3801 ) ## Issue Addressed https://github.com/sigp/lighthouse/issues/3795 Co-authored-by: realbigsean <sean@sigmaprime.io>	2023-01-09 08:18:56 +00:00
realbigsean	d8f7277beb	cleanup	2022-12-30 11:00:14 -05:00
Mark Mackey	c188cde034	merge upstream/unstable	2022-12-28 14:43:25 -06:00
Mark Mackey	c922566fbc	Fixed Some Tests	2022-12-27 15:59:34 -06:00
Divma	ffbf70e2d9	Clippy lints for rust 1.66 (#3810 ) ## Issue Addressed Fixes the new clippy lints for rust 1.66 ## Proposed Changes Most of the changes come from: - [unnecessary_cast](https://rust-lang.github.io/rust-clippy/master/index.html#unnecessary_cast) - [iter_kv_map](https://rust-lang.github.io/rust-clippy/master/index.html#iter_kv_map) - [needless_borrow](https://rust-lang.github.io/rust-clippy/master/index.html#needless_borrow) ## Additional Info na	2022-12-16 04:04:00 +00:00
Michael Sproul	991e4094f8	Merge remote-tracking branch 'origin/unstable' into capella-update	2022-12-14 13:00:41 +11:00
Michael Sproul	775d222299	Enable proposer boost re-orging (#2860 ) ## Proposed Changes With proposer boosting implemented (#2822) we have an opportunity to re-org out late blocks. This PR adds three flags to the BN to control this behaviour: * `--disable-proposer-reorgs`: turn aggressive re-orging off (it's on by default). * `--proposer-reorg-threshold N`: attempt to orphan blocks with less than N% of the committee vote. If this parameter isn't set then N defaults to 20% when the feature is enabled. * `--proposer-reorg-epochs-since-finalization N`: only attempt to re-org late blocks when the number of epochs since finalization is less than or equal to N. The default is 2 epochs, meaning re-orgs will only be attempted when the chain is finalizing optimally. For safety Lighthouse will only attempt a re-org under very specific conditions: 1. The block being proposed is 1 slot after the canonical head, and the canonical head is 1 slot after its parent. i.e. at slot `n + 1` rather than building on the block from slot `n` we build on the block from slot `n - 1`. 2. The current canonical head received less than N% of the committee vote. N should be set depending on the proposer boost fraction itself, the fraction of the network that is believed to be applying it, and the size of the largest entity that could be hoarding votes. 3. The current canonical head arrived after the attestation deadline from our perspective. This condition was only added to support suppression of forkchoiceUpdated messages, but makes intuitive sense. 4. The block is being proposed in the first 2 seconds of the slot. This gives it time to propagate and receive the proposer boost. ## Additional Info For the initial idea and background, see: https://github.com/ethereum/consensus-specs/pull/2353#issuecomment-950238004 There is also a specification for this feature here: https://github.com/ethereum/consensus-specs/pull/3034 Co-authored-by: Michael Sproul <micsproul@gmail.com> Co-authored-by: pawan <pawandhananjay@gmail.com>	2022-12-13 09:57:26 +00:00
Mac L	979b73c9b6	Add API endpoint to get VC graffiti (#3779 ) ## Issue Addressed #3766 ## Proposed Changes Adds an endpoint to get the graffiti that will be used for the next block proposal for each validator. ## Usage ```bash curl -H "Authorization: Bearer api-token" http://localhost:9095/lighthouse/ui/graffiti \| jq ``` ```json { "data": { "0x81283b7a20e1ca460ebd9bbd77005d557370cabb1f9a44f530c4c4c66230f675f8df8b4c2818851aa7d77a80ca5a4a5e": "mr f was here", "0xa3a32b0f8b4ddb83f1a0a853d81dd725dfe577d4f4c3db8ece52ce2b026eca84815c1a7e8e92a4de3d755733bf7e4a9b": "mr v was here", "0x872c61b4a7f8510ec809e5b023f5fdda2105d024c470ddbbeca4bc74e8280af0d178d749853e8f6a841083ac1b4db98f": null } } ``` ## Additional Info This will only return graffiti that the validator client knows about. That is from these 3 sources: 1. Graffiti File 2. validator_definitions.yml 3. The `--graffiti` flag on the VC If the graffiti is set on the BN, it will not be returned. This may warrant an additional endpoint on the BN side which can be used in the event the endpoint returns `null`.	2022-12-09 09:20:13 +00:00
ethDreamer	1a39976715	Fixed Compiler Warnings & Failing Tests (#3771 )	2022-12-03 10:42:12 +11:00
Mark Mackey	8a04c3428e	Merged with `unstable`	2022-11-30 17:29:10 -06:00
Age Manning	230168deff	Health Endpoints for UI (#3668 ) This PR adds some health endpoints for the beacon node and the validator client. Specifically it adds the endpoint: `/lighthouse/ui/health` These are not entirely stable yet. But provide a base for modification for our UI. These also may have issues with various platforms and may need modification.	2022-11-15 05:21:26 +00:00
tim gretler	5dba89e43b	Sync committee sign bn fallback (#3624 ) ## Issue Addressed Closes #3612 ## Proposed Changes - Iterates through BNs until it finds a non-optimistic head. A slight change in error behavior: - Previously: `spawn_contribution_tasks` did not return an error for a non-optimistic block head. It returned `Ok(())` logged a warning. - Now: `spawn_contribution_tasks` returns an error if it cannot find a non-optimistic block head. The caller of `spawn_contribution_tasks` then logs the error as a critical error. Co-authored-by: Michael Sproul <micsproul@gmail.com>	2022-11-13 22:40:43 +00:00
realbigsean	1aec17b09c	Merge branch 'unstable' of https://github.com/sigp/lighthouse into eip4844	2022-11-04 13:23:55 -04:00
Divma	8600645f65	Fix rust 1.65 lints (#3682 ) ## Issue Addressed New lints for rust 1.65 ## Proposed Changes Notable change is the identification or parameters that are only used in recursion ## Additional Info na	2022-11-04 07:43:43 +00:00
realbigsean	d8a49aad2b	merge with unstable fixes	2022-11-01 13:26:56 -04:00
realbigsean	8656d23327	merge with unstable	2022-11-01 13:18:00 -04:00
Pawan Dhananjay	29f2ec46d3	Couple blocks and blobs in gossip (#3670 ) * Revert "Add more gossip verification conditions" This reverts commit `1430b561c3`. * Revert "Add todos" This reverts commit `91efb9d4c7`. * Revert "Reprocess blob sidecar messages" This reverts commit `21bf3d37cd`. * Add the coupled topic * Decode SignedBeaconBlockAndBlobsSidecar correctly * Process Block and Blobs in beacon processor * Remove extra blob publishing logic from vc * Remove blob signing in vc * Ugly hack to compile	2022-11-01 10:28:21 -04:00
ethDreamer	e8604757a2	Deposit Cache Finalization & Fast WS Sync (#2915 ) ## Summary The deposit cache now has the ability to finalize deposits. This will cause it to drop unneeded deposit logs and hashes in the deposit Merkle tree that are no longer required to construct deposit proofs. The cache is finalized whenever the latest finalized checkpoint has a new `Eth1Data` with all deposits imported. This has three benefits: 1. Improves the speed of constructing Merkle proofs for deposits as we can just replay deposits since the last finalized checkpoint instead of all historical deposits when re-constructing the Merkle tree. 2. Significantly faster weak subjectivity sync as the deposit cache can be transferred to the newly syncing node in compressed form. The Merkle tree that stores `N` finalized deposits requires a maximum of `log2(N)` hashes. The newly syncing node then only needs to download deposits since the last finalized checkpoint to have a full tree. 3. Future proofing in preparation for [EIP-4444](https://eips.ethereum.org/EIPS/eip-4444) as execution nodes will no longer be required to store logs permanently so we won't always have all historical logs available to us. ## More Details Image to illustrate how the deposit contract merkle tree evolves and finalizes along with the resulting `DepositTreeSnapshot` ![image](https://user-images.githubusercontent.com/37123614/151465302-5fc56284-8a69-4998-b20e-45db3934ac70.png) ## Other Considerations I've changed the structure of the `SszDepositCache` so once you load & save your database from this version of lighthouse, you will no longer be able to load it from older versions. Co-authored-by: ethDreamer <37123614+ethDreamer@users.noreply.github.com>	2022-10-30 04:04:24 +00:00
realbigsean	137f230344	Capella eip 4844 cleanup (#3652 ) * add capella gossip boiler plate * get everything compiling Co-authored-by: realbigsean <sean@sigmaprime.io Co-authored-by: Mark Mackey <mark@sigmaprime.io> * small cleanup * small cleanup * cargo fix + some test cleanup * improve block production * add fixme for potential panic Co-authored-by: Mark Mackey <mark@sigmaprime.io>	2022-10-26 15:15:26 -04:00
realbigsean	b5b4ce9509	blob production	2022-10-05 17:14:45 -04:00
realbigsean	88006735c4	compile	2022-10-03 10:06:04 -04:00
realbigsean	fe6fc55449	fix compilation errors, rename capella -> shanghai, cleanup some rebase issues	2022-09-29 12:43:13 -04:00
realbigsean	4cdf1b546d	add shanghai fork version and epoch	2022-09-29 12:28:58 -04:00
realbigsean	de44b300c0	add/update types	2022-09-29 12:25:56 -04:00
Pawan Dhananjay	6779912fe4	Publish subscriptions to all beacon nodes (#3529 ) ## Issue Addressed Resolves #3516 ## Proposed Changes Adds a beacon fallback function for running a beacon node http query on all available fallbacks instead of returning on a first successful result. Uses the new `run_on_all` method for attestation and sync committee subscriptions. ## Additional Info Please provide any additional information. For example, future considerations or information useful for reviewers.	2022-09-28 19:53:35 +00:00
Divma	bd873e7162	New rust lints for rustc 1.64.0 (#3602 ) ## Issue Addressed fixes lints from the last rust release ## Proposed Changes Fix the lints, most of the lints by `clippy::question-mark` are false positives in the form of https://github.com/rust-lang/rust-clippy/issues/9518 so it's allowed for now ## Additional Info	2022-09-23 03:52:46 +00:00
Daniel Knopik	ca1e17b386	it compiles!	2022-09-17 12:23:03 +02:00
realbigsean	a9f075c3c0	Remove strict fee recipient (#3552 ) ## Issue Addressed Resolves: #3550 Remove the `--strict-fee-recipient` flag. It will cause missed proposals prior to the bellatrix transition. Co-authored-by: realbigsean <sean@sigmaprime.io>	2022-09-08 23:46:02 +00:00
Michael Sproul	9a7f7f1c1e	Configurable monitoring endpoint frequency (#3530 ) ## Issue Addressed Closes #3514 ## Proposed Changes - Change default monitoring endpoint frequency to 120 seconds to fit with 30k requests/month limit. - Allow configuration of the monitoring endpoint frequency using `--monitoring-endpoint-frequency N` where `N` is a value in seconds.	2022-09-05 08:29:00 +00:00
MaboroshiChan	f13dd04f42	Add timeout for --checkpoint-sync-url (#3521 ) ## Issue Addressed [Have --checkpoint-sync-url timeout](https://github.com/sigp/lighthouse/issues/3478) ## Proposed Changes I added a parameter for `get_bytes_opt_accept_header<U: IntoUrl>` which accept a timeout duration, and modified the body of `get_beacon_blocks_ssz` and `get_debug_beacon_states_ssz` to pass corresponding timeout durations.	2022-09-05 04:50:46 +00:00
Pawan Dhananjay	c5785887a9	Log fee recipients in VC (#3526 ) ## Issue Addressed Resolves #3524 ## Proposed Changes Log fee recipient in the `Validator exists in beacon chain` log. Logging in the BN already happens here `18c61a5e8b/beacon_node/beacon_chain/src/beacon_chain.rs (L3858-L3865)` I also think it's good practice to encourage users to set the fee recipient in the VC rather than the BN because of issues mentioned here https://github.com/sigp/lighthouse/issues/3432 Some example logs from prater: ``` Aug 30 03:47:09.922 INFO Validator exists in beacon chain fee_recipient: 0xab97_ad88, validator_index: 213615, pubkey: 0xb542b69ba14ddbaf717ca1762ece63a4804c08d38a1aadf156ae718d1545942e86763a1604f5065d4faa550b7259d651, service: duties Aug 30 03:48:05.505 INFO Validator exists in beacon chain fee_recipient: Fee recipient for validator not set in validator_definitions.yml or provided with the `--suggested-fee-recipient flag`, validator_index: 210710, pubkey: 0xad5d67cc7f990590c7b3fa41d593c4cf12d9ead894be2311fbb3e5c733d8c1b909e9d47af60ea3480fb6b37946c35390, service: duties ``` Co-authored-by: Paul Hauner <paul@paulhauner.com>	2022-08-30 05:47:32 +00:00
realbigsean	2ce86a0830	Validator registration request failures do not cause us to mark BNs offline (#3488 ) ## Issue Addressed Relates to https://github.com/sigp/lighthouse/issues/3416 ## Proposed Changes - Add an `OfflineOnFailure` enum to the `first_success` method for querying beacon nodes so that a val registration request failure from the BN -> builder does not result in the BN being marked offline. This seems important because these failures could be coming directly from a connected relay and actually have no bearing on BN health. Other messages that are sent to a relay have a local fallback so shouldn't result in errors - Downgrade the following log to a `WARN` ``` ERRO Unable to publish validator registrations to the builder network, error: All endpoints failed https://BN_B => RequestFailed(ServerMessage(ErrorMessage { code: 500, message: "UNHANDLED_ERROR: BuilderMissing", stacktraces: [] })), https://XXXX/ => Unavailable(Offline), [omitted] ``` ## Additional Info I think this change at least improves the UX of having a VC connected to some builder and some non-builder beacon nodes. I think we need to balance potentially alerting users that there is a BN <> VC misconfiguration and also allowing this type of fallback to work. If we want to fully support this type of configuration we may want to consider adding a flag `--builder-beacon-nodes` and track whether a VC should be making builder queries on a per-beacon node basis. But I think the changes in this PR are independent of that type of extension. PS: Sorry for the big diff here, it's mostly formatting changes after I added a new arg to a bunch of methods calls. Co-authored-by: realbigsean <sean@sigmaprime.io>	2022-08-29 11:35:59 +00:00

1 2 3 4 5 ...

462 Commits