lighthouse

Author	SHA1	Message	Date
Paul Hauner	59e45fe349	Reduce verbosity of reprocess queue logs (#4101 ) ## Issue Addressed NA ## Proposed Changes Replaces #4058 to attempt to reduce `ERRO Failed to send scheduled attestation` spam and provide more information for diagnosis. With this PR we achieve: - When dequeuing attestations after a block is received, send only one log which reports `n` failures (rather than `n` logs reporting `n` failures). - Make a distinction in logs between two separate attestation dequeuing events. - Add more information to both log events to help assist with troubleshooting. ## Additional Info NA	2023-03-21 05:15:00 +00:00
ethDreamer	65a5eb8292	Reconstruct Payloads using Payload Bodies Methods (#4028 ) ## Issue Addressed * #3895 Co-authored-by: ethDreamer <37123614+ethDreamer@users.noreply.github.com> Co-authored-by: Michael Sproul <michael@sigmaprime.io>	2023-03-19 23:15:59 +00:00
Michael Sproul	4c2d4af6cd	Make more noise when the EL is broken (#3986 ) ## Issue Addressed Closes #3814, replaces #3818. ## Proposed Changes * Add a WARN log for the case where we are attempting to sync chain segments but can't process them because they're building on an invalid parent. The most common case where we see this is when the execution node database is corrupt, causing sync to stall mysteriously (because we're currently logging the failure only at debug level). * Additionally I've bumped up the logging for invalid execution payloads to `WARN`. This may result in some duplicate logs as we log errors from the `beacon_chain` and then again from the beacon processor. Invalid payloads and corrupt DBs _should_ be rare enough that this doesn't produce overwhelming log volume.	2023-03-17 00:44:02 +00:00
Age Manning	3d99ce25f8	Correct a race condition when dialing peers (#4056 ) There is a race condition which occurs when multiple discovery queries return at almost the exact same time and they independently contain a useful peer we would like to connect to. The condition can occur that we can add the same peer to the dial queue, before we get a chance to process the queue. This ends up displaying an error to the user: ``` ERRO Dialing an already dialing peer ``` Although this error is harmless it's not ideal. There are two solutions to resolving this: 1. As we decide to dial the peer, we change the state in the peer-db to dialing (before we add it to the queue) which would prevent other requests from adding to the queue. 2. We prevent duplicates in the dial queue This PR has opted for 2. because 1. will complicate the code in that we are changing states in non-intuitive places. Although this technically adds a very slight performance cost, it's probably a cleaner solution as we can keep the state-changing logic in one place.	2023-03-16 05:44:54 +00:00
Daniel Ramirez Chiquillo	1ec3041673	Remove Router/Processor Code (#4002 ) ## Issue Addressed #3938 ## Proposed Changes - `network::Processor` is deleted and all its logic is moved to `network::Router`. - The `network::Router` module is moved to a single file. - The following functions are deleted: `on_disconnect` `send_status` `on_status_response` `on_blocks_by_root_request` `on_lightclient_bootstrap` `on_blocks_by_range_request` `on_block_gossip` `on_unaggregated_attestation_gossip` `on_aggregated_attestation_gossip` `on_voluntary_exit_gossip` `on_proposer_slashing_gossip` `on_attester_slashing_gossip` `on_sync_committee_signature_gossip` `on_sync_committee_contribution_gossip` `on_light_client_finality_update_gossip` `on_light_client_optimistic_update_gossip`. These deletions are possible because the updated `Router` allows the underlying methods to be called directly.	2023-03-15 01:27:47 +00:00
Divma	e190ebb8a0	Support for Ipv6 (#4046 ) ## Issue Addressed Add support for ipv6 and dual stack in lighthouse. ## Proposed Changes From a user perspective, now setting an ipv6 address, optionally configuring the ports should feel exactly the same as using an ipv4 address. If listening over both ipv4 and ipv6 then the user needs to: - use the `--listen-address` two times (ipv4 and ipv6 addresses) - `--port6` becomes then required - `--discovery-port6` can now be used to additionally configure the ipv6 udp port ### Rough list of code changes - Discovery: - Table filter and ip mode set to match the listening config. - Ipv6 address, tcp port and udp port set in the ENR builder - Reported addresses now check which tcp port to give to libp2p - LH Network Service: - Can listen over Ipv6, Ipv4, or both. This uses two sockets. Using mapped addresses is disabled from libp2p and it's the most compatible option. - NetworkGlobals: - No longer stores udp port since it was not used at all. Instead, stores the Ipv4 and Ipv6 TCP ports. - NetworkConfig: - Update names to make it clear that previous udp and tcp ports in ENR were Ipv4 - Add fields to configure Ipv6 udp and tcp ports in the ENR - Include advertised enr Ipv6 address. - Add type to model Listening address that's either Ipv4, Ipv6 or both. A listening address includes the ip, udp port and tcp port. - UPnP: - Kept only for ipv4 - Cli flags: - `--listen-addresses` now can take up to two values - `--port` will apply to ipv4 or ipv6 if only one listening address is given. If two listening addresses are given it will apply only to Ipv4. - `--port6` New flag required when listening over ipv4 and ipv6 that applies exclusively to Ipv6. - `--discovery-port` will now apply to ipv4 and ipv6 if only one listening address is given. - `--discovery-port6` New flag to configure the individual udp port of ipv6 if listening over both ipv4 and ipv6. - `--enr-udp-port` Updated docs to specify that it only applies to ipv4. 
This is an old behaviour. - `--enr-udp6-port` Added to configure the enr udp6 field. - `--enr-tcp-port` Updated docs to specify that it only applies to ipv4. This is an old behaviour. - `--enr-tcp6-port` Added to configure the enr tcp6 field. - `--enr-addresses` now can take two values. - `--enr-match` updated behaviour. - Common: - rename `unused_port` functions to specify that they are over ipv4. - add functions to get unused ports over ipv6. - Testing binaries - Updated code to reflect network config changes and unused_port changes. ## Additional Info TODOs: - use two sockets in discovery. I'll get back to this and it's on https://github.com/sigp/discv5/pull/160 - lcli allow listening over two sockets in generate_bootnodes_enr - add at least one smoke flag for ipv6 (I have tested this and works for me) - update the book	2023-03-14 01:13:34 +00:00
Pawan Dhananjay	5b18fd92cb	Cleaner logic for gossip subscriptions for new forks (#4030 ) ## Issue Addressed Cleaner resolution for #4006 ## Proposed Changes We are currently subscribing to core topics of new forks way before the actual fork since we had just a single `CORE_TOPICS` array. This PR separates the core topics for every fork and subscribes to only required topics based on the current fork. Also adds logic for subscribing to the core topics of a new fork only 2 slots before the fork happens. 2 slots is to give enough time for the gossip meshes to form. Currently doesn't add logic to remove topics from older forks in new forks. For e.g. in the coupled 4844 world, we had to remove the `BeaconBlock` topic in favour of `BeaconBlocksAndBlobsSidecar` at the 4844 fork. It should be easy enough to add though. Not adding it because I'm assuming that #4019 will get merged before this PR and we won't require any deletion logic. Happy to add it regardless though.	2023-03-01 09:22:48 +00:00
Divma	047c7544e3	Clean capella (#4019 ) ## Issue Addressed Cleans up all the remnants of 4844 in capella. This makes sure when 4844 is reviewed there is nothing we are missing because it got included here ## Proposed Changes drop a bomb on every 4844 thing ## Additional Info Merge process I did (locally) is as follows: - squash merge to produce one commit - in new branch off unstable with the squashed commit create a `git revert HEAD` commit - merge that new branch onto 4844 with `--strategy ours` - compare local 4844 to remote 4844 and make sure the diff is empty - enjoy Co-authored-by: Paul Hauner <paul@paulhauner.com>	2023-03-01 03:19:02 +00:00
Paul Hauner	9c81be8ac4	Fix metric (#4020 )	2023-02-22 09:46:45 +11:00
Michael Sproul	066c27750a	Merge remote-tracking branch 'origin/staging' into capella-update	2023-02-17 12:05:36 +11:00
Divma	ffeb8b6e05	blacklist tests in windows (#3961 ) ## Issue Addressed Windows tests for subscription and unsubscriptions fail in CI sporadically. We usually ignore these failures, so this PR aims to help reduce the failure noise. Associated issue is https://github.com/sigp/lighthouse/issues/3960	2023-02-16 23:34:30 +00:00
Michael Sproul	18c8cab4da	Merge remote-tracking branch 'origin/unstable' into capella-merge	2023-02-14 12:07:27 +11:00
Paul Hauner	84843d67d7	Reduce some EE and builder related ERRO logs to WARN (#3966 ) ## Issue Addressed NA ## Proposed Changes Our `ERRO` stream has been rather noisy since the merge due to some unexpected behaviours of builders and EEs. Now that we've been running post-merge for a while, I think we can drop some of these `ERRO` to `WARN` so we're not "crying wolf". The modified logs are: #### `ERRO Execution engine call failed` I'm seeing this quite frequently on Geth nodes. They seem to timeout when they're busy and it rarely indicates a serious issue. We also have logging across block import, fork choice updating and payload production that raise `ERRO` or `CRIT` when the EE times out, so I think we're not at risk of silencing actual issues. #### `ERRO "Builder failed to reveal payload"` In #3775 we reduced this log from `CRIT` to `ERRO` since it's common for builders to fail to reveal the block to the producer directly whilst still broadcasting it to the network. I think it's worth dropping this to `WARN` since it's rarely interesting. I elected to stay with `WARN` since I really do wish builders would fulfill their API promises by returning the block to us. Perhaps I'm just being pedantic here, I could be convinced otherwise. #### `ERRO "Relay error when registering validator(s)"` It seems like builders and/or mev-boost struggle to handle heavy loads of validator registrations. I haven't observed issues with validators not actually being registered, but I see timeouts on these endpoints many times a day. It doesn't seem like this `ERRO` is worth it. #### `ERRO Error fetching block for peer ExecutionLayerErrorPayloadReconstruction` This means we failed to respond to a peer on the P2P network with a block they requested because of an error in the `execution_layer`. It's very common to see timeouts or incomplete responses on this endpoint whilst the EE is busy and I don't think it's important enough for an `ERRO`. 
As long as the peer count stays high, I don't think the user needs to be actively concerned about how we're responding to peers. ## Additional Info NA	2023-02-12 23:14:08 +00:00
Paul Hauner	e062a7cf76	Broadcast address changes at Capella (#3919 ) * Add first efforts at broadcast * Tidy * Move broadcast code to client * Progress with broadcast impl * Rename to address change * Fix compile errors * Use `while` loop * Tidy * Flip broadcast condition * Switch to forgetting individual indices * Always broadcast when the node starts * Refactor into two functions * Add testing * Add another test * Tidy, add more testing * Tidy * Add test, rename enum * Rename enum again * Tidy * Break loop early * Add V15 schema migration * Bump schema version * Progress with migration * Update beacon_node/client/src/address_change_broadcast.rs Co-authored-by: Michael Sproul <micsproul@gmail.com> * Fix typo in function name --------- Co-authored-by: Michael Sproul <micsproul@gmail.com>	2023-02-07 17:13:49 +11:00
Michael Sproul	c76a1971cc	Merge remote-tracking branch 'origin/unstable' into capella	2023-01-25 14:20:16 +11:00
GeemoCandama	a7351c00c0	light client optimistic update reprocessing (#3799 ) ## Issue Addressed Currently there is a race between receiving blocks and receiving light client optimistic updates (in unstable), which results in processing errors. This is a continuation of PR #3693 and seeks to progress on issue #3651 ## Proposed Changes Add the parent_root to ReprocessQueueMessage::BlockImported so we can remove blocks from queue when a block arrives that has the same parent root. We use the parent root as opposed to the block_root because the LightClientOptimisticUpdate does not contain the block_root. If light_client_optimistic_update.attested_header.canonical_root() != head_block.message().parent_root() then we queue the update. Otherwise we process immediately. ## Additional Info michaelsproul came up with this idea. The code was heavily based off of the attestation reprocessing. I have not properly tested this to see if it works as intended.	2023-01-24 22:17:50 +00:00
Michael Sproul	d8abf2fc41	Import BLS to execution changes before Capella (#3892 ) * Import BLS to execution changes before Capella * Test for BLS to execution change HTTP API * Pack BLS to execution changes in LIFO order * Remove unused var * Clippy	2023-01-21 10:39:59 +11:00
Michael Sproul	bb0e99c097	Merge remote-tracking branch 'origin/unstable' into capella	2023-01-21 10:37:26 +11:00
Age Manning	f8a3b3b95a	Improve block delay metrics (#3894 ) We recently ran a large-block experiment on the testnet and plan to do a further experiment on mainnet. Although the metrics recovered from lighthouse nodes were quite useful, I think we could do with greater resolution in the block delay metrics and get some specific values for each block (currently these can be lost to large exponential histogram buckets). This PR increases the resolution of the block delay histogram buckets, but also introduces a new metric which records the last block delay. Depending on the polling resolution of the metric server, we can lose some block delay information, however it will always give us a specific value and we will not lose exact data based on poor resolution histogram buckets.	2023-01-20 00:46:56 +00:00
realbigsean	1319683736	Update gossip_methods.rs	2023-01-13 14:59:03 -05:00
Mark Mackey	05c1291d8a	Don't Penalize Early `bls_to_execution_change`	2023-01-13 12:53:25 -06:00
Michael Sproul	2af8110529	Merge remote-tracking branch 'origin/unstable' into capella Fixing the conflicts involved patching up some of the `block_hash` verification, the rest will be done as part of https://github.com/sigp/lighthouse/issues/3870	2023-01-12 16:22:00 +11:00
Paul Hauner	830efdb5c2	Improve validator monitor experience for high validator counts (#3728 ) ## Issue Addressed NA ## Proposed Changes Myself and others (#3678) have observed that when running with lots of validators (e.g., 1000s) the cardinality is too much for Prometheus. I've seen Prometheus instances just grind to a halt when we turn the validator monitor on for our testnet validators (we have 10,000s of Goerli validators). Additionally, the debug log volume can get very high with one log per validator, per attestation. To address this, the `bn --validator-monitor-individual-tracking-threshold <INTEGER>` flag has been added to disable per-validator (i.e., non-aggregated) metrics/logging once the validator monitor exceeds the threshold of validators. The default value is `64`, which is a finger-to-the-wind value. I don't actually know the value at which Prometheus starts to become overwhelmed, but I've seen it work with ~64 validators and I've seen it not work with 1000s of validators. A default of `64` seems like it will result in a breaking change to users who are running millions of dollars worth of validators whilst resulting in a no-op for low-validator-count users. I'm open to changing this number, though. Additionally, this PR starts collecting aggregated Prometheus metrics (e.g., total count of head hits across all validators), so that high-validator-count validators still have some interesting metrics. We already had logging for aggregated values, so nothing has been added there. I've opted to make this a breaking change since it can be rather damaging to your Prometheus instance to accidentally enable the validator monitor with large numbers of validators. I've crashed a Prometheus instance myself and had a report from another user who's done the same thing. ## Additional Info NA ## Breaking Changes Note A new label has been added to the validator monitor Prometheus metrics: `total`. 
This label tracks the aggregated metrics of all validators in the validator monitor (as opposed to each validator being tracked individually using its pubkey as the label). Additionally, a new flag has been added to the Beacon Node: `--validator-monitor-individual-tracking-threshold`. The default value is `64`, which means that when the validator monitor is tracking more than 64 validators then it will stop tracking per-validator metrics and only track the `all_validators` metric. It will also stop logging per-validator logs and only emit aggregated logs (the exception being that exit and slashing logs are always emitted). These changes were introduced in #3728 to address issues with untenable Prometheus cardinality and log volume when using the validator monitor with high validator counts (e.g., 1000s of validators). Users with fewer than 65 validators will see no change in behavior (apart from the added `all_validators` metric). Users with more than 65 validators who wish to maintain the previous behavior can set something like `--validator-monitor-individual-tracking-threshold 999999`.	2023-01-09 08:18:55 +00:00
Michael Sproul	4bd2b777ec	Verify execution block hashes during finalized sync (#3794 ) ## Issue Addressed Recent discussions with other client devs about optimistic sync have revealed a conceptual issue with the optimisation implemented in #3738. In designing that feature I failed to consider that the execution node checks the `blockHash` of the execution payload before responding with `SYNCING`, and that omitting this check entirely results in a degradation of the full node's validation. A node omitting the `blockHash` checks could be tricked by a supermajority of validators into following an invalid chain, something which is ordinarily impossible. ## Proposed Changes I've added verification of the `payload.block_hash` in Lighthouse. In case of failure we log a warning and fall back to verifying the payload with the execution client. I've used our existing dependency on `ethers_core` for RLP support, and a new dependency on Parity's `triehash` crate for the Merkle patricia trie. Although the `triehash` crate is currently unmaintained it seems like our best option at the moment (it is also used by Reth, and requires vastly less boilerplate than Parity's generic `trie-root` library). Block hash verification is pretty quick, about 500us per block on my machine (mainnet). The optimistic finalized sync feature can be disabled using `--disable-optimistic-finalized-sync` which forces full verification with the EL. ## Additional Info This PR also introduces a new dependency on our [`metastruct`](https://github.com/sigp/metastruct) library, which was perfectly suited to the RLP serialization method. There will likely be changes as `metastruct` grows, but I think this is a good way to start dogfooding it. I took inspiration from some Parity and Reth code while writing this, and have preserved the relevant license headers on the files containing code that was copied and modified.	2023-01-09 03:11:59 +00:00
realbigsean	d8f7277beb	cleanup	2022-12-30 11:00:14 -05:00
Mark Mackey	3e90fb8cae	Merge branch 'unstable' into capella	2022-12-15 12:20:03 -06:00
Divma	63c74b37f4	send error answering bbrange requests when an error occurs (#3800 ) ## Issue Addressed While testing withdrawals with @ethDreamer we noticed lighthouse is sending empty batches when an error occurs. As LH peer receiving this, we would consider this a low tolerance action because the peer is claiming the batch is right and is empty. ## Proposed Changes If any kind of error occurs, send an error response instead ## Additional Info Right now we don't handle such a thing as a partial batch with an error. If an error is received, the whole batch is discarded. Because of this it makes little sense to send partial batches that end with an error, so it's better to do the proposed solution instead of sending empty batches.	2022-12-15 00:16:38 +00:00
Michael Sproul	991e4094f8	Merge remote-tracking branch 'origin/unstable' into capella-update	2022-12-14 13:00:41 +11:00
GeemoCandama	1b28ef8a8d	Adding light_client gossip topics (#3693 ) ## Issue Addressed Implementing the light_client_gossip topics but I'm not there yet. Partially #3651 ## Proposed Changes Add light client gossip topics. I'm going to implement light_client_finality_update and light_client_optimistic_update gossip topics. Currently I've attempted the former and I'm seeking feedback. ## Additional Info I've only implemented the light_client_finality_update topic because I wanted to make sure I was on the correct path. Also checking that the gossiped LightClientFinalityUpdate is the same as the locally constructed one is not implemented because caching the updates will make this much easier. Could someone give me some feedback on this please? Co-authored-by: GeemoCandama <104614073+GeemoCandama@users.noreply.github.com>	2022-12-13 06:24:51 +00:00
ethDreamer	1a39976715	Fixed Compiler Warnings & Failing Tests (#3771 )	2022-12-03 10:42:12 +11:00
Mark Mackey	8a04c3428e	Merged with `unstable`	2022-11-30 17:29:10 -06:00
GeemoCandama	3534c85e30	Optimize finalized chain sync by skipping newPayload messages (#3738 ) ## Issue Addressed #3704 ## Proposed Changes Adds is_syncing_finalized: bool parameter for block verification functions. Sets the payload_verification_status to Optimistic if is_syncing_finalized is true. Uses SyncState in NetworkGlobals in BeaconProcessor to retrieve the syncing status. ## Additional Info I could implement FinalizedSignatureVerifiedBlock if you think it would be nicer.	2022-11-29 08:19:27 +00:00
antondlr	e9bf7f7cc1	remove commas from comma-separated kv pairs (#3737 ) ## Issue Addressed Logs are in comma separated kv list, but the values sometimes contain commas, which breaks parsing	2022-11-25 07:57:10 +00:00
Giulio rebuffo	d5a2de759b	Added LightClientBootstrap V1 (#3711 ) ## Issue Addressed Partially addresses #3651 ## Proposed Changes Adds server-side support for light_client_bootstrap_v1 topic ## Additional Info This PR creates a bootstrap on every request without using a cache. I do not know how necessary a cache is in this case, as this topic is not supposed to be called frequently and IMHO we can just prevent abuse by using the limiter, but let me know what you think, whether there is any caveat to this, or whether a cache is necessary only for the sake of good practice. Co-authored-by: Pawan Dhananjay <pawandhananjay@gmail.com>	2022-11-25 05:19:00 +00:00
Michael Sproul	788b337951	Op pool and gossip for BLS to execution changes (#3726 )	2022-11-25 07:09:26 +11:00
Divma	84c7d8cc70	Blocklookup data inconsistencies (#3677 ) ## Issue Addressed Closes #3649 ## Proposed Changes Add a regression test for the data inconsistency, catching the problem in `31e88c5533` [here](https://github.com/sigp/lighthouse/actions/runs/3379894044/jobs/5612044797#step:6:2043). When a chain is sent for processing, move it to a separate collection and now the test works, yay! ## Additional Info na	2022-11-07 06:48:34 +00:00
realbigsean	d8a49aad2b	merge with unstable fixes	2022-11-01 13:26:56 -04:00
realbigsean	8656d23327	merge with unstable	2022-11-01 13:18:00 -04:00
Pawan Dhananjay	29f2ec46d3	Couple blocks and blobs in gossip (#3670 ) * Revert "Add more gossip verification conditions" This reverts commit `1430b561c3`. * Revert "Add todos" This reverts commit `91efb9d4c7`. * Revert "Reprocess blob sidecar messages" This reverts commit `21bf3d37cd`. * Add the coupled topic * Decode SignedBeaconBlockAndBlobsSidecar correctly * Process Block and Blobs in beacon processor * Remove extra blob publishing logic from vc * Remove blob signing in vc * Ugly hack to compile	2022-11-01 10:28:21 -04:00
realbigsean	137f230344	Capella eip 4844 cleanup (#3652 ) * add capella gossip boiler plate * get everything compiling Co-authored-by: realbigsean <sean@sigmaprime.io> Co-authored-by: Mark Mackey <mark@sigmaprime.io> * small cleanup * small cleanup * cargo fix + some test cleanup * improve block production * add fixme for potential panic Co-authored-by: Mark Mackey <mark@sigmaprime.io>	2022-10-26 15:15:26 -04:00
ethDreamer	255fdf0724	Added Capella Data Structures to consensus/types (#3637 ) * Ran Cargo fmt * Added Capella Data Structures to consensus/types	2022-10-13 09:37:20 -05:00
realbigsean	44515b8cbe	cargo fix	2022-10-05 17:20:54 -04:00
Pawan Dhananjay	21bf3d37cd	Reprocess blob sidecar messages	2022-10-05 02:52:26 -05:00
Pawan Dhananjay	12fe514550	Add more gossip verification functions for blobs	2022-10-04 19:17:53 -05:00
realbigsean	7527c2b455	fix RPC limit add blob signing domain	2022-10-04 14:57:29 -04:00
realbigsean	ba16a037a3	cleanup	2022-10-04 09:34:05 -04:00
realbigsean	c0dc42ea07	cargo fmt	2022-10-04 08:21:46 -04:00
Divma	4926e3967f	[DEV FEATURE] Deterministic long lived subnets (#3453 ) ## Issue Addressed #2847 ## Proposed Changes Add under a feature flag the required changes to subscribe to long lived subnets in a deterministic way ## Additional Info There is an additional required change that is actually searching for peers using the prefix, but I find that it's best to make this change in the future	2022-10-04 10:37:48 +00:00
realbigsean	8d45e48775	cargo fix	2022-10-03 21:52:16 -04:00
realbigsean	e81dbbfea4	compile	2022-10-03 21:48:02 -04:00
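
Commit 3d99ce25f8 (#4056) in the log above says it prevents duplicates in the dial queue rather than changing peer state earlier. The idea can be sketched as below. This is only an illustration under stated assumptions, not Lighthouse's actual code: `DialQueue`, its methods, and the `PeerId` alias (a plain `u64` standing in for a real peer identifier) are all hypothetical names.

```rust
use std::collections::{HashSet, VecDeque};

/// Hypothetical stand-in for a real peer identifier type.
type PeerId = u64;

/// Hypothetical FIFO dial queue that ignores peers already waiting to be dialed.
#[derive(Default)]
struct DialQueue {
    /// Peers in the order they should be dialed.
    order: VecDeque<PeerId>,
    /// Membership set used to reject duplicates in O(1).
    queued: HashSet<PeerId>,
}

impl DialQueue {
    /// Queue a peer. Returns `true` if it was newly added and `false` if it
    /// was already waiting, e.g. because two discovery queries returned it.
    fn push(&mut self, peer: PeerId) -> bool {
        if self.queued.insert(peer) {
            self.order.push_back(peer);
            true
        } else {
            false
        }
    }

    /// Take the next peer to dial, making it eligible to be queued again.
    fn pop(&mut self) -> Option<PeerId> {
        let peer = self.order.pop_front()?;
        self.queued.remove(&peer);
        Some(peer)
    }

    /// Number of peers currently waiting to be dialed.
    fn len(&self) -> usize {
        self.order.len()
    }
}
```

The extra `HashSet` lookup on each `push` corresponds to the "very slight performance cost" the commit message mentions. In exchange, the peer-state transitions stay where they already are.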

560 Commits