lighthouse

Author	SHA1	Message	Date
Divma	ffeb8b6e05	blacklist tests in windows (#3961 ) ## Issue Addressed Windows tests for subscription and unsubscriptions fail in CI sporadically. We usually ignore this failures, so this PR aims to help reduce the failure noise. Associated issue is https://github.com/sigp/lighthouse/issues/3960	2023-02-16 23:34:30 +00:00
Michael Sproul	461bda6e85	Execution engine suggestions from code review Co-authored-by: Paul Hauner <paul@paulhauner.com>	2023-02-16 16:54:05 +11:00
Michael Sproul	918b688f72	Simplify payload traits and reduce cloning (#3976 ) * Simplify payload traits and reduce cloning * Fix self limiter	2023-02-15 14:17:56 +11:00
Age Manning	8dd9249177	Enforce a timeout on peer disconnect (#3757 ) On heavily crowded networks, we are seeing many attempted connections to our node every second. Often these connections come from peers that have just been disconnected. This can be for a number of reasons including: - We have deemed them to be not as useful as other peers - They have performed poorly - They have dropped the connection with us - The connection was spontaneously lost - They were randomly removed because we have too many peers In all of these cases, if we have reached or exceeded our target peer limit, there is no desire to accept new connections immediately after the disconnect from these peers. In fact, it often costs us resources to handle the established connections and defeats some of the logic of dropping them in the first place. This PR adds a timeout, that prevents recently disconnected peers from reconnecting to us. Technically we implement a ban at the swarm layer to prevent immediate re connections for at least 10 minutes. I decided to keep this light, and use a time-based LRUCache which only gets updated during the peer manager heartbeat to prevent added stress of polling a delay map for what could be a large number of peers. This cache is bounded in time. An extra space bound could be added should people consider this a risk. Co-authored-by: Diva M <divma@protonmail.com>	2023-02-14 03:25:42 +00:00
Michael Sproul	f7bd4bf06e	Update block rewards API for Capella	2023-02-14 12:09:40 +11:00
Michael Sproul	d53ccf8fc7	Placeholder for BlobsByRange outbound rate limit	2023-02-14 12:08:14 +11:00
Michael Sproul	18c8cab4da	Merge remote-tracking branch 'origin/unstable' into capella-merge	2023-02-14 12:07:27 +11:00
Michael Sproul	2f456ff9eb	Fix regression in DB write atomicity (#3931 ) ## Issue Addressed Fix a bug introduced by #3696. The bug is not expected to occur frequently, so releasing this PR is non-urgent. ## Proposed Changes * Add a variant to `StoreOp` that allows a raw KV operation to be passed around. * Return to using `self.store.do_atomically` rather than `self.store.hot_db.do_atomically`. This streamlines the write back into a single call and makes our auto-revert work again. * Prevent `import_block_update_shuffling_cache` from failing block import. This is an outstanding bug from before v3.4.0 which may have contributed to some random unexplained database corruption. ## Additional Info In #3696 I split the database write into two calls, one to convert the `StoreOp`s to `KeyValueStoreOp`s and one to write them. This had the unfortunate side-effect of damaging our atomicity guarantees in case of a write error. If the first call failed, we would be left with the block in fork choice but not on-disk (or the snapshot cache), which would prevent us from processing any descendant blocks. On `unstable` the first call is very unlikely to fail unless the disk is full, but on `tree-states` the conversion is more involved and a user reported database corruption after it failed in a way that should have been recoverable. Additionally, as @emhane observed, #3696 also inadvertently removed the import of the new block into the block cache. Although this seems like it could have negatively impacted performance, there are several mitigating factors: - For regular block processing we should almost always load the parent block (and state) from the snapshot cache. - We often load blinded blocks, which bypass the block cache anyway. - Metrics show no noticeable increase in the block cache miss rate with v3.4.0. However, I expect the block cache _will_ be useful again in `tree-states`, so it is restored to use by this PR.	2023-02-13 03:32:01 +00:00
Paul Hauner	84843d67d7	Reduce some EE and builder related ERRO logs to WARN (#3966 ) ## Issue Addressed NA ## Proposed Changes Our `ERRO` stream has been rather noisy since the merge due to some unexpected behaviours of builders and EEs. Now that we've been running post-merge for a while, I think we can drop some of these `ERRO` to `WARN` so we're not "crying wolf". The modified logs are: #### `ERRO Execution engine call failed` I'm seeing this quite frequently on Geth nodes. They seem to timeout when they're busy and it rarely indicates a serious issue. We also have logging across block import, fork choice updating and payload production that raise `ERRO` or `CRIT` when the EE times out, so I think we're not at risk of silencing actual issues. #### `ERRO "Builder failed to reveal payload"` In #3775 we reduced this log from `CRIT` to `ERRO` since it's common for builders to fail to reveal the block to the producer directly whilst still broadcasting it to the networ. I think it's worth dropping this to `WARN` since it's rarely interesting. I elected to stay with `WARN` since I really do wish builders would fulfill their API promises by returning the block to us. Perhaps I'm just being pedantic here, I could be convinced otherwise. #### `ERRO "Relay error when registering validator(s)"` It seems like builders and/or mev-boost struggle to handle heavy loads of validator registrations. I haven't observed issues with validators not actually being registered, but I see timeouts on these endpoints many times a day. It doesn't seem like this `ERRO` is worth it. #### `ERRO Error fetching block for peer ExecutionLayerErrorPayloadReconstruction` This means we failed to respond to a peer on the P2P network with a block they requested because of an error in the `execution_layer`. It's very common to see timeouts or incomplete responses on this endpoint whilst the EE is busy and I don't think it's important enough for an `ERRO`. As long as the peer count stays high, I don't think the user needs to be actively concerned about how we're responding to peers. ## Additional Info NA	2023-02-12 23:14:08 +00:00
ethDreamer	e743d75c9b	Update Mock Builder for Post-Capella Tests (#3958 ) * Update Mock Builder for Post-Capella Tests * Add _mut Suffix to BidStuff Functions * Fix Setting Gas Limit	2023-02-10 13:30:14 -06:00
ethDreamer	39f8327f73	Properly Deserialize ForkVersionedResponses (#3944 ) * Move ForkVersionedResponse to consensus/types * Properly Deserialize ForkVersionedResponses * Elide Types in from_value Calls * Added Tests for ForkVersionedResponse Deserialize * Address Sean's Comments & Make Less Restrictive * Utilize `map_fork_name!`	2023-02-10 08:49:25 -06:00
Michael Sproul	c9354a9d25	Tweaks to reward APIs (#3957 ) ## Proposed Changes * Return the effective balance in gwei to align with the spec ([ideal attestation rewards](https://ethereum.github.io/beacon-APIs/?urls.primaryName=dev#/Rewards/getAttestationsRewards)). * Use quoted `i64`s for attestation and sync committee rewards.	2023-02-10 06:19:42 +00:00
Paul Hauner	5276dd0cb0	Fix edge-case when finding the finalized descendant (#3924 ) ## Issue Addressed NA ## Description We were missing an edge case when checking to see if a block is a descendant of the finalized checkpoint. This edge case is described for one of the tests in this PR: `a119edc739/consensus/proto_array/src/proto_array_fork_choice.rs (L1018-L1047)` This bug presented itself in the following mainnet log: ``` Jan 26 15:12:42.841 ERRO Unable to validate attestation error: MissingBeaconState(0x7c30cb80ec3d4ec624133abfa70e4c6cfecfca456bfbbbff3393e14e5b20bf25), peer_id: 16Uiu2HAm8RPRciXJYtYc5c3qtCRdrZwkHn2BXN3XP1nSi1gxHYit, type: "unaggregated", slot: Slot(5660161), beacon_block_root: 0x4a45e59da7cb9487f4836c83bdd1b741b4f31c67010c7ae343fa6771b3330489 ``` Here the BN is rejecting an attestation because of a "missing beacon state". Whilst it was correct to reject the attestation, it should have rejected it because it attests to a block that conflicts with finality rather than claiming that the database is inconsistent. The block that this attestation points to (`0x4a45`) is block `C` in the above diagram. It is a non-canonical block in the first slot of an epoch that conflicts with the finalized checkpoint. Due to our lazy pruning of proto array, `0x4a45` was still present in proto-array. Our missed edge-case in [`ForkChoice::is_descendant_of_finalized`](`38514c07f2/consensus/fork_choice/src/fork_choice.rs (L1375-L1379)`) would have indicated to us that the block is a descendant of the finalized block. Therefore, we would have accepted the attestation thinking that it attests to a descendant of the finalized checkpoint. Since we didn't have the shuffling for this erroneously processed block, we attempted to read its state from the database. This failed because we prune states from the database by keeping track of the tips of the chain and iterating back until we find a finalized block. This would have deleted `C` from the database, hence the `MissingBeaconState` error.	2023-02-09 23:51:18 +00:00
Divma	ceb986549d	Self rate limiting dev flag (#3928 ) ## Issue Addressed Adds self rate limiting options, mainly with the idea to comply with peer's rate limits in small testnets ## Proposed Changes Add a hidden flag `self-limiter` this can take no value, or customs values to configure quotas per protocol ## Additional Info ### How to use `--self-limiter` will turn on the self rate limiter applying the same params we apply to inbound requests (requests from other peers) `--self-limiter "beacon_blocks_by_range:64/1"` will turn on the self rate limiter for ALL protocols, but change the quota for bbrange to 64 requested blocks per 1 second. `--self-limiter "beacon_blocks_by_range:64/1;ping:1/10"` same as previous one, changing the quota for ping as well. ### Caveats - The rate limiter is either on or off for all protocols. I added the custom values to be able to change the quotas per protocol so that some protocols can be given extremely loose or tight quotas. I think this should satisfy every need even if we can't technically turn off rate limits per protocol. - This reuses the rate limiter struct for the inbound requests so there is this ugly part of the code in which we need to deal with the inbound only protocols (light client stuff) if this becomes too ugly as we add lc protocols, we might want to split the rate limiters. I've checked this and looks doable with const generics to avoid so much code duplication ### Knowing if this is on ``` Feb 06 21:12:05.493 DEBG Using self rate limiting params config: OutboundRateLimiterConfig { ping: 2/10s, metadata: 1/15s, status: 5/15s, goodbye: 1/10s, blocks_by_range: 1024/10s, blocks_by_root: 128/10s }, service: libp2p_rpc, service: libp2p ```	2023-02-08 02:18:53 +00:00
naviechan	9547ac069c	Implement block_rewards API (per-validator reward) (#3907 ) ## Issue Addressed [#3661](https://github.com/sigp/lighthouse/issues/3661) ## Proposed Changes `/eth/v1/beacon/rewards/blocks/{block_id}` ``` { "execution_optimistic": false, "finalized": false, "data": { "proposer_index": "123", "total": "123", "attestations": "123", "sync_aggregate": "123", "proposer_slashings": "123", "attester_slashings": "123" } } ``` The issue contains the implementation of three per-validator reward APIs: * `sync_committee_rewards` * [`attestation_rewards`](https://github.com/sigp/lighthouse/pull/3822) * `block_rewards` This PR only implements the `block_rewards`. The endpoints can be viewed in the Ethereum Beacon nodes API browser: [https://ethereum.github.io/beacon-APIs/?urls.primaryName=dev#/Rewards](https://ethereum.github.io/beacon-APIs/?urls.primaryName=dev#/Rewards) ## Additional Info The implementation of [consensus client reward APIs](https://github.com/eth-protocol-fellows/cohort-three/blob/master/projects/project-ideas.md#consensus-client-reward-apis) is part of the [EPF](https://github.com/eth-protocol-fellows/cohort-three). Co-authored-by: kevinbogner <kevbogner@gmail.com> Co-authored-by: navie <naviechan@gmail.com>	2023-02-07 08:33:23 +00:00
Paul Hauner	e062a7cf76	Broadcast address changes at Capella (#3919 ) * Add first efforts at broadcast * Tidy * Move broadcast code to client * Progress with broadcast impl * Rename to address change * Fix compile errors * Use `while` loop * Tidy * Flip broadcast condition * Switch to forgetting individual indices * Always broadcast when the node starts * Refactor into two functions * Add testing * Add another test * Tidy, add more testing * Tidy * Add test, rename enum * Rename enum again * Tidy * Break loop early * Add V15 schema migration * Bump schema version * Progress with migration * Update beacon_node/client/src/address_change_broadcast.rs Co-authored-by: Michael Sproul <micsproul@gmail.com> * Fix typo in function name --------- Co-authored-by: Michael Sproul <micsproul@gmail.com>	2023-02-07 17:13:49 +11:00
kevinbogner	4d07e40501	Implement `attestation_rewards` API (per-validator reward) (#3822 ) ## Issue Addressed #3661 ## Proposed Changes `/eth/v1/beacon/rewards/attestations/{epoch}` ```json { "execution_optimistic": false, "finalized": false, "data": [ { "ideal_rewards": [ { "effective_balance": "1000000000", "head": "2500", "target": "5000", "source": "5000" } ], "total_rewards": [ { "validator_index": "0", "head": "2000", "target": "2000", "source": "4000", "inclusion_delay": "2000" } ] } ] } ``` The issue contains the implementation of three per-validator reward APIs: - [`sync_committee_rewards`](https://github.com/sigp/lighthouse/pull/3790) - `attestation_rewards` - `block_rewards`. This PR only implements the `attestation_rewards`. The endpoints can be viewed in the Ethereum Beacon nodes API browser: https://ethereum.github.io/beacon-APIs/?urls.primaryName=dev#/Rewards ## Additional Info The implementation of [consensus client reward APIs](https://github.com/eth-protocol-fellows/cohort-three/blob/master/projects/project-ideas.md#consensus-client-reward-apis) is part of the [EPF](https://github.com/eth-protocol-fellows/cohort-three). --- - [x] `get_state` - [x] Calculate ideal rewards with some logic from `get_flag_index_deltas` - [x] Calculate actual rewards with some logic from `get_flag_index_deltas` - [x] Code cleanup - [x] Testing	2023-02-07 00:00:19 +00:00
ethDreamer	5b398b1990	Don't Reject all Builder Bids After Capella (#3940 ) * Fix bug in Builder API Post-Capella * Fix Clippy Complaints	2023-02-06 13:09:13 +11:00
ethDreamer	90b6ae62e6	Use Local Payload if More Profitable than Builder (#3934 ) * Use Local Payload if More Profitable than Builder * Rename clone -> clone_from_ref * Minimize Clones of GetPayloadResponse * Cleanup & Fix Tests * Added Tests for Payload Choice by Profit * Fix Outdated Comments	2023-02-01 19:37:46 -06:00
Mark Mackey	d842215a44	Merge branch 'upstream/unstable' into capella	2023-01-31 12:16:26 -06:00
ethDreamer	7b7595347d	exchangeCapabilities & Capella Readiness Logging (#3918 ) * Undo Passing Spec to Engine API * Utilize engine_exchangeCapabilities * Add Logging to Indicate Capella Readiness * Add exchangeCapabilities to mock_execution_layer * Send Nested Array for engine_exchangeCapabilities * Use Mutex Instead of RwLock for EngineCapabilities * Improve Locking to Avoid Deadlock * Prettier logic for get_engine_capabilities * Improve Comments * Update beacon_node/beacon_chain/src/capella_readiness.rs Co-authored-by: Michael Sproul <micsproul@gmail.com> * Update beacon_node/beacon_chain/src/capella_readiness.rs Co-authored-by: Michael Sproul <micsproul@gmail.com> * Update beacon_node/beacon_chain/src/capella_readiness.rs Co-authored-by: Michael Sproul <micsproul@gmail.com> * Update beacon_node/beacon_chain/src/capella_readiness.rs Co-authored-by: Michael Sproul <micsproul@gmail.com> * Update beacon_node/beacon_chain/src/capella_readiness.rs Co-authored-by: Michael Sproul <micsproul@gmail.com> * Update beacon_node/client/src/notifier.rs Co-authored-by: Michael Sproul <micsproul@gmail.com> * Update beacon_node/execution_layer/src/engine_api/http.rs Co-authored-by: Michael Sproul <micsproul@gmail.com> * Addressed Michael's Comments --------- Co-authored-by: Michael Sproul <micsproul@gmail.com>	2023-01-31 18:26:23 +01:00
Michael Sproul	0866b739d0	Clippy 1.67 (#3916 ) ## Proposed Changes Clippy 1.67.0 put us on blast for the size of some of our errors, most of them written by me ( 👀 ). This PR shrinks the size of `BeaconChainError` by dropping some extraneous info and boxing an inner error which should only occur infrequently anyway. For the `AttestationSlashInfo` and `BlockSlashInfo` I opted to ignore the lint as they are always used in a `Result<A, Info>` where `A` is a similar size. This means they don't bloat the size of the `Result`, so it's a bit annoying for Clippy to report this as an issue. I also chose to ignore `clippy::uninlined-format-args` because I think the benefit-to-churn ratio is too low. E.g. sometimes we have long identifiers in `format!` args and IMO the non-inlined form is easier to read: ```rust // I prefer this... format!( "{} did {} to {}", REALLY_LONG_CONSTANT_NAME, ANOTHER_REALLY_LONG_CONSTANT_NAME, regular_long_identifier_name ); // To this format!("{REALLY_LONG_CONSTANT_NAME} did {ANOTHER_REALLY_LONG_CONSTANT_NAME} to {regular_long_identifier_name}"); ``` I tried generating an automatic diff with `cargo clippy --fix` but it came out at: ``` 250 files changed, 1209 insertions(+), 1469 deletions(-) ``` Which seems like a bad idea when we'd have to back-merge it to `capella` and `eip4844` 😱	2023-01-27 09:48:42 +00:00
Michael Sproul	16bdb2771b	Update another test broken by the shuffling change	2023-01-25 16:18:00 +11:00
Michael Sproul	e48487db01	Fix the new BLS to execution change test	2023-01-25 15:47:07 +11:00
Michael Sproul	79a20e8a5f	Update sync rewards API for abstract exec payload	2023-01-25 15:46:47 +11:00
Michael Sproul	c76a1971cc	Merge remote-tracking branch 'origin/unstable' into capella	2023-01-25 14:20:16 +11:00
GeemoCandama	a7351c00c0	light client optimistic update reprocessing (#3799 ) ## Issue Addressed Currently there is a race between receiving blocks and receiving light client optimistic updates (in unstable), which results in processing errors. This is a continuation of PR #3693 and seeks to progress on issue #3651 ## Proposed Changes Add the parent_root to ReprocessQueueMessage::BlockImported so we can remove blocks from queue when a block arrives that has the same parent root. We use the parent root as opposed to the block_root because the LightClientOptimisticUpdate does not contain the block_root. If light_client_optimistic_update.attested_header.canonical_root() != head_block.message().parent_root() then we queue the update. Otherwise we process immediately. ## Additional Info michaelsproul came up with this idea. The code was heavily based off of the attestation reprocessing. I have not properly tested this to see if it works as intended.	2023-01-24 22:17:50 +00:00
ethDreamer	3d4dd6af75	Use eth1_withdrawal_credentials in Test States (#3898 ) * Use eth1_withdrawal_credential in Some Test States * Update beacon_node/genesis/src/interop.rs Co-authored-by: Michael Sproul <micsproul@gmail.com> * Update beacon_node/genesis/src/interop.rs Co-authored-by: Michael Sproul <micsproul@gmail.com> * Increase validator sizes * Pick next sync committee message Co-authored-by: Michael Sproul <micsproul@gmail.com> Co-authored-by: Paul Hauner <paul@paulhauner.com>	2023-01-24 16:22:51 +01:00
naviechan	2802bc9a9c	Implement sync_committee_rewards API (per-validator reward) (#3903 ) ## Issue Addressed [#3661](https://github.com/sigp/lighthouse/issues/3661) ## Proposed Changes `/eth/v1/beacon/rewards/sync_committee/{block_id}` ``` { "execution_optimistic": false, "finalized": false, "data": [ { "validator_index": "0", "reward": "2000" } ] } ``` The issue contains the implementation of three per-validator reward APIs: * `sync_committee_rewards` * [`attestation_rewards`](https://github.com/sigp/lighthouse/pull/3822) * `block_rewards` This PR only implements the `sync_committe_rewards `. The endpoints can be viewed in the Ethereum Beacon nodes API browser: [https://ethereum.github.io/beacon-APIs/?urls.primaryName=dev#/Rewards](https://ethereum.github.io/beacon-APIs/?urls.primaryName=dev#/Rewards) ## Additional Info The implementation of [consensus client reward APIs](https://github.com/eth-protocol-fellows/cohort-three/blob/master/projects/project-ideas.md#consensus-client-reward-apis) is part of the [EPF](https://github.com/eth-protocol-fellows/cohort-three). Co-authored-by: navie <naviechan@gmail.com> Co-authored-by: kevinbogner <kevbogner@gmail.com>	2023-01-24 02:06:42 +00:00
Michael Sproul	d8abf2fc41	Import BLS to execution changes before Capella (#3892 ) * Import BLS to execution changes before Capella * Test for BLS to execution change HTTP API * Pack BLS to execution changes in LIFO order * Remove unused var * Clippy	2023-01-21 10:39:59 +11:00
Michael Sproul	bb0e99c097	Merge remote-tracking branch 'origin/unstable' into capella	2023-01-21 10:37:26 +11:00
Age Manning	f8a3b3b95a	Improve block delay metrics (#3894 ) We recently ran a large-block experiment on the testnet and plan to do a further experiment on mainnet. Although the metrics recovered from lighthouse nodes were quite useful, I think we could do with greater resolution in the block delay metrics and get some specific values for each block (currently these can be lost to large exponential histogram buckets). This PR increases the resolution of the block delay histogram buckets, but also introduces a new metric which records the last block delay. Depending on the polling resolution of the metric server, we can lose some block delay information, however it will always give us a specific value and we will not lose exact data based on poor resolution histogram buckets.	2023-01-20 00:46:56 +00:00
ethDreamer	26787412cd	Update engine_api to Latest spec (#3893 ) * Update engine_api to Latest spec * Small Test Fix * Fix Test Deserialization Issue	2023-01-19 22:42:17 +11:00
Santiago Medina	912ea2a5ca	Return HTTP 404 rather than 405 (#3836 ) ## Issue Addressed Issue #3112 ## Proposed Changes Add `Filter::recover` to the GET chain to handle rejections specifically as 404 NOT FOUND ## Additional Info Making a request to `http://localhost:5052/not_real` now returns the following: ``` { "code": 404, "message": "NOT_FOUND", "stacktraces": [] } ``` Co-authored-by: Paul Hauner <paul@paulhauner.com>	2023-01-16 03:42:09 +00:00
ethDreamer	51088725fb	CL-EL withdrawals harmonization using Gwei units (#3884 )	2023-01-16 12:03:42 +11:00
realbigsean	48a8cbbc45	Merge pull request #3877 from michaelsproul/withdrawals-block-hash Verify blockHash with withdrawals	2023-01-13 15:12:58 -05:00
realbigsean	1319683736	Update gossip_methods.rs	2023-01-13 14:59:03 -05:00
Mark Mackey	05c1291d8a	Don't Penalize Early `bls_to_execution_change`	2023-01-13 12:53:25 -06:00
Michael Sproul	8e2931d73b	Verify blockHash with withdrawals	2023-01-13 12:46:54 +11:00
Michael Sproul	aa896decc1	Fix some beacon_chain tests	2023-01-12 19:13:01 +11:00
Michael Sproul	2af8110529	Merge remote-tracking branch 'origin/unstable' into capella Fixing the conflicts involved patching up some of the `block_hash` verification, the rest will be done as part of https://github.com/sigp/lighthouse/issues/3870	2023-01-12 16:22:00 +11:00
ethDreamer	52c1055fdc	Remove `withdrawals-processing` feature (#3864 ) * Use spec to Determine Supported Engine APIs * Remove `withdrawals-processing` feature * Fixed Tests * Missed Some Spots * Fixed Another Test * Stupid Clippy	2023-01-12 15:15:08 +11:00
Paul Hauner	38514c07f2	Release v3.4.0 (#3862 ) ## Issue Addressed NA ## Proposed Changes Bump versions ## Additional Info - [x] ~~Blocked on #3728, #3801~~ - [x] ~~Blocked on #3866~~ - [x] Requires additional testing	2023-01-11 03:27:08 +00:00
realbigsean	98b11bbd3f	add historical summaries (#3865 ) * add historical summaries * fix tree hash caching, disable the sanity slots test with fake crypto * add ssz static HistoricalSummary * only store historical summaries after capella * Teach `UpdatePattern` about Capella * Tidy EF tests * Clippy Co-authored-by: Michael Sproul <michael@sigmaprime.io>	2023-01-11 12:40:21 +11:00
Paul Hauner	830efdb5c2	Improve validator monitor experience for high validator counts (#3728 ) ## Issue Addressed NA ## Proposed Changes Myself and others (#3678) have observed that when running with lots of validators (e.g., 1000s) the cardinality is too much for Prometheus. I've seen Prometheus instances just grind to a halt when we turn the validator monitor on for our testnet validators (we have 10,000s of Goerli validators). Additionally, the debug log volume can get very high with one log per validator, per attestation. To address this, the `bn --validator-monitor-individual-tracking-threshold <INTEGER>` flag has been added to disable per-validator (i.e., non-aggregated) metrics/logging once the validator monitor exceeds the threshold of validators. The default value is `64`, which is a finger-to-the-wind value. I don't actually know the value at which Prometheus starts to become overwhelmed, but I've seen it work with ~64 validators and I've seen it not work with 1000s of validators. A default of `64` seems like it will result in a breaking change to users who are running millions of dollars worth of validators whilst resulting in a no-op for low-validator-count users. I'm open to changing this number, though. Additionally, this PR starts collecting aggregated Prometheus metrics (e.g., total count of head hits across all validators), so that high-validator-count validators still have some interesting metrics. We already had logging for aggregated values, so nothing has been added there. I've opted to make this a breaking change since it can be rather damaging to your Prometheus instance to accidentally enable the validator monitor with large numbers of validators. I've crashed a Prometheus instance myself and had a report from another user who's done the same thing. ## Additional Info NA ## Breaking Changes Note A new label has been added to the validator monitor Prometheus metrics: `total`. This label tracks the aggregated metrics of all validators in the validator monitor (as opposed to each validator being tracking individually using its pubkey as the label). Additionally, a new flag has been added to the Beacon Node: `--validator-monitor-individual-tracking-threshold`. The default value is `64`, which means that when the validator monitor is tracking more than 64 validators then it will stop tracking per-validator metrics and only track the `all_validators` metric. It will also stop logging per-validator logs and only emit aggregated logs (the exception being that exit and slashing logs are always emitted). These changes were introduced in #3728 to address issues with untenable Prometheus cardinality and log volume when using the validator monitor with high validator counts (e.g., 1000s of validators). Users with less than 65 validators will see no change in behavior (apart from the added `all_validators` metric). Users with more than 65 validators who wish to maintain the previous behavior can set something like `--validator-monitor-individual-tracking-threshold 999999`.	2023-01-09 08:18:55 +00:00
Michael Sproul	4bd2b777ec	Verify execution block hashes during finalized sync (#3794 ) ## Issue Addressed Recent discussions with other client devs about optimistic sync have revealed a conceptual issue with the optimisation implemented in #3738. In designing that feature I failed to consider that the execution node checks the `blockHash` of the execution payload before responding with `SYNCING`, and that omitting this check entirely results in a degradation of the full node's validation. A node omitting the `blockHash` checks could be tricked by a supermajority of validators into following an invalid chain, something which is ordinarily impossible. ## Proposed Changes I've added verification of the `payload.block_hash` in Lighthouse. In case of failure we log a warning and fall back to verifying the payload with the execution client. I've used our existing dependency on `ethers_core` for RLP support, and a new dependency on Parity's `triehash` crate for the Merkle patricia trie. Although the `triehash` crate is currently unmaintained it seems like our best option at the moment (it is also used by Reth, and requires vastly less boilerplate than Parity's generic `trie-root` library). Block hash verification is pretty quick, about 500us per block on my machine (mainnet). The optimistic finalized sync feature can be disabled using `--disable-optimistic-finalized-sync` which forces full verification with the EL. ## Additional Info This PR also introduces a new dependency on our [`metastruct`](https://github.com/sigp/metastruct) library, which was perfectly suited to the RLP serialization method. There will likely be changes as `metastruct` grows, but I think this is a good way to start dogfooding it. I took inspiration from some Parity and Reth code while writing this, and have preserved the relevant license headers on the files containing code that was copied and modified.	2023-01-09 03:11:59 +00:00
ethDreamer	11f4784ae6	Added bls_to_execution_changes to PersistedOpPool (#3857 ) * Added bls_to_execution_changes to PersistedOpPool	2023-01-09 12:38:02 +11:00
ethDreamer	cb94f639b0	Isolate withdrawals-processing Feature (#3854 )	2023-01-09 11:05:28 +11:00
ethDreamer	6b72f45cad	Merge pull request #3845 from realbigsean/capella-cleanup Capella cleanup	2023-01-06 13:26:41 -06:00
Age Manning	1d9a2022b4	Upgrade to libp2p v0.50.0 (#3764 ) I've needed to do this work in order to do some episub testing. This version of libp2p has not yet been released, so this is left as a draft for when we wish to update. Co-authored-by: Diva M <divma@protonmail.com>	2023-01-06 15:59:33 +00:00
Mark Mackey	2ac609b64e	Fixing Moar Failing Tests	2023-01-05 13:00:44 -06:00
Age Manning	4e5e7ee1fc	Restructure code for libp2p upgrade (#3850 ) Our custom RPC implementation is lagging from the libp2p v50 version. We are going to need to change a bunch of function names and would be nice to have consistent ordering of function names inside the handlers. This is a precursor to the libp2p upgrade to minimize merge conflicts in function ordering.	2023-01-05 17:18:24 +00:00
Mark Mackey	8711db2f3b	Fix EF Tests	2023-01-04 15:14:43 -06:00
Mark Mackey	933772dd06	Fixed Operation Pool Tests	2023-01-03 18:40:35 -06:00
Mark Mackey	be232c4587	Update Execution Layer Tests for Capella	2023-01-03 16:58:15 -06:00
realbigsean	4353c49855	Update beacon_node/execution_layer/src/engine_api/json_structures.rs	2023-01-03 08:55:19 -05:00
realbigsean	d8f7277beb	cleanup	2022-12-30 11:00:14 -05:00
Mark Mackey	986ae4360a	Fix clippy complaints	2022-12-28 14:47:16 -06:00
Mark Mackey	c188cde034	merge upstream/unstable	2022-12-28 14:43:25 -06:00
Mark Mackey	c922566fbc	Fixed Some Tests	2022-12-27 15:59:34 -06:00
Mark Mackey	96da8b9383	Feature Guard V2 Engine API Methods	2022-12-27 15:55:43 -06:00
Mark Mackey	b75ca74222	Removed `withdrawals` feature flag	2022-12-19 15:38:46 -06:00
Mark Mackey	3a08c7634e	Make engine_getPayloadV2 accept local block value	2022-12-16 15:44:55 -06:00
Divma	ffbf70e2d9	Clippy lints for rust 1.66 (#3810 ) ## Issue Addressed Fixes the new clippy lints for rust 1.66 ## Proposed Changes Most of the changes come from: - [unnecessary_cast](https://rust-lang.github.io/rust-clippy/master/index.html#unnecessary_cast) - [iter_kv_map](https://rust-lang.github.io/rust-clippy/master/index.html#iter_kv_map) - [needless_borrow](https://rust-lang.github.io/rust-clippy/master/index.html#needless_borrow) ## Additional Info na	2022-12-16 04:04:00 +00:00
Mark Mackey	3e90fb8cae	Merge branch 'unstable' into capella	2022-12-15 12:20:03 -06:00
Divma	63c74b37f4	send error answering bbrange requests when an error occurrs (#3800 ) ## Issue Addressed While testing withdrawals with @ethDreamer we noticed lighthouse is sending empty batches when an error occurs. As LH peer receiving this, we would consider this a low tolerance action because the peer is claiming the batch is right and is empty. ## Proposed Changes If any kind of error occurs, send a error response instead ## Additional Info Right now we don't handle such thing as a partial batch with an error. If an error is received, the whole batch is discarded. Because of this it makes little sense to send partial batches that end with an error, so it's better to do the proposed solution instead of sending empty batches.	2022-12-15 00:16:38 +00:00
Michael Sproul	f3e8ca852e	Fix Clippy	2022-12-14 14:04:13 +11:00
Michael Sproul	991e4094f8	Merge remote-tracking branch 'origin/unstable' into capella-update	2022-12-14 13:00:41 +11:00
Michael Sproul	63d3dd27fc	Batch API for address changes (#3798 )	2022-12-14 12:01:33 +11:00
Michael Sproul	75dd8780e0	Use JsonPayload for payload reconstruction (#3797 )	2022-12-14 11:52:46 +11:00
ethDreamer	07d6ef749a	Fixed Payload Reconstruction Bug (#3796 )	2022-12-14 11:49:30 +11:00
ethDreamer	b1c33361ea	Fixed Clippy Complaints & Some Failing Tests (#3791 ) * Fixed Clippy Complaints & Some Failing Tests * Update Dockerfile to Rust-1.65 * EF test file renamed * Touch up comments based on feedback	2022-12-13 10:50:24 -06:00
Michael Sproul	775d222299	Enable proposer boost re-orging (#2860 ) ## Proposed Changes With proposer boosting implemented (#2822) we have an opportunity to re-org out late blocks. This PR adds three flags to the BN to control this behaviour: * `--disable-proposer-reorgs`: turn aggressive re-orging off (it's on by default). * `--proposer-reorg-threshold N`: attempt to orphan blocks with less than N% of the committee vote. If this parameter isn't set then N defaults to 20% when the feature is enabled. * `--proposer-reorg-epochs-since-finalization N`: only attempt to re-org late blocks when the number of epochs since finalization is less than or equal to N. The default is 2 epochs, meaning re-orgs will only be attempted when the chain is finalizing optimally. For safety Lighthouse will only attempt a re-org under very specific conditions: 1. The block being proposed is 1 slot after the canonical head, and the canonical head is 1 slot after its parent. i.e. at slot `n + 1` rather than building on the block from slot `n` we build on the block from slot `n - 1`. 2. The current canonical head received less than N% of the committee vote. N should be set depending on the proposer boost fraction itself, the fraction of the network that is believed to be applying it, and the size of the largest entity that could be hoarding votes. 3. The current canonical head arrived after the attestation deadline from our perspective. This condition was only added to support suppression of forkchoiceUpdated messages, but makes intuitive sense. 4. The block is being proposed in the first 2 seconds of the slot. This gives it time to propagate and receive the proposer boost. ## Additional Info For the initial idea and background, see: https://github.com/ethereum/consensus-specs/pull/2353#issuecomment-950238004 There is also a specification for this feature here: https://github.com/ethereum/consensus-specs/pull/3034 Co-authored-by: Michael Sproul <micsproul@gmail.com> Co-authored-by: pawan <pawandhananjay@gmail.com>	2022-12-13 09:57:26 +00:00
Paul Hauner	6f79263a21	Make all validator monitor logs `INFO` (#3727 ) ## Issue Addressed NA ## Proposed Changes This is a potentially contentious change, but I find it annoying that the validator monitor logs `WARN` and `ERRO` for imperfect attestations. Perfect attestation performance is unachievable (don't believe those photo-shopped beauty magazines!) since missed and poorly-packed blocks by other validators will reduce your performance. When the validator monitor is on with 10s or more validators, I find the logs are washed out with ERROs that are not worth investigating. I suspect that users who really want to know if validators are missing attestations can do so by matching the content of the log, rather than the log level. I'm open to feedback about this, especially from anyone who is relying on the current log levels. ## Additional Info NA ## Breaking Changes Notes The validator monitor will no longer emit `WARN` and `ERRO` logs for sub-optimal attestation performance. The logs will now be emitted at `INFO` level. This change was introduced to avoid cluttering the `WARN` and `ERRO` logs with alerts that are frequently triggered by the actions of other network participants (e.g., a missed block) and require no action from the user.	2022-12-13 06:24:52 +00:00
GeemoCandama	1b28ef8a8d	Adding light_client gossip topics (#3693 ) ## Issue Addressed Implementing the light_client_gossip topics but I'm not there yet. Which issue # does this PR address? Partially #3651 ## Proposed Changes Add light client gossip topics. Please list or describe the changes introduced by this PR. I'm going to Implement light_client_finality_update and light_client_optimistic_update gossip topics. Currently I've attempted the former and I'm seeking feedback. ## Additional Info I've only implemented the light_client_finality_update topic because I wanted to make sure I was on the correct path. Also checking that the gossiped LightClientFinalityUpdate is the same as the locally constructed one is not implemented because caching the updates will make this much easier. Could someone give me some feedback on this please? Please provide any additional information. For example, future considerations or information useful for reviewers. Co-authored-by: GeemoCandama <104614073+GeemoCandama@users.noreply.github.com>	2022-12-13 06:24:51 +00:00
Michael Sproul	173a0abab4	Fix `Withdrawal` serialisation and check address change fork (#3789 ) * Disallow address changes before Capella * Quote u64s in Withdrawal serialisation	2022-12-13 17:03:21 +11:00
Justin Traglia	f7a54afde5	Fix some capella nits (#3782 )	2022-12-12 11:40:44 +11:00
Paul Hauner	c973bfc90c	Reduce log severity for late and unrevealed blocks (#3775 ) ## Issue Addressed NA ## Proposed Changes In #3725 I introduced a `CRIT` log for unrevealed payloads, against @michaelsproul's [advice](https://github.com/sigp/lighthouse/pull/3725#discussion_r1034142113). After being woken up in the middle of the night by a block that was not revealed to the BN but was revealed to the network, I have capitulated. This PR implements @michaelsproul's suggestion and reduces the severity to `ERRO`. Additionally, I have dropped a `CRIT` to an `ERRO` for when a block is published late. The block in question was indeed published late on the network, however now that we have builders that can slow down block production I don't think the error is "actionable" enough to warrant a `CRIT` for the user. ## Additional Info NA	2022-12-10 00:45:18 +00:00
Mac L	8cb9b5e126	Expose certain `validator_monitor` metrics to the HTTP API (#3760 ) ## Issue Addressed #3724 ## Proposed Changes Exposes certain `validator_monitor` as an endpoint on the HTTP API. Will only return metrics for validators which are actively being monitored. ### Usage ```bash curl -X GET "http://localhost:5052/lighthouse/ui/validator_metrics" -H "accept: application/json" \| jq ``` ```json { "data": { "validators": { "12345": { "attestation_hits": 10, "attestation_misses": 0, "attestation_hit_percentage": 100, "attestation_head_hits": 10, "attestation_head_misses": 0, "attestation_head_hit_percentage": 100, "attestation_target_hits": 5, "attestation_target_misses": 5, "attestation_target_hit_percentage": 50 } } } } ``` ## Additional Info Based on #3756 which should be merged first.	2022-12-09 06:39:19 +00:00
ethDreamer	b6486e809d	Fixed moar tests (#3774 )	2022-12-05 09:08:55 +11:00
ethDreamer	5282e200be	Merge 'upstream/unstable' into capella (#3773 ) * Add API endpoint to count statuses of all validators (#3756) * Delete DB schema migrations for v11 and earlier (#3761) Co-authored-by: Mac L <mjladson@pm.me> Co-authored-by: Michael Sproul <michael@sigmaprime.io>	2022-12-03 14:05:25 -06:00
ethDreamer	1a39976715	Fixed Compiler Warnings & Failing Tests (#3771 )	2022-12-03 10:42:12 +11:00
Michael Sproul	84392d63fa	Delete DB schema migrations for v11 and earlier (#3761 ) ## Proposed Changes Now that the Gnosis merge is scheduled, all users should have upgraded beyond Lighthouse v3.0.0. Accordingly we can delete schema migrations for versions prior to v3.0.0. ## Additional Info I also deleted the state cache stuff I added in #3714 as it turned out to be useless for the light client proofs due to the one-slot offset.	2022-12-02 00:07:43 +00:00
Mac L	18c9be595d	Add API endpoint to count statuses of all validators (#3756 ) ## Issue Addressed #3724 ## Proposed Changes Adds an endpoint to quickly count the number of occurances of each status in the validator set. ## Usage ```bash curl -X GET "http://localhost:5052/lighthouse/ui/validator_count" -H "accept: application/json" \| jq ``` ```json { "data": { "active_ongoing":479508, "active_exiting":0, "active_slashed":0, "pending_initialized":28, "pending_queued":0, "withdrawal_possible":933, "withdrawal_done":0, "exited_unslashed":0, "exited_slashed":3 } } ```	2022-12-01 06:03:53 +00:00
Mark Mackey	8a04c3428e	Merged with `unstable`	2022-11-30 17:29:10 -06:00
Michael Sproul	22115049ee	Prioritise important parts of block processing (#3696 ) ## Issue Addressed Closes https://github.com/sigp/lighthouse/issues/2327 ## Proposed Changes This is an extension of some ideas I implemented while working on `tree-states`: - Cache the indexed attestations from blocks in the `ConsensusContext`. Previously we were re-computing them 3-4 times over. - Clean up `import_block` by splitting each part into `import_block_XXX`. - Move some stuff off hot paths, specifically: - Relocate non-essential tasks that were running between receiving the payload verification status and priming the early attester cache. These tasks are moved after the cache priming: - Attestation observation - Validator monitor updates - Slasher updates - Updating the shuffling cache - Fork choice attestation observation now happens at the end of block verification in parallel with payload verification (this seems to save 5-10ms). - Payload verification now happens _before_ advancing the pre-state and writing it to disk! States were previously being written eagerly and adding ~20-30ms in front of verifying the execution payload. State catchup also sometimes takes ~500ms if we get a cache miss and need to rebuild the tree hash cache. The remaining task that's taking substantial time (~20ms) is importing the block to fork choice. I _think_ this is because of pull-tips, and we should be able to optimise it out with a clever total active balance cache in the state (which would be computed in parallel with payload verification). I've decided to leave that for future work though. For now it can be observed via the new `beacon_block_processing_post_exec_pre_attestable_seconds` metric. Co-authored-by: Michael Sproul <micsproul@gmail.com>	2022-11-30 05:22:58 +00:00
Divma	b4f4c0d253	Ipv6 bootnodes (#3752 ) ## Issue Addressed our bootnodes as of now support only ipv4. this makes it so that they support ipv6 ## Proposed Changes - Adds code necessary to update the bootnodes to run on dual stack nodes and therefore contact and store ipv6 nodes. - Adds some metrics about connectivity type of stored peers. It might have been nice to see some metrics over the sessions but that feels out of scope right now. ## Additional Info - some code quality improvements sneaked in since the changes seemed small - I think it depends on the OS, but enabling mapped addresses on an ipv6 node without dual stack support enabled could fail silently, making these nodes effectively ipv6 only. In the future I'll probably change this to use two sockets, which should fail loudly	2022-11-30 03:21:35 +00:00
Mark Mackey	f5e6a54f05	Refactored Execution Layer & Fixed Some Tests	2022-11-29 18:18:33 -06:00
Mark Mackey	36170ec428	Fixed some BeaconChain Tests	2022-11-29 18:18:18 -06:00
Mark Mackey	e0ea26c228	Remove withdrawals guard for PayloadAttributesV2	2022-11-29 18:03:29 -06:00
ethDreamer	342489a0c3	Fixed Payload Deserialization in DB (#3758 )	2022-11-30 10:27:13 +11:00
GeemoCandama	3534c85e30	Optimize finalized chain sync by skipping newPayload messages (#3738 ) ## Issue Addressed #3704 ## Proposed Changes Adds is_syncing_finalized: bool parameter for block verification functions. Sets the payload_verification_status to Optimistic if is_syncing_finalized is true. Uses SyncState in NetworkGlobals in BeaconProcessor to retrieve the syncing status. ## Additional Info I could implement FinalizedSignatureVerifiedBlock if you think it would be nicer.	2022-11-29 08:19:27 +00:00
Paul Hauner	a2969ba7de	Improve debugging experience for builder proposals (#3725 ) ## Issue Addressed NA ## Proposed Changes This PR sets out to improve the logging/metrics experience when interacting with the builder. Namely, it: - Adds/changes metrics (see "Metrics Changes" section). - Adds new logs which show the duration of requests to the builder/local EL. - Refactors existing logs for consistency and so that the `parent_hash` is include in all relevant logs (we can grep for this field when trying to trace the flow of block production). Additionally, when I was implementing this PR I noticed that we skip some verification of the builder payload in the scenario where the builder return `Ok` but the local EL returns with `Err`. Namely, we were skipping the bid signature and other values like parent hash and prev randao. In this PR I've changed it so we always check these values and reject the bid if they're incorrect. With these changes, we'll sometimes choose to skip a proposal rather than propose something invalid -- that's the only side-effect to the changes that I can see. ## Metrics Changes - Changed: `execution_layer_request_times`: - `method = "get_blinded_payload_local"`: time taken to get a payload from a local EE. - `method = "get_blinded_payload_builder"`: time taken to get a blinded payload from a builder. - `method = "post_blinded_payload_builder"`: time taken to get a builder to reveal a payload they've previously supplied us. - `execution_layer_get_payload_outcome` - `outcome = "success"`: we successfully produced a payload from a builder or local EE. - `outcome = "failure"`: we were unable to get a payload from a builder or local EE. - New: `execution_layer_builder_reveal_payload_outcome` - `outcome = "success"`: a builder revealed a payload from a signed, blinded block. - `outcome = "failure"`: the builder did not reveal the payload. - New: `execution_layer_get_payload_source` - `type = "builder"`: we used a payload from a builder to produce a block. - `type = "local"`: we used a payload from a local EE to produce a block. - New: `execution_layer_get_payload_builder_rejections` has a `reason` field to describe why we rejected a payload from a builder. - New: `execution_layer_payload_bids` tracks the bid (in gwei) from the builder or local EE (local EE not yet supported, waiting on EEs to expose the value). Can only record values that fit inside an i64 (roughly 9 million ETH). ## Additional Info NA	2022-11-29 05:51:42 +00:00
Age Manning	2779017076	Gossipsub fast message id change (#3755 ) For improved consistency, this mixes in the topic into our fast message id for more consistent tracking of messages across topics.	2022-11-28 07:36:52 +00:00
Mac L	c881b80367	Add CLI flag for gui requirements (#3731 ) ## Issue Addressed #3723 ## Proposed Changes Adds a new CLI flag `--gui` which enables all the various flags required for the gui to function properly. Currently enables the `--http` and `--validator-monitor-auto` flags.	2022-11-28 00:22:53 +00:00
antondlr	e9bf7f7cc1	remove commas from comma-separated kv pairs (#3737 ) ## Issue Addressed Logs are in comma separated kv list, but the values sometimes contain commas, which breaks parsing	2022-11-25 07:57:10 +00:00
Giulio rebuffo	d5a2de759b	Added LightClientBootstrap V1 (#3711 ) ## Issue Addressed Partially addresses #3651 ## Proposed Changes Adds server-side support for light_client_bootstrap_v1 topic ## Additional Info This PR, creates each time a bootstrap without using cache, I do not know how necessary a cache is in this case as this topic is not supposed to be called frequently and IMHO we can just prevent abuse by using the limiter, but let me know what you think or if there is any caveat to this, or if it is necessary only for the sake of good practice. Co-authored-by: Pawan Dhananjay <pawandhananjay@gmail.com>	2022-11-25 05:19:00 +00:00
Michael Sproul	788b337951	Op pool and gossip for BLS to execution changes (#3726 )	2022-11-25 07:09:26 +11:00
realbigsean	58b54f0a53	Rename excess blobs and update 4844 json RPC serialization/deserialization (#3745 ) * rename excess blobs and fix json serialization/deserialization * remove coments	2022-11-24 16:41:35 +11:00
Michael Sproul	e3ccd8fd4a	Two Capella bugfixes (#3749 ) * Two Capella bugfixes * fix payload default check in fork choice * Revert "fix payload default check in fork choice" This reverts commit `e56fefbd05`. Co-authored-by: realbigsean <sean@sigmaprime.io>	2022-11-24 15:14:06 +11:00

1 2 3 4 5 ...

2217 Commits