lighthouse

Author	SHA1	Message	Date
Michael Sproul	aa896decc1	Fix some beacon_chain tests	2023-01-12 19:13:01 +11:00
Michael Sproul	2af8110529	Merge remote-tracking branch 'origin/unstable' into capella Fixing the conflicts involved patching up some of the `block_hash` verification, the rest will be done as part of https://github.com/sigp/lighthouse/issues/3870	2023-01-12 16:22:00 +11:00
ethDreamer	52c1055fdc	Remove `withdrawals-processing` feature (#3864 ) * Use spec to Determine Supported Engine APIs * Remove `withdrawals-processing` feature * Fixed Tests * Missed Some Spots * Fixed Another Test * Stupid Clippy	2023-01-12 15:15:08 +11:00
Paul Hauner	38514c07f2	Release v3.4.0 (#3862 ) ## Issue Addressed NA ## Proposed Changes Bump versions ## Additional Info - [x] ~~Blocked on #3728, #3801~~ - [x] ~~Blocked on #3866~~ - [x] Requires additional testing	2023-01-11 03:27:08 +00:00
realbigsean	98b11bbd3f	add historical summaries (#3865 ) * add historical summaries * fix tree hash caching, disable the sanity slots test with fake crypto * add ssz static HistoricalSummary * only store historical summaries after capella * Teach `UpdatePattern` about Capella * Tidy EF tests * Clippy Co-authored-by: Michael Sproul <michael@sigmaprime.io>	2023-01-11 12:40:21 +11:00
Paul Hauner	830efdb5c2	Improve validator monitor experience for high validator counts (#3728 ) ## Issue Addressed NA ## Proposed Changes Myself and others (#3678) have observed that when running with lots of validators (e.g., 1000s) the cardinality is too much for Prometheus. I've seen Prometheus instances just grind to a halt when we turn the validator monitor on for our testnet validators (we have 10,000s of Goerli validators). Additionally, the debug log volume can get very high with one log per validator, per attestation. To address this, the `bn --validator-monitor-individual-tracking-threshold <INTEGER>` flag has been added to disable per-validator (i.e., non-aggregated) metrics/logging once the validator monitor exceeds the threshold of validators. The default value is `64`, which is a finger-to-the-wind value. I don't actually know the value at which Prometheus starts to become overwhelmed, but I've seen it work with ~64 validators and I've seen it not work with 1000s of validators. A default of `64` seems like it will result in a breaking change to users who are running millions of dollars worth of validators whilst resulting in a no-op for low-validator-count users. I'm open to changing this number, though. Additionally, this PR starts collecting aggregated Prometheus metrics (e.g., total count of head hits across all validators), so that high-validator-count validators still have some interesting metrics. We already had logging for aggregated values, so nothing has been added there. I've opted to make this a breaking change since it can be rather damaging to your Prometheus instance to accidentally enable the validator monitor with large numbers of validators. I've crashed a Prometheus instance myself and had a report from another user who's done the same thing. ## Additional Info NA ## Breaking Changes Note A new label has been added to the validator monitor Prometheus metrics: `total`. 
This label tracks the aggregated metrics of all validators in the validator monitor (as opposed to each validator being tracked individually using its pubkey as the label). Additionally, a new flag has been added to the Beacon Node: `--validator-monitor-individual-tracking-threshold`. The default value is `64`, which means that when the validator monitor is tracking more than 64 validators then it will stop tracking per-validator metrics and only track the `all_validators` metric. It will also stop logging per-validator logs and only emit aggregated logs (the exception being that exit and slashing logs are always emitted). These changes were introduced in #3728 to address issues with untenable Prometheus cardinality and log volume when using the validator monitor with high validator counts (e.g., 1000s of validators). Users with fewer than 65 validators will see no change in behavior (apart from the added `all_validators` metric). Users with more than 65 validators who wish to maintain the previous behavior can set something like `--validator-monitor-individual-tracking-threshold 999999`.	2023-01-09 08:18:55 +00:00
Michael Sproul	4bd2b777ec	Verify execution block hashes during finalized sync (#3794 ) ## Issue Addressed Recent discussions with other client devs about optimistic sync have revealed a conceptual issue with the optimisation implemented in #3738. In designing that feature I failed to consider that the execution node checks the `blockHash` of the execution payload before responding with `SYNCING`, and that omitting this check entirely results in a degradation of the full node's validation. A node omitting the `blockHash` checks could be tricked by a supermajority of validators into following an invalid chain, something which is ordinarily impossible. ## Proposed Changes I've added verification of the `payload.block_hash` in Lighthouse. In case of failure we log a warning and fall back to verifying the payload with the execution client. I've used our existing dependency on `ethers_core` for RLP support, and a new dependency on Parity's `triehash` crate for the Merkle patricia trie. Although the `triehash` crate is currently unmaintained it seems like our best option at the moment (it is also used by Reth, and requires vastly less boilerplate than Parity's generic `trie-root` library). Block hash verification is pretty quick, about 500us per block on my machine (mainnet). The optimistic finalized sync feature can be disabled using `--disable-optimistic-finalized-sync` which forces full verification with the EL. ## Additional Info This PR also introduces a new dependency on our [`metastruct`](https://github.com/sigp/metastruct) library, which was perfectly suited to the RLP serialization method. There will likely be changes as `metastruct` grows, but I think this is a good way to start dogfooding it. I took inspiration from some Parity and Reth code while writing this, and have preserved the relevant license headers on the files containing code that was copied and modified.	2023-01-09 03:11:59 +00:00
ethDreamer	11f4784ae6	Added bls_to_execution_changes to PersistedOpPool (#3857 ) * Added bls_to_execution_changes to PersistedOpPool	2023-01-09 12:38:02 +11:00
ethDreamer	cb94f639b0	Isolate withdrawals-processing Feature (#3854 )	2023-01-09 11:05:28 +11:00
ethDreamer	6b72f45cad	Merge pull request #3845 from realbigsean/capella-cleanup Capella cleanup	2023-01-06 13:26:41 -06:00
Age Manning	1d9a2022b4	Upgrade to libp2p v0.50.0 (#3764 ) I've needed to do this work in order to do some episub testing. This version of libp2p has not yet been released, so this is left as a draft for when we wish to update. Co-authored-by: Diva M <divma@protonmail.com>	2023-01-06 15:59:33 +00:00
Mark Mackey	2ac609b64e	Fixing Moar Failing Tests	2023-01-05 13:00:44 -06:00
Age Manning	4e5e7ee1fc	Restructure code for libp2p upgrade (#3850 ) Our custom RPC implementation is lagging behind the libp2p v50 version. We are going to need to change a bunch of function names, and it would be nice to have consistent ordering of function names inside the handlers. This is a precursor to the libp2p upgrade to minimize merge conflicts in function ordering.	2023-01-05 17:18:24 +00:00
Mark Mackey	8711db2f3b	Fix EF Tests	2023-01-04 15:14:43 -06:00
Mark Mackey	933772dd06	Fixed Operation Pool Tests	2023-01-03 18:40:35 -06:00
Mark Mackey	be232c4587	Update Execution Layer Tests for Capella	2023-01-03 16:58:15 -06:00
realbigsean	4353c49855	Update beacon_node/execution_layer/src/engine_api/json_structures.rs	2023-01-03 08:55:19 -05:00
realbigsean	d8f7277beb	cleanup	2022-12-30 11:00:14 -05:00
Mark Mackey	986ae4360a	Fix clippy complaints	2022-12-28 14:47:16 -06:00
Mark Mackey	c188cde034	merge upstream/unstable	2022-12-28 14:43:25 -06:00
Mark Mackey	c922566fbc	Fixed Some Tests	2022-12-27 15:59:34 -06:00
Mark Mackey	96da8b9383	Feature Guard V2 Engine API Methods	2022-12-27 15:55:43 -06:00
Mark Mackey	b75ca74222	Removed `withdrawals` feature flag	2022-12-19 15:38:46 -06:00
Mark Mackey	3a08c7634e	Make engine_getPayloadV2 accept local block value	2022-12-16 15:44:55 -06:00
Divma	ffbf70e2d9	Clippy lints for rust 1.66 (#3810 ) ## Issue Addressed Fixes the new clippy lints for rust 1.66 ## Proposed Changes Most of the changes come from: - [unnecessary_cast](https://rust-lang.github.io/rust-clippy/master/index.html#unnecessary_cast) - [iter_kv_map](https://rust-lang.github.io/rust-clippy/master/index.html#iter_kv_map) - [needless_borrow](https://rust-lang.github.io/rust-clippy/master/index.html#needless_borrow) ## Additional Info na	2022-12-16 04:04:00 +00:00
Mark Mackey	3e90fb8cae	Merge branch 'unstable' into capella	2022-12-15 12:20:03 -06:00
Divma	63c74b37f4	send error answering bbrange requests when an error occurs (#3800 ) ## Issue Addressed While testing withdrawals with @ethDreamer we noticed lighthouse is sending empty batches when an error occurs. As LH peer receiving this, we would consider this a low tolerance action because the peer is claiming the batch is right and is empty. ## Proposed Changes If any kind of error occurs, send an error response instead ## Additional Info Right now we don't handle such thing as a partial batch with an error. If an error is received, the whole batch is discarded. Because of this it makes little sense to send partial batches that end with an error, so it's better to do the proposed solution instead of sending empty batches.	2022-12-15 00:16:38 +00:00
Michael Sproul	f3e8ca852e	Fix Clippy	2022-12-14 14:04:13 +11:00
Michael Sproul	991e4094f8	Merge remote-tracking branch 'origin/unstable' into capella-update	2022-12-14 13:00:41 +11:00
Michael Sproul	63d3dd27fc	Batch API for address changes (#3798 )	2022-12-14 12:01:33 +11:00
Michael Sproul	75dd8780e0	Use JsonPayload for payload reconstruction (#3797 )	2022-12-14 11:52:46 +11:00
ethDreamer	07d6ef749a	Fixed Payload Reconstruction Bug (#3796 )	2022-12-14 11:49:30 +11:00
ethDreamer	b1c33361ea	Fixed Clippy Complaints & Some Failing Tests (#3791 ) * Fixed Clippy Complaints & Some Failing Tests * Update Dockerfile to Rust-1.65 * EF test file renamed * Touch up comments based on feedback	2022-12-13 10:50:24 -06:00
Michael Sproul	775d222299	Enable proposer boost re-orging (#2860 ) ## Proposed Changes With proposer boosting implemented (#2822) we have an opportunity to re-org out late blocks. This PR adds three flags to the BN to control this behaviour: * `--disable-proposer-reorgs`: turn aggressive re-orging off (it's on by default). * `--proposer-reorg-threshold N`: attempt to orphan blocks with less than N% of the committee vote. If this parameter isn't set then N defaults to 20% when the feature is enabled. * `--proposer-reorg-epochs-since-finalization N`: only attempt to re-org late blocks when the number of epochs since finalization is less than or equal to N. The default is 2 epochs, meaning re-orgs will only be attempted when the chain is finalizing optimally. For safety Lighthouse will only attempt a re-org under very specific conditions: 1. The block being proposed is 1 slot after the canonical head, and the canonical head is 1 slot after its parent. i.e. at slot `n + 1` rather than building on the block from slot `n` we build on the block from slot `n - 1`. 2. The current canonical head received less than N% of the committee vote. N should be set depending on the proposer boost fraction itself, the fraction of the network that is believed to be applying it, and the size of the largest entity that could be hoarding votes. 3. The current canonical head arrived after the attestation deadline from our perspective. This condition was only added to support suppression of forkchoiceUpdated messages, but makes intuitive sense. 4. The block is being proposed in the first 2 seconds of the slot. This gives it time to propagate and receive the proposer boost. 
## Additional Info For the initial idea and background, see: https://github.com/ethereum/consensus-specs/pull/2353#issuecomment-950238004 There is also a specification for this feature here: https://github.com/ethereum/consensus-specs/pull/3034 Co-authored-by: Michael Sproul <micsproul@gmail.com> Co-authored-by: pawan <pawandhananjay@gmail.com>	2022-12-13 09:57:26 +00:00
Paul Hauner	6f79263a21	Make all validator monitor logs `INFO` (#3727 ) ## Issue Addressed NA ## Proposed Changes This is a potentially contentious change, but I find it annoying that the validator monitor logs `WARN` and `ERRO` for imperfect attestations. Perfect attestation performance is unachievable (don't believe those photo-shopped beauty magazines!) since missed and poorly-packed blocks by other validators will reduce your performance. When the validator monitor is on with 10s or more validators, I find the logs are washed out with ERROs that are not worth investigating. I suspect that users who really want to know if validators are missing attestations can do so by matching the content of the log, rather than the log level. I'm open to feedback about this, especially from anyone who is relying on the current log levels. ## Additional Info NA ## Breaking Changes Notes The validator monitor will no longer emit `WARN` and `ERRO` logs for sub-optimal attestation performance. The logs will now be emitted at `INFO` level. This change was introduced to avoid cluttering the `WARN` and `ERRO` logs with alerts that are frequently triggered by the actions of other network participants (e.g., a missed block) and require no action from the user.	2022-12-13 06:24:52 +00:00
GeemoCandama	1b28ef8a8d	Adding light_client gossip topics (#3693 ) ## Issue Addressed Implementing the light_client_gossip topics but I'm not there yet. Which issue # does this PR address? Partially #3651 ## Proposed Changes Add light client gossip topics. Please list or describe the changes introduced by this PR. I'm going to Implement light_client_finality_update and light_client_optimistic_update gossip topics. Currently I've attempted the former and I'm seeking feedback. ## Additional Info I've only implemented the light_client_finality_update topic because I wanted to make sure I was on the correct path. Also checking that the gossiped LightClientFinalityUpdate is the same as the locally constructed one is not implemented because caching the updates will make this much easier. Could someone give me some feedback on this please? Please provide any additional information. For example, future considerations or information useful for reviewers. Co-authored-by: GeemoCandama <104614073+GeemoCandama@users.noreply.github.com>	2022-12-13 06:24:51 +00:00
Michael Sproul	173a0abab4	Fix `Withdrawal` serialisation and check address change fork (#3789 ) * Disallow address changes before Capella * Quote u64s in Withdrawal serialisation	2022-12-13 17:03:21 +11:00
Justin Traglia	f7a54afde5	Fix some capella nits (#3782 )	2022-12-12 11:40:44 +11:00
Paul Hauner	c973bfc90c	Reduce log severity for late and unrevealed blocks (#3775 ) ## Issue Addressed NA ## Proposed Changes In #3725 I introduced a `CRIT` log for unrevealed payloads, against @michaelsproul's [advice](https://github.com/sigp/lighthouse/pull/3725#discussion_r1034142113). After being woken up in the middle of the night by a block that was not revealed to the BN but was revealed to the network, I have capitulated. This PR implements @michaelsproul's suggestion and reduces the severity to `ERRO`. Additionally, I have dropped a `CRIT` to an `ERRO` for when a block is published late. The block in question was indeed published late on the network, however now that we have builders that can slow down block production I don't think the error is "actionable" enough to warrant a `CRIT` for the user. ## Additional Info NA	2022-12-10 00:45:18 +00:00
Mac L	8cb9b5e126	Expose certain `validator_monitor` metrics to the HTTP API (#3760 ) ## Issue Addressed #3724 ## Proposed Changes Exposes certain `validator_monitor` metrics as an endpoint on the HTTP API. Will only return metrics for validators which are actively being monitored. ### Usage ```bash curl -X GET "http://localhost:5052/lighthouse/ui/validator_metrics" -H "accept: application/json" | jq ``` ```json { "data": { "validators": { "12345": { "attestation_hits": 10, "attestation_misses": 0, "attestation_hit_percentage": 100, "attestation_head_hits": 10, "attestation_head_misses": 0, "attestation_head_hit_percentage": 100, "attestation_target_hits": 5, "attestation_target_misses": 5, "attestation_target_hit_percentage": 50 } } } } ``` ## Additional Info Based on #3756 which should be merged first.	2022-12-09 06:39:19 +00:00
ethDreamer	b6486e809d	Fixed moar tests (#3774 )	2022-12-05 09:08:55 +11:00
ethDreamer	5282e200be	Merge 'upstream/unstable' into capella (#3773 ) * Add API endpoint to count statuses of all validators (#3756) * Delete DB schema migrations for v11 and earlier (#3761) Co-authored-by: Mac L <mjladson@pm.me> Co-authored-by: Michael Sproul <michael@sigmaprime.io>	2022-12-03 14:05:25 -06:00
ethDreamer	1a39976715	Fixed Compiler Warnings & Failing Tests (#3771 )	2022-12-03 10:42:12 +11:00
Michael Sproul	84392d63fa	Delete DB schema migrations for v11 and earlier (#3761 ) ## Proposed Changes Now that the Gnosis merge is scheduled, all users should have upgraded beyond Lighthouse v3.0.0. Accordingly we can delete schema migrations for versions prior to v3.0.0. ## Additional Info I also deleted the state cache stuff I added in #3714 as it turned out to be useless for the light client proofs due to the one-slot offset.	2022-12-02 00:07:43 +00:00
Mac L	18c9be595d	Add API endpoint to count statuses of all validators (#3756 ) ## Issue Addressed #3724 ## Proposed Changes Adds an endpoint to quickly count the number of occurrences of each status in the validator set. ## Usage ```bash curl -X GET "http://localhost:5052/lighthouse/ui/validator_count" -H "accept: application/json" | jq ``` ```json { "data": { "active_ongoing":479508, "active_exiting":0, "active_slashed":0, "pending_initialized":28, "pending_queued":0, "withdrawal_possible":933, "withdrawal_done":0, "exited_unslashed":0, "exited_slashed":3 } } ```	2022-12-01 06:03:53 +00:00
Mark Mackey	8a04c3428e	Merged with `unstable`	2022-11-30 17:29:10 -06:00
Michael Sproul	22115049ee	Prioritise important parts of block processing (#3696 ) ## Issue Addressed Closes https://github.com/sigp/lighthouse/issues/2327 ## Proposed Changes This is an extension of some ideas I implemented while working on `tree-states`: - Cache the indexed attestations from blocks in the `ConsensusContext`. Previously we were re-computing them 3-4 times over. - Clean up `import_block` by splitting each part into `import_block_XXX`. - Move some stuff off hot paths, specifically: - Relocate non-essential tasks that were running between receiving the payload verification status and priming the early attester cache. These tasks are moved after the cache priming: - Attestation observation - Validator monitor updates - Slasher updates - Updating the shuffling cache - Fork choice attestation observation now happens at the end of block verification in parallel with payload verification (this seems to save 5-10ms). - Payload verification now happens _before_ advancing the pre-state and writing it to disk! States were previously being written eagerly and adding ~20-30ms in front of verifying the execution payload. State catchup also sometimes takes ~500ms if we get a cache miss and need to rebuild the tree hash cache. The remaining task that's taking substantial time (~20ms) is importing the block to fork choice. I _think_ this is because of pull-tips, and we should be able to optimise it out with a clever total active balance cache in the state (which would be computed in parallel with payload verification). I've decided to leave that for future work though. For now it can be observed via the new `beacon_block_processing_post_exec_pre_attestable_seconds` metric. Co-authored-by: Michael Sproul <micsproul@gmail.com>	2022-11-30 05:22:58 +00:00
Divma	b4f4c0d253	Ipv6 bootnodes (#3752 ) ## Issue Addressed our bootnodes as of now support only ipv4. this makes it so that they support ipv6 ## Proposed Changes - Adds code necessary to update the bootnodes to run on dual stack nodes and therefore contact and store ipv6 nodes. - Adds some metrics about connectivity type of stored peers. It might have been nice to see some metrics over the sessions but that feels out of scope right now. ## Additional Info - some code quality improvements sneaked in since the changes seemed small - I think it depends on the OS, but enabling mapped addresses on an ipv6 node without dual stack support enabled could fail silently, making these nodes effectively ipv6 only. In the future I'll probably change this to use two sockets, which should fail loudly	2022-11-30 03:21:35 +00:00
Mark Mackey	f5e6a54f05	Refactored Execution Layer & Fixed Some Tests	2022-11-29 18:18:33 -06:00
Mark Mackey	36170ec428	Fixed some BeaconChain Tests	2022-11-29 18:18:18 -06:00
Mark Mackey	e0ea26c228	Remove withdrawals guard for PayloadAttributesV2	2022-11-29 18:03:29 -06:00
ethDreamer	342489a0c3	Fixed Payload Deserialization in DB (#3758 )	2022-11-30 10:27:13 +11:00
GeemoCandama	3534c85e30	Optimize finalized chain sync by skipping newPayload messages (#3738 ) ## Issue Addressed #3704 ## Proposed Changes Adds is_syncing_finalized: bool parameter for block verification functions. Sets the payload_verification_status to Optimistic if is_syncing_finalized is true. Uses SyncState in NetworkGlobals in BeaconProcessor to retrieve the syncing status. ## Additional Info I could implement FinalizedSignatureVerifiedBlock if you think it would be nicer.	2022-11-29 08:19:27 +00:00
Paul Hauner	a2969ba7de	Improve debugging experience for builder proposals (#3725 ) ## Issue Addressed NA ## Proposed Changes This PR sets out to improve the logging/metrics experience when interacting with the builder. Namely, it: - Adds/changes metrics (see "Metrics Changes" section). - Adds new logs which show the duration of requests to the builder/local EL. - Refactors existing logs for consistency and so that the `parent_hash` is included in all relevant logs (we can grep for this field when trying to trace the flow of block production). Additionally, when I was implementing this PR I noticed that we skip some verification of the builder payload in the scenario where the builder returns `Ok` but the local EL returns with `Err`. Namely, we were skipping the bid signature and other values like parent hash and prev randao. In this PR I've changed it so we always check these values and reject the bid if they're incorrect. With these changes, we'll sometimes choose to skip a proposal rather than propose something invalid -- that's the only side-effect to the changes that I can see. ## Metrics Changes - Changed: `execution_layer_request_times`: - `method = "get_blinded_payload_local"`: time taken to get a payload from a local EE. - `method = "get_blinded_payload_builder"`: time taken to get a blinded payload from a builder. - `method = "post_blinded_payload_builder"`: time taken to get a builder to reveal a payload they've previously supplied us. - `execution_layer_get_payload_outcome` - `outcome = "success"`: we successfully produced a payload from a builder or local EE. - `outcome = "failure"`: we were unable to get a payload from a builder or local EE. - New: `execution_layer_builder_reveal_payload_outcome` - `outcome = "success"`: a builder revealed a payload from a signed, blinded block. - `outcome = "failure"`: the builder did not reveal the payload. 
- New: `execution_layer_get_payload_source` - `type = "builder"`: we used a payload from a builder to produce a block. - `type = "local"`: we used a payload from a local EE to produce a block. - New: `execution_layer_get_payload_builder_rejections` has a `reason` field to describe why we rejected a payload from a builder. - New: `execution_layer_payload_bids` tracks the bid (in gwei) from the builder or local EE (local EE not yet supported, waiting on EEs to expose the value). Can only record values that fit inside an i64 (roughly 9 million ETH). ## Additional Info NA	2022-11-29 05:51:42 +00:00
Age Manning	2779017076	Gossipsub fast message id change (#3755 ) For improved consistency, this mixes in the topic into our fast message id for more consistent tracking of messages across topics.	2022-11-28 07:36:52 +00:00
Mac L	c881b80367	Add CLI flag for gui requirements (#3731 ) ## Issue Addressed #3723 ## Proposed Changes Adds a new CLI flag `--gui` which enables all the various flags required for the gui to function properly. Currently enables the `--http` and `--validator-monitor-auto` flags.	2022-11-28 00:22:53 +00:00
antondlr	e9bf7f7cc1	remove commas from comma-separated kv pairs (#3737 ) ## Issue Addressed Logs are in comma separated kv list, but the values sometimes contain commas, which breaks parsing	2022-11-25 07:57:10 +00:00
Giulio rebuffo	d5a2de759b	Added LightClientBootstrap V1 (#3711 ) ## Issue Addressed Partially addresses #3651 ## Proposed Changes Adds server-side support for the light_client_bootstrap_v1 topic ## Additional Info This PR creates a bootstrap on each request without using a cache. I do not know how necessary a cache is in this case, as this topic is not supposed to be called frequently and IMHO we can just prevent abuse by using the limiter, but let me know what you think, whether there is any caveat to this, or whether it is necessary only for the sake of good practice. Co-authored-by: Pawan Dhananjay <pawandhananjay@gmail.com>	2022-11-25 05:19:00 +00:00
Michael Sproul	788b337951	Op pool and gossip for BLS to execution changes (#3726 )	2022-11-25 07:09:26 +11:00
realbigsean	58b54f0a53	Rename excess blobs and update 4844 json RPC serialization/deserialization (#3745 ) * rename excess blobs and fix json serialization/deserialization * remove comments	2022-11-24 16:41:35 +11:00
Michael Sproul	e3ccd8fd4a	Two Capella bugfixes (#3749 ) * Two Capella bugfixes * fix payload default check in fork choice * Revert "fix payload default check in fork choice" This reverts commit `e56fefbd05`. Co-authored-by: realbigsean <sean@sigmaprime.io>	2022-11-24 15:14:06 +11:00
Paul Hauner	bf533c8e42	v3.3.0 (#3741 ) ## Issue Addressed NA ## Proposed Changes - Bump versions - Pin the `nethermind` version since our method of getting the latest tags on `master` is giving us an old version (`1.14.1`). - Increase timeout for execution engine startup. ## Additional Info - [x] ~Awaiting further testing~	2022-11-23 23:38:32 +00:00
ethDreamer	28c9603505	Stuuupid camelCase (#3748 )	2022-11-23 14:42:58 +11:00
realbigsean	0228b2b42d	- fix pre-merge block production (#3746 ) - return `None` on pre-4844 blob requests	2022-11-22 17:10:40 -06:00
ethDreamer	24e5252a55	Massive Update to Engine API (#3740 ) * Massive Update to Engine API * Update beacon_node/execution_layer/src/engine_api/json_structures.rs Co-authored-by: Michael Sproul <micsproul@gmail.com> * Update beacon_node/execution_layer/src/engine_api/json_structures.rs Co-authored-by: Michael Sproul <micsproul@gmail.com> * Update beacon_node/beacon_chain/src/execution_payload.rs Co-authored-by: realbigsean <seananderson33@GMAIL.com> * Update beacon_node/execution_layer/src/engine_api.rs Co-authored-by: realbigsean <seananderson33@GMAIL.com> Co-authored-by: Michael Sproul <micsproul@gmail.com> Co-authored-by: realbigsean <seananderson33@GMAIL.com>	2022-11-22 13:27:48 -05:00
Michael Sproul	61b4bbf870	Fix BlocksByRoot response types (#3743 )	2022-11-22 12:29:47 -05:00
Michael Sproul	b477c42748	Lower deposit finalization error to warning (#3739 ) ## Issue Addressed Partially addresses #3707 ## Proposed Changes Drop `ERRO` log to `WARN` until we identify the exact conditions that lead to this case. Add a message which hopefully reassures users who only see this log once 😅 Add the block hash to the error message in case it will prove useful in debugging the root cause.	2022-11-21 06:29:03 +00:00
Akihito Nakano	8a36acdb1a	Super small improvement: Remove unnecessary `mut` (#3736 ) ## Issue Addressed <!--Which issue # does this PR address?--> Removed some unnecessary `mut`. 🙂 <!-- ## Proposed Changes Please list or describe the changes introduced by this PR. --> <!-- ## Additional Info Please provide any additional information. For example, future considerations or information useful for reviewers. -->	2022-11-21 03:15:54 +00:00
Pawan Dhananjay	857ef25d28	Add metrics for subnet queries (#3721 ) ## Issue Addressed N/A ## Proposed Changes Add metrics for peers discovered in subnet discv5 queries.	2022-11-15 13:25:38 +00:00
Michael Sproul	713b6a18d4	Simplify GossipTopic -> String conversion (#3722 ) ## Proposed Changes With a few different changes to the gossip topics in flight (light clients, Capella, 4844, etc) I think this simplification makes sense. I noticed it while plumbing through a new Capella topic.	2022-11-15 05:21:48 +00:00
Age Manning	230168deff	Health Endpoints for UI (#3668 ) This PR adds some health endpoints for the beacon node and the validator client. Specifically it adds the endpoint: `/lighthouse/ui/health` These are not entirely stable yet, but provide a base for modification for our UI. These also may have issues with various platforms and may need modification.	2022-11-15 05:21:26 +00:00
Michael Sproul	0cdd049da9	Fixes to make EF Capella tests pass (#3719 ) * Fixes to make EF Capella tests pass * Clippy for state_processing	2022-11-14 13:14:31 -06:00
Michael Sproul	3be41006a6	Add --light-client-server flag and state cache utils (#3714 ) ## Issue Addressed Part of https://github.com/sigp/lighthouse/issues/3651. ## Proposed Changes Add a flag for enabling the light client server, which should be checked before gossip/RPC traffic is processed (e.g. https://github.com/sigp/lighthouse/pull/3693, https://github.com/sigp/lighthouse/pull/3711). The flag is available at runtime from `beacon_chain.config.enable_light_client_server`. Additionally, a new method `BeaconChain::with_mutable_state_for_block` is added which I envisage being used for computing light client updates. Unfortunately its performance will be quite poor on average because it will only run quickly with access to the tree hash cache. Each slot the tree hash cache is only available for a brief window of time between the head block being processed and the state advance at 9s in the slot. When the state advance happens the cache is moved and mutated to get ready for the next slot, which makes it no longer useful for merkle proofs related to the head block. Rather than spend more time trying to optimise this I think we should continue prototyping with this code, and I'll make sure `tree-states` is ready to ship before we enable the light client server in prod (cf. https://github.com/sigp/lighthouse/pull/3206). ## Additional Info I also fixed a bug in the implementation of `BeaconState::compute_merkle_proof` whereby the tree hash cache was moved with `.take()` but never put back with `.restore()`.	2022-11-11 11:03:18 +00:00
GeemoCandama	c591fcd201	add checkpoint-sync-url-timeout flag (#3710 ) ## Issue Addressed #3702 ## Proposed Changes Added the checkpoint-sync-url-timeout flag to the CLI. Added a timeout field to ClientGenesis::CheckpointSyncUrl to make use of the configured timeout. Co-authored-by: GeemoCandama <104614073+GeemoCandama@users.noreply.github.com> Co-authored-by: Michael Sproul <micsproul@gmail.com>	2022-11-11 00:38:28 +00:00
Michael Sproul	d99bfcf1a5	Blinded block and RANDAO APIs (#3571 ) ## Issue Addressed https://github.com/ethereum/beacon-APIs/pull/241 https://github.com/ethereum/beacon-APIs/pull/242 ## Proposed Changes Implement two new endpoints for fetching blinded blocks and RANDAO mixes. Co-authored-by: realbigsean <sean@sigmaprime.io>	2022-11-11 00:38:27 +00:00
Mark Mackey	81319dfcae	Forgot one feature guard	2022-11-10 15:33:26 -06:00
Mark Mackey	756e48f5dc	BeaconState field renamed	2022-11-10 11:49:55 -06:00
Mark Mackey	2191242341	Added stuff that NEEDS IMPLEMENTING	2022-11-09 19:35:01 -06:00
Mark Mackey	2d01ae6036	Fixed compiling with withdrawals enabled	2022-11-09 19:34:19 -06:00
tim gretler	266d765285	Register blocks in validator monitor (#3635 ) ## Issue Addressed Closes #3460 ## Proposed Changes `blocks` and `block_min_delay` are never updated in the epoch summary Co-authored-by: Michael Sproul <micsproul@gmail.com>	2022-11-09 05:37:09 +00:00
realbigsean	bc0af72c74	fix topic name	2022-11-07 12:36:31 -05:00
Divma	84c7d8cc70	Blocklookup data inconsistencies (#3677 ) ## Issue Addressed Closes #3649 ## Proposed Changes Add a regression test for the data inconsistency, catching the problem in `31e88c5533` [here](https://github.com/sigp/lighthouse/actions/runs/3379894044/jobs/5612044797#step:6:2043). When a chain is sent for processing, move it to a separate collection and now the test works, yay! ## Additional Info na	2022-11-07 06:48:34 +00:00
Paul Hauner	0655006e87	Clarify error log when registering validators (#3650 ) ## Issue Addressed NA ## Proposed Changes Adds clarification to an error log when there is an error submitting a validator registration. There seem to be a few cases where relays return errors during validator registration, including spurious timeouts and when a validator has been very recently activated/made pending. Changing this log helps indicate that it's "just another registration error" rather than something more serious. I didn't drop this to a `WARN` since I still have hope we can eliminate these errors completely by chatting with relays and adjusting timeouts. ## Additional Info NA	2022-11-07 06:48:31 +00:00
realbigsean	fc0b06a039	Feature gate withdrawals (#3684 ) * start feature gating * feature gate withdrawals	2022-11-04 16:50:26 -04:00
realbigsean	1aec17b09c	Merge branch 'unstable' of https://github.com/sigp/lighthouse into eip4844	2022-11-04 13:23:55 -04:00
Divma	8600645f65	Fix rust 1.65 lints (#3682 ) ## Issue Addressed New lints for rust 1.65 ## Proposed Changes Notable change is the identification of parameters that are only used in recursion ## Additional Info na	2022-11-04 07:43:43 +00:00
realbigsean	c45b809b76	Cleanup payload types (#3675 ) * Add transparent support * Add `Config` struct * Deprecate `enum_behaviour` * Partially remove enum_behaviour from project * Revert "Partially remove enum_behaviour from project" This reverts commit 46ffb7fe77622cf420f7ba2fccf432c0050535d6. * Revert "Deprecate `enum_behaviour`" This reverts commit 89b64a6f53d0f68685be88d5b60d39799d9933b5. * Add `struct_behaviour` * Tidy * Move tests into `ssz_derive` * Bump ssz derive * Fix comment * newtype transparent ssz * use ssz transparent and create macros for per fork implementations * use superstruct map macros Co-authored-by: Paul Hauner <paul@paulhauner.com>	2022-11-02 10:30:41 -04:00
realbigsean	d8a49aad2b	merge with unstable fixes	2022-11-01 13:26:56 -04:00
realbigsean	8656d23327	merge with unstable	2022-11-01 13:18:00 -04:00
Pawan Dhananjay	29f2ec46d3	Couple blocks and blobs in gossip (#3670 ) * Revert "Add more gossip verification conditions" This reverts commit `1430b561c3`. * Revert "Add todos" This reverts commit `91efb9d4c7`. * Revert "Reprocess blob sidecar messages" This reverts commit `21bf3d37cd`. * Add the coupled topic * Decode SignedBeaconBlockAndBlobsSidecar correctly * Process Block and Blobs in beacon processor * Remove extra blob publishing logic from vc * Remove blob signing in vc * Ugly hack to compile	2022-11-01 10:28:21 -04:00
ethDreamer	e8604757a2	Deposit Cache Finalization & Fast WS Sync (#2915 ) ## Summary The deposit cache now has the ability to finalize deposits. This will cause it to drop unneeded deposit logs and hashes in the deposit Merkle tree that are no longer required to construct deposit proofs. The cache is finalized whenever the latest finalized checkpoint has a new `Eth1Data` with all deposits imported. This has three benefits: 1. Improves the speed of constructing Merkle proofs for deposits as we can just replay deposits since the last finalized checkpoint instead of all historical deposits when re-constructing the Merkle tree. 2. Significantly faster weak subjectivity sync as the deposit cache can be transferred to the newly syncing node in compressed form. The Merkle tree that stores `N` finalized deposits requires a maximum of `log2(N)` hashes. The newly syncing node then only needs to download deposits since the last finalized checkpoint to have a full tree. 3. Future proofing in preparation for [EIP-4444](https://eips.ethereum.org/EIPS/eip-4444) as execution nodes will no longer be required to store logs permanently so we won't always have all historical logs available to us. ## More Details Image to illustrate how the deposit contract merkle tree evolves and finalizes along with the resulting `DepositTreeSnapshot` ![image](https://user-images.githubusercontent.com/37123614/151465302-5fc56284-8a69-4998-b20e-45db3934ac70.png) ## Other Considerations I've changed the structure of the `SszDepositCache` so once you load & save your database from this version of lighthouse, you will no longer be able to load it from older versions. Co-authored-by: ethDreamer <37123614+ethDreamer@users.noreply.github.com>	2022-10-30 04:04:24 +00:00
Divma	46fbf5b98b	Update discv5 (#3171 ) ## Issue Addressed Updates discv5 Pending on - [x] #3547 - [x] Alex upgrades his deps ## Proposed Changes updates discv5 and the enr crate. The only relevant change would be some clear indications of ipv4 usage in lighthouse ## Additional Info Functionally, this should be equivalent to the prev version. As draft pending a discv5 release	2022-10-28 05:40:06 +00:00
ethDreamer	f1a3b3b01c	Added Capella Epoch Processing Logic (#3666 )	2022-10-27 17:41:39 -04:00
realbigsean	137f230344	Capella eip 4844 cleanup (#3652 ) * add capella gossip boiler plate * get everything compiling Co-authored-by: realbigsean <sean@sigmaprime.io> Co-authored-by: Mark Mackey <mark@sigmaprime.io> * small cleanup * small cleanup * cargo fix + some test cleanup * improve block production * add fixme for potential panic Co-authored-by: Mark Mackey <mark@sigmaprime.io>	2022-10-26 15:15:26 -04:00
Michael Sproul	6d5a2b509f	Release v3.2.1 (#3660 ) ## Proposed Changes Patch release to include the performance regression fix https://github.com/sigp/lighthouse/pull/3658. ## Additional Info ~~Blocked on the merge of https://github.com/sigp/lighthouse/pull/3658.~~	2022-10-26 09:38:25 +00:00
Michael Sproul	77eabc5401	Revert "Optimise HTTP validator lookups" (#3658 ) ## Issue Addressed This reverts commit `ca9dc8e094` (PR #3559) with some modifications. ## Proposed Changes Unfortunately that PR introduced a performance regression in fork choice. The optimisation _intended_ to build the exit and pubkey caches on the head state _only if_ they were not already built. However, due to the head state always being cloned without these caches, we ended up building them every time the head changed, leading to a ~70ms+ penalty on mainnet. `fcfd02aeec/beacon_node/beacon_chain/src/canonical_head.rs (L633-L636)` I believe this is a severe enough regression to justify immediately releasing v3.2.1 with this change. ## Additional Info I didn't fully revert #3559, because there were some unrelated deletions of dead code in that PR which I figured we may as well keep. An alternative would be to clone the extra caches, but this likely still imposes some cost, so in the interest of applying a conservative fix quickly, I think reversion is the best approach. The optimisation from #3559 was not even optimising a particularly significant path, it was mostly for VCs running larger numbers of inactive keys. We can re-do it in the `tree-states` world where cache clones are cheap.	2022-10-26 06:50:04 +00:00
Paul Hauner	fcfd02aeec	Release v3.2.0 (#3647 ) ## Issue Addressed NA ## Proposed Changes Bump version to `v3.2.0` ## Additional Info - ~~Blocked on #3597~~ - ~~Blocked on #3645~~ - ~~Blocked on #3653~~ - ~~Requires additional testing~~	2022-10-25 06:36:51 +00:00
Divma	3a5888e53d	Ban and unban peers at the swarm level (#3653 ) ## Issue Addressed I missed this from https://github.com/sigp/lighthouse/pull/3491. Peers were being banned at the behaviour level only. The identify errors are explained by this as well ## Proposed Changes Add banning and unbanning ## Additional Info Before, having tests that catch this was hard because the swarm was outside the behaviour. We could now have tests that prevent something like this in the future	2022-10-24 21:39:30 +00:00
pinkiebell	d0efb6b18a	beacon_node: add --disable-deposit-contract-sync flag (#3597 ) Overrides any previous option that enables the eth1 service. Useful for operating a `light` beacon node. Co-authored-by: Michael Sproul <micsproul@gmail.com>	2022-10-19 22:55:49 +00:00
GeemoCandama	c5cd0d9b3f	add execution-timeout-multiplier flag to optionally increase timeouts (#3631 ) ## Issue Addressed Add flag to lengthen execution layer timeouts #3607 ## Proposed Changes Added execution-timeout-multiplier flag and a cli test to ensure the execution layer config has the multiplier set correctly. Add execution_timeout_multiplier to the execution layer config as Option<u32> and pass the u32 to HttpJsonRpc. ## Additional Info Not certain that this is the best way to implement it so I'd appreciate any feedback.	2022-10-18 04:02:07 +00:00
Michael Sproul	edf23bb40e	Fix attestation shuffling filter (#3629 ) ## Issue Addressed Fix a bug in block production that results in blocks with 0 attestations during the first slot of an epoch. The bug is marked by debug logs of the form: > DEBG Discarding attestation because of missing ancestor, block_root: 0x3cc00d9c9e0883b2d0db8606278f2b8423d4902f9a1ee619258b5b60590e64f8, pivot_slot: 4042591 It occurs when trying to look up the shuffling decision root for an attestation from a slot which is prior to fork choice's finalized block. This happens frequently when proposing in the first slot of the epoch where we have: - `current_epoch == n` - `attestation.data.target.epoch == n - 1` - attestation shuffling epoch `== n - 3` (decision block being the last block of `n - 3`) - `state.finalized_checkpoint.epoch == n - 2` (first block of `n - 2` is finalized) Hence the shuffling decision slot is out of range of the fork choice backwards iterator _by a single slot_. Unfortunately this bug was hidden when we weren't pruning fork choice, and then reintroduced in v2.5.1 when we fixed the pruning (https://github.com/sigp/lighthouse/releases/tag/v2.5.1). There's no way to turn that off or disable the filtering in our current release, so we need a new release to fix this issue. Fortunately, it also does not occur on every epoch boundary because of the gradual pruning of fork choice every 256 blocks (~8 epochs): `01e84b71f5/consensus/proto_array/src/proto_array_fork_choice.rs (L16)` `01e84b71f5/consensus/proto_array/src/proto_array.rs (L713-L716)` So the probability of proposing a 0-attestation block given a proposal assignment is approximately `1/32 * 1/8 = 0.39%`. ## Proposed Changes - Load the block's shuffling ID from fork choice and verify it against the expected shuffling ID of the head state. 
This code was initially written before we had settled on a representation of shuffling IDs, so I think it's a nice simplification to make use of them here rather than more ad-hoc logic that fundamentally does the same thing. ## Additional Info Thanks to @moshe-blox for noticing this issue and bringing it to our attention.	2022-10-18 04:02:06 +00:00
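
The `1/32 * 1/8` estimate quoted in the message above can be checked with a short calculation. This is only a sketch of the arithmetic: the function name is hypothetical, and the two inputs (32 slots per epoch, pruning roughly every 8 epochs) are taken from the commit message rather than from Lighthouse code.

```rust
/// Fraction of proposals expected to hit the bug, assuming the proposal must
/// land in the first slot of an epoch AND on an epoch boundary affected by
/// fork choice pruning. Both inputs come from the commit message.
fn zero_attestation_probability(slots_per_epoch: u32, epochs_per_prune: u32) -> f64 {
    (1.0 / slots_per_epoch as f64) * (1.0 / epochs_per_prune as f64)
}

fn main() {
    let p = zero_attestation_probability(32, 8);
    // 1/256 expressed as a percentage, rounded to two decimal places.
    println!("{:.2}%", p * 100.0);
}
```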
Michael Sproul	59ec6b71b8	Consensus context with proposer index caching (#3604 ) ## Issue Addressed Closes https://github.com/sigp/lighthouse/issues/2371 ## Proposed Changes Backport some changes from `tree-states` that remove duplicated calculations of the `proposer_index`. With this change the proposer index should be calculated only once for each block, and then plumbed through to every place it is required. ## Additional Info In future I hope to add more data to the consensus context that is cached on a per-epoch basis, like the effective balances of validators and the base rewards. There are some other changes to remove indexing in tests that were also useful for `tree-states` (the `tree-states` types don't implement `Index`).	2022-10-15 22:25:54 +00:00
Michael Sproul	e4cbdc1c77	Optimistic sync spec tests (v1.2.0) (#3564 ) ## Issue Addressed Implements new optimistic sync test format from https://github.com/ethereum/consensus-specs/pull/2982. ## Proposed Changes - Add parsing and runner support for the new test format. - Extend the mock EL with a set of canned responses keyed by block hash. Although this doubles up on some of the existing functionality I think it's really nice to use compared to the `preloaded_responses` or static responses. I think we could write novel opt sync tests using these primitives much more easily than the previous ones. Forks are natively supported, and different responses to `forkchoiceUpdated` and `newPayload` are also straight-forward. ## Additional Info Blocked on merge of the spec PR and release of new test vectors.	2022-10-15 22:25:52 +00:00
Michael Sproul	ca9dc8e094	Optimise HTTP validator lookups (#3559 ) ## Issue Addressed While digging around in some logs I noticed that queries for validators by pubkey were taking 10ms+, which seemed too long. This was due to a loop through the entire validator registry for each lookup. ## Proposed Changes Rather than using a loop through the registry, this PR utilises the pubkey cache which is usually initialised at the head. In case the cache isn't built, we fall back to the previous loop logic. In the vast majority of cases I expect the cache will be built, as the validator client queries at the `head` where all caches should be built. ## Additional Info I had to modify the cache build that runs after fork choice to build the pubkey cache. I think it had been optimised out, perhaps accidentally. I think it's preferable to have the exit cache and the pubkey cache built on the head state, as they are required for verifying deposits and exits respectively, and we may as well build them off the hot path of block processing. Previously they'd get built the first time a deposit or exit needed to be verified. I've deleted the unused `map_state` function which was obsoleted by `map_state_and_execution_optimistic`.	2022-10-15 22:25:51 +00:00
ethDreamer	221c433d62	Fixed a ton of state_processing stuff (#3642 ) FIXME's: * consensus/fork_choice/src/fork_choice.rs * consensus/state_processing/src/per_epoch_processing/capella.rs * consensus/types/src/execution_payload_header.rs TODO's: * consensus/state_processing/src/per_epoch_processing/capella/partial_withdrawals.rs * consensus/state_processing/src/per_epoch_processing/capella/full_withdrawals.rs	2022-10-14 17:35:10 -05:00
ethDreamer	255fdf0724	Added Capella Data Structures to consensus/types (#3637 ) * Ran Cargo fmt * Added Capella Data Structures to consensus/types	2022-10-13 09:37:20 -05:00
Pawan Dhananjay	1430b561c3	Add more gossip verification conditions	2022-10-06 21:16:59 -05:00
realbigsean	44515b8cbe	cargo fix	2022-10-05 17:20:54 -04:00
realbigsean	b5b4ce9509	blob production	2022-10-05 17:14:45 -04:00
Pawan Dhananjay	91efb9d4c7	Add todos	2022-10-05 02:56:57 -05:00
Pawan Dhananjay	21bf3d37cd	Reprocess blob sidecar messages	2022-10-05 02:52:26 -05:00
Pawan Dhananjay	c55b28bf10	Minor fixes	2022-10-04 19:18:06 -05:00
Pawan Dhananjay	12fe514550	Add more gossip verification functions for blobs	2022-10-04 19:17:53 -05:00
Pawan Dhananjay	9d99c784ea	Add gossip verification stub	2022-10-04 17:54:14 -05:00
realbigsean	7527c2b455	fix RPC limit add blob signing domain	2022-10-04 14:57:29 -04:00
realbigsean	ba16a037a3	cleanup	2022-10-04 09:34:05 -04:00
mariuspod	242ae21e5d	Pass EL JWT secret key via cli flag (#3568 ) ## Proposed Changes In this change I've added a new beacon_node cli flag `--execution-jwt-secret-key` for passing the JWT secret directly as string. Without this flag, it was non-trivial to pass a secrets file containing a JWT secret key without compromising its contents into some management repo or fiddling around with manual file mounts for cloud-based deployments. When used in combination with environment variables, the secret can be injected into container-based systems like docker & friends quite easily. It's both possible to either specify the file_path to the JWT secret or pass the JWT secret directly. I've modified the docs and attached a test as well. ## Additional Info The logic has been adapted a bit so that either one of `--execution-jwt` or `--execution-jwt-secret-key` must be set when specifying `--execution-endpoint` so that it's still compatible with the semantics before this change and there's at least one secret provided.	2022-10-04 12:41:03 +00:00
realbigsean	c0dc42ea07	cargo fmt	2022-10-04 08:21:46 -04:00
Divma	4926e3967f	[DEV FEATURE] Deterministic long lived subnets (#3453 ) ## Issue Addressed #2847 ## Proposed Changes Add under a feature flag the required changes to subscribe to long lived subnets in a deterministic way ## Additional Info There is an additional required change that is actually searching for peers using the prefix, but I find that it's best to make this change in the future	2022-10-04 10:37:48 +00:00
GeemoCandama	6a92bf70e4	CLI tests for logging flags (#3609 ) ## Issue Addressed Adding CLI tests for logging flags: log-color and disable-log-timestamp #3588 ## Proposed Changes Add CLI tests for logging flags as described in #3588. Added logger_config to client::Config as suggested. Implemented Default for LoggerConfig based on what was being done elsewhere in the repo. Created 2 tests for each flag addressed.	2022-10-04 08:33:40 +00:00
Pawan Dhananjay	8728c40102	Remove fallback support from eth1 service (#3594 ) ## Issue Addressed N/A ## Proposed Changes With https://github.com/sigp/lighthouse/pull/3214 we made it such that you can either have 1 auth endpoint or multiple non auth endpoints. Now that we are post merge on all networks (testnets and mainnet), we cannot progress a chain without a dedicated auth execution layer connection so there is no point in having a non-auth eth1-endpoint for syncing deposit cache. This code removes all fallback related code in the eth1 service. We still keep the single non-auth endpoint since it's useful for testing. ## Additional Info This removes all eth1 fallback related metrics that were relevant for the monitoring service, so we might need to change the api upstream.	2022-10-04 08:33:39 +00:00
realbigsean	8d45e48775	cargo fix	2022-10-03 21:52:16 -04:00
realbigsean	e81dbbfea4	compile	2022-10-03 21:48:02 -04:00
realbigsean	88006735c4	compile	2022-10-03 10:06:04 -04:00
realbigsean	7520651515	cargo fix and some test fixes	2022-09-29 12:43:35 -04:00
realbigsean	fe6fc55449	fix compilation errors, rename capella -> shanghai, cleanup some rebase issues	2022-09-29 12:43:13 -04:00
realbigsean	809b52715e	some block building updates	2022-09-29 12:38:00 -04:00
realbigsean	acaa340b41	add new beacon state variant for shanghai	2022-09-29 12:37:14 -04:00
realbigsean	203418ffc9	add `engine_getBlobV1`	2022-09-29 12:35:55 -04:00
realbigsean	3f1e5cee78	Some gossip work	2022-09-29 12:35:53 -04:00
realbigsean	ebc0ccd02a	some more sync boilerplate	2022-09-29 12:34:09 -04:00
realbigsean	4008da6c60	sync tx blobs	2022-09-29 12:32:55 -04:00
realbigsean	4cdf1b546d	add shanghai fork version and epoch	2022-09-29 12:28:58 -04:00
realbigsean	de44b300c0	add/update types	2022-09-29 12:25:56 -04:00
Age Manning	27bb9ff07d	Handle Lodestar's new agent string (#3620 ) ## Issue Addressed #3561 ## Proposed Changes Recognize Lodestar's new agent string and appropriately count these peers as Lodestar peers.	2022-09-29 01:50:13 +00:00
Age Manning	01b6bf7a2d	Improve logging a little (#3619 ) Some of the logs in combination with others could be improved. It will save some time debugging by improving the wording slightly.	2022-09-29 01:50:12 +00:00
Divma	b1d2510d1b	Libp2p v0.48.0 upgrade (#3547 ) ## Issue Addressed Upgrades libp2p to v0.48.0. This is the compilation of - [x] #3495 - [x] #3497 - [x] #3491 - [x] #3546 - [x] #3553 Co-authored-by: Age Manning <Age@AgeManning.com>	2022-09-29 01:50:11 +00:00
Paul Hauner	01e84b71f5	v3.1.2 (#3603 ) ## Issue Addressed NA ## Proposed Changes Bump versions to v3.1.2 ## Additional Info - ~~Blocked on several PRs.~~ - ~~Requires further testing.~~	2022-09-26 01:17:36 +00:00
Divma	bd873e7162	New rust lints for rustc 1.64.0 (#3602 ) ## Issue Addressed fixes lints from the last rust release ## Proposed Changes Fix the lints, most of the lints by `clippy::question-mark` are false positives in the form of https://github.com/rust-lang/rust-clippy/issues/9518 so it's allowed for now ## Additional Info	2022-09-23 03:52:46 +00:00
Divma	9bd384a573	send attnet unsubscription event on random subnet expiry (#3600 ) ## Issue Addressed 🐞 in which we don't actually unsubscribe from a random long lived subnet when it expires ## Proposed Changes Remove code addressing a specific case in which we are subscribed to all subnets and handle the removal of the long lived subnet. I don't think the special case code is particularly important as, if someone is running with that many validators to be subscribed to all subnets, it should use `--subscribe-all-subnets` instead ## Additional Info Noticed on some test nodes climbing bandwidth usage periodically (around 27hours, the time of subnet expirations) I'm running this code to test this does not happen anymore, but I think it should be good now	2022-09-23 03:52:45 +00:00
Paul Hauner	9246a92d76	Make garbage collection test less failure prone (#3599 ) ## Issue Addressed NA ## Proposed Changes This PR attempts to fix the following spurious CI failure: ``` ---- store_tests::garbage_collect_temp_states_from_failed_block stdout ---- thread 'store_tests::garbage_collect_temp_states_from_failed_block' panicked at 'disk store should initialize: DBError { message: "Error { message: \"IO error: lock /tmp/.tmp6DcBQ9/cold_db/LOCK: already held by process\" }" }', beacon_node/beacon_chain/tests/store_tests.rs:59:10 ``` I believe that some async task is taking a clone of the store and holding it in some other thread for a short time. This creates a race-condition when we try to open a new instance of the store. ## Additional Info NA	2022-09-23 03:52:44 +00:00
Paul Hauner	fa6ad1a11a	Deduplicate block root computation (#3590 ) ## Issue Addressed NA ## Proposed Changes This PR removes duplicated block root computation. Computing the `SignedBeaconBlock::canonical_root` has become more expensive since the merge as we need to compute the merkle root of each transaction inside an `ExecutionPayload`. Computing the root for [a mainnet block](https://beaconcha.in/slot/4704236) is taking ~10ms on my i7-8700K CPU @ 3.70GHz (no sha extensions). Given that our median seen-to-imported time for blocks is presently 300-400ms, removing a few duplicated block roots (~30ms) could represent an easy 10% improvement. When we consider that the seen-to-imported times include operations after the block has been placed in the early attester cache, we could expect the 30ms to be more significant WRT our seen-to-attestable times. ## Additional Info NA	2022-09-23 03:52:42 +00:00
Paul Hauner	3128b5b430	v3.1.1 (#3585 ) ## Issue Addressed NA ## Proposed Changes Bump versions ## Additional Info - ~~Requires additional testing~~ - ~~Blocked on:~~ - ~~#3589~~ - ~~#3540~~ - ~~#3587~~	2022-09-22 06:08:52 +00:00
Paul Hauner	96692b8e43	Impl `oneshot_broadcast` for committee promises (#3595 ) ## Issue Addressed NA ## Proposed Changes Fixes an issue introduced in #3574 where I erroneously assumed that a `crossbeam_channel` multiple receiver queue was a broadcast queue. This is incorrect, each message will be received by only one receiver. The effect of this mistake is these logs: ``` Sep 20 06:56:17.001 INFO Synced slot: 4736079, block: 0xaa8a…180d, epoch: 148002, finalized_epoch: 148000, finalized_root: 0x2775…47f2, exec_hash: 0x2ca5…ffde (verified), peers: 6, service: slot_notifier Sep 20 06:56:23.237 ERRO Unable to validate attestation error: CommitteeCacheWait(RecvError), peer_id: 16Uiu2HAm2Jnnj8868tb7hCta1rmkXUf5YjqUH1YPj35DCwNyeEzs, type: "aggregated", slot: Slot(4736047), beacon_block_root: 0x88d318534b1010e0ebd79aed60b6b6da1d70357d72b271c01adf55c2b46206c1 ``` ## Additional Info NA	2022-09-21 01:01:50 +00:00
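
The distinction described in the message above (a multi-receiver queue hands each message to one receiver, whereas a broadcast delivers it to all of them) can be illustrated with a minimal one-shot broadcast built on the standard library. This is a hypothetical sketch, not Lighthouse's `oneshot_broadcast` implementation: the `Sender`, `Receiver`, and `State` types here are assumptions made for illustration.

```rust
use std::sync::{Arc, Condvar, Mutex};
use std::thread;

/// State shared by the single sender and every receiver.
enum State<T> {
    Pending,
    Ready(T),
    SenderDropped,
}

struct Shared<T> {
    state: Mutex<State<T>>,
    cond: Condvar,
}

pub struct Sender<T>(Arc<Shared<T>>);
pub struct Receiver<T>(Arc<Shared<T>>);

impl<T> Clone for Receiver<T> {
    fn clone(&self) -> Self {
        Receiver(self.0.clone())
    }
}

pub fn oneshot_broadcast<T>() -> (Sender<T>, Receiver<T>) {
    let shared = Arc::new(Shared {
        state: Mutex::new(State::Pending),
        cond: Condvar::new(),
    });
    (Sender(shared.clone()), Receiver(shared))
}

impl<T> Sender<T> {
    /// Consumes the sender, so a value can be sent at most once.
    pub fn send(self, value: T) {
        *self.0.state.lock().unwrap() = State::Ready(value);
        self.0.cond.notify_all();
    }
}

impl<T> Drop for Sender<T> {
    /// If the sender goes away without sending, wake receivers with an error state.
    fn drop(&mut self) {
        let mut state = self.0.state.lock().unwrap();
        if matches!(*state, State::Pending) {
            *state = State::SenderDropped;
        }
        drop(state);
        self.0.cond.notify_all();
    }
}

impl<T: Clone> Receiver<T> {
    /// Blocks until a value is sent. Every receiver gets its own clone of the
    /// value; returns `None` if the sender was dropped without sending.
    pub fn recv(&self) -> Option<T> {
        let mut state = self.0.state.lock().unwrap();
        loop {
            match &*state {
                State::Ready(value) => return Some(value.clone()),
                State::SenderDropped => return None,
                State::Pending => {}
            }
            state = self.0.cond.wait(state).unwrap();
        }
    }
}

fn main() {
    let (tx, rx) = oneshot_broadcast::<u64>();
    let waiters: Vec<_> = (0..3)
        .map(|_| {
            let rx = rx.clone();
            thread::spawn(move || rx.recv())
        })
        .collect();
    tx.send(42);
    for waiter in waiters {
        // All three receivers observe the same value.
        assert_eq!(waiter.join().unwrap(), Some(42));
    }
}
```

Because the value stays in the shared state after it is sent, a receiver that calls `recv` late still sees it, which is the property a single-consumer queue does not provide.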
Paul Hauner	a95bcba2ab	Avoid holding write-lock whilst waiting on shuffling cache promise (#3589 ) ## Issue Addressed NA ## Proposed Changes Fixes a bug which hogged the write-lock for the `shuffling_cache`. ## Additional Info NA	2022-09-19 07:58:50 +00:00
Michael Sproul	507bb9dad4	Refined payload pruning (#3587 ) ## Proposed Changes Improve the payload pruning feature in several ways: - Payload pruning is now entirely optional. It is enabled by default but can be disabled with `--prune-payloads false`. The previous `--prune-payloads-on-startup` flag from #3565 is removed. - Initial payload pruning on startup now runs in a background thread. This thread will always load the split state, which is a small fraction of its total work (up to ~300ms) and then backtrack from that state. This pruning process ran in 2m5s on one Prater node with good I/O and 16m on a node with slower I/O. - To work with the optional payload pruning the database function `try_load_full_block` will now attempt to load execution payloads for finalized slots _if_ pruning is currently disabled. This gives users an opt-out for the extensive traffic between the CL and EL for reconstructing payloads. ## Additional Info If the `prune-payloads` flag is toggled on and off then the on-startup check may not see any payloads to delete and fail to clean them up. In this case the `lighthouse db prune_payloads` command should be used to force a manual sweep of the database.	2022-09-19 07:58:49 +00:00
Michael Sproul	f2ac0738d8	Implement `skip_randao_verification` and blinded block rewards API (#3540 ) ## Issue Addressed https://github.com/ethereum/beacon-APIs/pull/222 ## Proposed Changes Update Lighthouse's randao verification API to match the `beacon-APIs` spec. We implemented the API before spec stabilisation, and it changed slightly in the course of review. Rather than a flag `verify_randao` taking a boolean value, the new API uses a `skip_randao_verification` flag which takes no argument. The new spec also requires the randao reveal to be present and equal to the point-at-infinity when `skip_randao_verification` is set. I've also updated the `POST /lighthouse/analysis/block_rewards` API to take blinded blocks as input, as the execution payload is irrelevant and we may want to assess blocks produced by builders. ## Additional Info This is technically a breaking change, but seeing as I suspect I'm the only one using these parameters/APIs, I think we're OK to include this in a patch release.	2022-09-19 07:58:48 +00:00
Marius van der Wijden	6f7d21c542	enable 4844 at epoch 3	2022-09-18 12:13:03 +02:00
Marius van der Wijden	285dbf43ed	hacky hacks	2022-09-18 11:34:46 +02:00
Marius van der Wijden	8b71b978e0	new round of hacks (config etc)	2022-09-17 23:42:49 +02:00
Daniel Knopik	750c594f5f	forgor something	2022-09-17 21:38:57 +02:00
Daniel Knopik	eab1fce0e5	Merge branch 'eip4844' of github.com:dknopik/lighthouse into eip4844	2022-09-17 20:55:36 +02:00
Daniel Knopik	76572db9d5	add network config	2022-09-17 20:55:21 +02:00
Marius van der Wijden	f43532d3de	implement handle blobs by range req	2022-09-17 20:05:51 +02:00
Marius van der Wijden	f9209e2d08	more network stuff	2022-09-17 16:39:40 +02:00
Marius van der Wijden	aeb52ff186	network stuff	2022-09-17 16:10:42 +02:00
Daniel Knopik	d4d40be870	storable blobs	2022-09-17 15:58:52 +02:00
Marius van der Wijden	36a0add0cd	network stuff	2022-09-17 15:23:28 +02:00
Daniel Knopik	0518665949	Merge remote-tracking branch 'fork/eip4844' into eip4844	2022-09-17 14:58:33 +02:00
Daniel Knopik	292a16a6eb	gossip boilerplate	2022-09-17 14:58:27 +02:00
Marius van der Wijden	acace8ab31	network: blobs by range message	2022-09-17 14:55:18 +02:00
Daniel Knopik	bcc738cb9d	progress on gossip stuff	2022-09-17 14:31:57 +02:00
Marius van der Wijden	8473f08d10	beacon: consensus: implement engine api getBlobs	2022-09-17 14:10:15 +02:00
Daniel Knopik	dcfae6c5cf	implement From<FullPayload> for Payload	2022-09-17 13:29:20 +02:00
Marius van der Wijden	fe6be28e6b	beacon: consensus: implement engine api getBlobs	2022-09-17 13:20:18 +02:00
Daniel Knopik	ca1e17b386	it compiles!	2022-09-17 12:23:03 +02:00
Daniel Knopik	95203c51d4	fix some bugs, adjust structs	2022-09-17 11:26:18 +02:00
Michael Sproul	ca42ef2e5a	Prune finalized execution payloads (#3565 ) ## Issue Addressed Closes https://github.com/sigp/lighthouse/issues/3556 ## Proposed Changes Delete finalized execution payloads from the database in two places: 1. When running the finalization migration in `migrate_database`. We delete the finalized payloads between the last split point and the new updated split point. _If_ payloads are already pruned prior to this then this is sufficient to prune _all_ payloads as non-canonical payloads are already deleted by the head pruner, and all canonical payloads prior to the previous split will already have been pruned. 2. To address the fact that users will update to this code _after_ the merge on mainnet (and testnets), we need a one-off scan to delete the finalized payloads from the canonical chain. This is implemented in `try_prune_execution_payloads` which runs on startup and scans the chain back to the Bellatrix fork or the anchor slot (if checkpoint synced after Bellatrix). In the case where payloads are already pruned this check only imposes a single state load for the split state, which shouldn't be _too slow_. Even so, a flag `--prune-payloads-on-startup=false` is provided to turn this off after it has run the first time, which provides faster start-up times. There is also a new `lighthouse db prune_payloads` subcommand for users who prefer to run the pruning manually. ## Additional Info The tests have been updated to not rely on finalized payloads in the database, instead using the `MockExecutionLayer` to reconstruct them. Additionally a check was added to `check_chain_dump` which asserts the non-existence or existence of payloads on disk depending on their slot.	2022-09-17 02:27:01 +00:00
Paul Hauner	2cd3e3a768	Avoid duplicate committee cache loads (#3574 ) ## Issue Addressed NA ## Proposed Changes I have observed scenarios on Goerli where Lighthouse was receiving attestations which reference the same, un-cached shuffling on multiple threads at the same time. Lighthouse was then loading the same state from database and determining the shuffling on multiple threads at the same time. This is unnecessary load on the disk and RAM. This PR modifies the shuffling cache so that each entry can be either: - A committee - A promise for a committee (i.e., a `crossbeam_channel::Receiver`) Now, in the scenario where we have thread A and thread B simultaneously requesting the same un-cached shuffling, we will have the following: 1. Thread A will take the write-lock on the shuffling cache, find that there's no cached committee and then create a "promise" (a `crossbeam_channel::Sender`) for a committee before dropping the write-lock. 1. Thread B will then be allowed to take the write-lock for the shuffling cache and find the promise created by thread A. It will block the current thread waiting for thread A to fulfill that promise. 1. Thread A will load the state from disk, obtain the shuffling, send it down the channel, insert the entry into the cache and then continue to verify the attestation. 1. Thread B will then receive the shuffling from the receiver, be un-blocked and then continue to verify the attestation. In the case where thread A fails to generate the shuffling and drops the sender, the next time that specific shuffling is requested we will detect that the channel is disconnected and return a `None` entry for that shuffling. This will cause the shuffling to be re-calculated. ## Additional Info NA	2022-09-16 08:54:03 +00:00
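
The thread A / thread B flow described in the message above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the Lighthouse code: `ShufflingId`, `Committee`, `Promise`, and `get_or_compute` are hypothetical stand-ins, the promise is built from a `Mutex` and `Condvar` instead of a `crossbeam_channel`, and the failure path (thread A dropping the promise without fulfilling it) is left out for brevity.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Condvar, Mutex, RwLock};
use std::thread;

type ShufflingId = u64; // hypothetical stand-in for a shuffling ID
type Committee = Arc<Vec<u64>>; // hypothetical stand-in for a committee cache

/// A slot that is filled once and can be awaited by any number of threads.
#[derive(Default)]
struct Promise {
    value: Mutex<Option<Committee>>,
    ready: Condvar,
}

impl Promise {
    fn fulfil(&self, committee: Committee) {
        *self.value.lock().unwrap() = Some(committee);
        self.ready.notify_all();
    }

    fn wait(&self) -> Committee {
        let mut guard = self.value.lock().unwrap();
        loop {
            if let Some(committee) = &*guard {
                return committee.clone();
            }
            guard = self.ready.wait(guard).unwrap();
        }
    }
}

/// Each cache entry is either a finished committee or a promise for one.
enum CacheItem {
    Committee(Committee),
    Promise(Arc<Promise>),
}

#[derive(Default)]
struct ShufflingCache {
    items: RwLock<HashMap<ShufflingId, CacheItem>>,
}

impl ShufflingCache {
    fn get_or_compute(&self, id: ShufflingId, compute: impl FnOnce() -> Committee) -> Committee {
        let mut items = self.items.write().unwrap();
        let (promise, is_owner) = match items.get(&id) {
            Some(CacheItem::Committee(committee)) => return committee.clone(),
            // Another thread is already computing this shuffling.
            Some(CacheItem::Promise(promise)) => (promise.clone(), false),
            // First requester: register a promise for the others to find.
            None => {
                let promise = Arc::new(Promise::default());
                items.insert(id, CacheItem::Promise(promise.clone()));
                (promise, true)
            }
        };
        // Release the write-lock before blocking or doing the expensive work.
        drop(items);

        if !is_owner {
            return promise.wait();
        }
        let committee = compute();
        promise.fulfil(committee.clone());
        self.items
            .write()
            .unwrap()
            .insert(id, CacheItem::Committee(committee.clone()));
        committee
    }
}

fn main() {
    let cache = Arc::new(ShufflingCache::default());
    let handles: Vec<_> = (0..4)
        .map(|_| {
            let cache = cache.clone();
            thread::spawn(move || cache.get_or_compute(1, || Arc::new(vec![1, 2, 3])))
        })
        .collect();
    for handle in handles {
        assert_eq!(*handle.join().unwrap(), vec![1, 2, 3]);
    }
}
```

Dropping the write-lock before waiting mirrors the later fix in #3589, which addressed the lock being held while waiting on a promise.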
Paul Hauner	7d3948c8fe	Add metric for re-org distance (#3566 ) ## Issue Addressed NA ## Proposed Changes Add a metric to track the re-org distance. ## Additional Info NA	2022-09-13 17:19:27 +00:00
tim gretler	98815516a1	Support histogram buckets (#3391 ) ## Issue Addressed #3285 ## Proposed Changes Adds support for specifying histogram with buckets and adds new metric buckets for metrics mentioned in issue. ## Additional Info Need some help for the buckets. Co-authored-by: Michael Sproul <micsproul@gmail.com>	2022-09-13 01:57:44 +00:00
Nils Effinghausen	f682df51a1	fix description for BALANCES_CACHE_MISSES metric (#3545 ) ## Issue Addressed fixes metric description Co-authored-by: Nils Effinghausen <nils.effinghausen@t-systems.com>	2022-09-10 01:35:10 +00:00
realbigsean	d1a8d6cf91	Pin mev rs deps (#3557 ) ## Issue Addressed We were unable to update lighthouse by running `cargo update` because some of the `mev-build-rs` deps weren't pinned. But `mev-build-rs` is now pinned here and includes its own pinned commits for `ssz-rs` and `ethereum-consensus`. Co-authored-by: realbigsean <sean@sigmaprime.io>	2022-09-08 23:46:03 +00:00
Michael Sproul	9a7f7f1c1e	Configurable monitoring endpoint frequency (#3530 ) ## Issue Addressed Closes #3514 ## Proposed Changes - Change default monitoring endpoint frequency to 120 seconds to fit with 30k requests/month limit. - Allow configuration of the monitoring endpoint frequency using `--monitoring-endpoint-frequency N` where `N` is a value in seconds.	2022-09-05 08:29:00 +00:00
realbigsean	177aef8f1e	Builder profit threshold flag (#3534 ) ## Issue Addressed Resolves https://github.com/sigp/lighthouse/issues/3517 ## Proposed Changes Adds a `--builder-profit-threshold <wei value>` flag to the BN. If an external payload's value field is less than this value, the local payload will be used. The value of the local payload will not be checked (it can't really be checked until the engine API is updated to support this). Co-authored-by: realbigsean <sean@sigmaprime.io>	2022-09-05 04:50:49 +00:00
realbigsean	cae40731a2	Strict count unrealized (#3522 ) ## Issue Addressed Add a flag that can increase count unrealized strictness, defaults to false ## Proposed Changes Please list or describe the changes introduced by this PR. ## Additional Info Please provide any additional information. For example, future considerations or information useful for reviewers. Co-authored-by: realbigsean <seananderson33@gmail.com> Co-authored-by: sean <seananderson33@gmail.com>	2022-09-05 04:50:47 +00:00
Mac L	80359d8ddb	Fix attestation performance API `InvalidValidatorIndex` error (#3503 ) ## Issue Addressed When requesting an index which is not active during `start_epoch`, Lighthouse returns: ``` curl "http://localhost:5052/lighthouse/analysis/attestation_performance/999999999?start_epoch=100000&end_epoch=100000" ``` ```json { "code": 500, "message": "INTERNAL_SERVER_ERROR: ParticipationCache(InvalidValidatorIndex(999999999))", "stacktraces": [] } ``` This error occurs even when the index in question becomes active before `end_epoch` which is undesirable as it can prevent larger queries from completing. ## Proposed Changes In the event the index is out-of-bounds (has not yet been activated), simply return all fields as `false`: ``` -> curl "http://localhost:5052/lighthouse/analysis/attestation_performance/999999999?start_epoch=100000&end_epoch=100000" ``` ```json [ { "index": 999999999, "epochs": { "100000": { "active": false, "head": false, "target": false, "source": false } } } ] ``` By doing this, we cover the case where a validator becomes active sometime between `start_epoch` and `end_epoch`. ## Additional Info Note that this error only occurs for epochs after the Altair hard fork.	2022-09-05 04:50:45 +00:00
Divma	473abc14ca	Subscribe to subnets only when needed (#3419 ) ## Issue Addressed We currently subscribe to attestation subnets as soon as the subscription arrives (one epoch in advance), this makes it so that subscriptions for future slots are scheduled instead of done immediately. ## Proposed Changes - Schedule subscriptions to subnets for future slots. - Finish removing hashmap_delay, in favor of [delay_map](https://github.com/AgeManning/delay_map). This was the only remaining service to do this. - Subscriptions for past slots are rejected, before we would subscribe for one slot. - Add a new test for subscriptions that are not consecutive. ## Additional Info This is also an effort in making the code easier to understand	2022-09-05 00:22:48 +00:00
Paul Hauner	aa022f4685	v3.1.0 (#3525 ) ## Issue Addressed NA ## Proposed Changes - Bump versions ## Additional Info - ~~Blocked on #3508~~ - ~~Blocked on #3526~~ - ~~Requires additional testing.~~ - Expected release date is 2022-09-01	2022-08-31 22:21:55 +00:00
Paul Hauner	661307dce1	Separate committee subscriptions queue (#3508 ) ## Issue Addressed NA ## Proposed Changes As we've seen on Prater, there seems to be a correlation between these messages ``` WARN Not enough time for a discovery search subnet_id: ExactSubnet { subnet_id: SubnetId(19), slot: Slot(3742336) }, service: attestation_service ``` ... and nodes falling 20-30 slots behind the head for short periods. These nodes are running ~20k Prater validators. After running some metrics, I can see that the `network_recv` channel is processing ~250k `AttestationSubscribe` messages per minute. It occurred to me that perhaps the `AttestationSubscribe` messages are "washing out" the `SendRequest` and `SendResponse` messages. In this PR I separate the `AttestationSubscribe` and `SyncCommitteeSubscribe` messages into their own queue so the `tokio::select!` in the `NetworkService` can still process the other messages in the `network_recv` channel without necessarily having to clear all the subscription messages first. ~~I've also added filter to the HTTP API to prevent duplicate subscriptions going to the network service.~~ ## Additional Info - Currently being tested on Prater	2022-08-30 05:47:31 +00:00
Michael Sproul	7a50684741	Harden slot notifier against clock drift (#3519 ) ## Issue Addressed Partly resolves #3518 ## Proposed Changes Change the slot notifier to use `duration_to_next_slot` rather than an interval timer. This makes it robust against underlying clock changes.	2022-08-29 14:34:43 +00:00
Paul Hauner	1a833ecc17	Add more logging for invalid payloads (#3515 ) ## Issue Addressed NA ## Proposed Changes Adds more `debug` logging to help troubleshoot invalid execution payload blocks. I was doing some of this recently and found it to be challenging. With this PR we should be able to grep `Invalid execution payload` and get one-liners that will show the block, slot and details about the proposer. I also changed the log in `process_invalid_execution_payload` since it was a little misleading; the `block_root` wasn't necessarily the block which had an invalid payload. ## Additional Info NA	2022-08-29 14:34:42 +00:00
Paul Hauner	8609cced0e	Reset payload statuses when resuming fork choice (#3498 ) ## Issue Addressed NA ## Proposed Changes This PR is motivated by a recent consensus failure in Geth where it returned `INVALID` for a `VALID` block. Without this PR, the only way to recover is by re-syncing Lighthouse. Whilst ELs "shouldn't have consensus failures", in reality it's something that we can expect from time to time due to the complex nature of Ethereum. Being able to recover easily will help the network recover and EL devs to troubleshoot. The risk introduced with this PR is that genuinely INVALID payloads get a "second chance" at being imported. I believe the DoS risk here is negligible since LH needs to be restarted in order to re-process the payload. Furthermore, there's no reason to think that a well-performing EL will accept a truly invalid payload the second-time-around. ## Additional Info This implementation has the following intricacies: 1. Instead of just resetting invalid payloads to optimistic, we'll also reset valid payloads. This is an artifact of our existing implementation. 1. We will only reset payload statuses when we detect an invalid payload present in `proto_array` - This helps save us from forgetting that all our blocks are valid in the "best case scenario" where there are no invalid blocks. 1. If we fail to revert the payload statuses we'll log a `CRIT` and just continue with a `proto_array` that does not have reverted payload statuses. - The code to revert statuses needs to deal with balances and proposer-boost, so it's a failure point. This is a defensive measure to avoid introducing new show-stopping bugs to LH.	2022-08-29 14:34:41 +00:00
Michael Sproul	66eca1a882	Refactor op pool for speed and correctness (#3312 ) ## Proposed Changes This PR has two aims: to speed up attestation packing in the op pool, and to fix bugs in the verification of attester slashings, proposer slashings and voluntary exits. The changes are bundled into a single database schema upgrade (v12). Attestation packing is sped up by removing several inefficiencies: - No more recalculation of `attesting_indices` during packing. - No (unnecessary) examination of the `ParticipationFlags`: a bitfield suffices. See `RewardCache`. - No re-checking of attestation validity during packing: the `AttestationMap` provides attestations which are "correct by construction" (I have checked this using Hydra). - No SSZ re-serialization for the clunky `AttestationId` type (it can be removed in a future release). So far the speed-up seems to be roughly 2-10x, from 500ms down to 50-100ms. Verification of attester slashings, proposer slashings and voluntary exits is fixed by: - Tracking the `ForkVersion`s that were used to verify each message inside the `SigVerifiedOp`. This allows us to quickly re-verify that they match the head state's opinion of what the `ForkVersion` should be at the epoch(s) relevant to the message. - Storing the `SigVerifiedOp` on disk rather than the raw operation. This allows us to continue tracking the fork versions after a reboot. This is mostly contained in this commit 52bb1840ae5c4356a8fc3a51e5df23ed65ed2c7f. ## Additional Info The schema upgrade uses the justified state to re-verify attestations and compute `attesting_indices` for them. It will drop any attestations that fail to verify, by the logic that attestations are most valuable in the few slots after they're observed, and are probably stale and useless by the time a node restarts. Exits and proposer slashings are similarly re-verified to obtain `SigVerifiedOp`s. 
This PR contains a runtime killswitch `--paranoid-block-proposal` which opts out of all the optimisations in favour of closely verifying every included message. Although I'm quite sure that the optimisations are correct this flag could be useful in the event of an unforeseen emergency. Finally, you might notice that the `RewardCache` appears quite useless in its current form because it is only updated on the hot-path immediately before proposal. My hope is that in future we can shift calls to `RewardCache::update` into the background, e.g. while performing the state advance. It is also forward-looking to `tree-states` compatibility, where iterating and indexing `state.{previous,current}_epoch_participation` is expensive and needs to be minimised.	2022-08-29 09:10:26 +00:00
realbigsean	cb132c622d	don't register exited or slashed validators with the builder api (#3473 ) ## Issue Addressed #3465 ## Proposed Changes Filter out any validator registrations for validators that are not `active` or `pending`. I'm adding this filtering to the beacon node because all the information is readily available there. In other parts of the VC we are usually sending per-validator requests based on duties from the BN. And duties will only be provided for active validators so we don't have this type of filtering elsewhere in the VC. Co-authored-by: realbigsean <sean@sigmaprime.io>	2022-08-24 23:34:58 +00:00
Divma	8c69d57c2c	Pause sync when EE is offline (#3428 ) ## Issue Addressed #3032 ## Proposed Changes Pause sync when ee is offline. Changes include three main parts: - Online/offline notification system - Pause sync - Resume sync #### Online/offline notification system - The engine state is now guarded behind a new struct `State` that ensures every change is correctly notified. Notifications are only sent if the state changes. The new `State` is behind a `RwLock` (as before) as the synchronization mechanism. - The actual notification channel is a [tokio::sync::watch](https://docs.rs/tokio/latest/tokio/sync/watch/index.html) which ensures only the last value is in the receiver channel. This way we don't need to worry about message order etc. - Sync waits for state changes concurrently with normal messages. #### Pause Sync Sync has four components, pausing is done differently in each: - Block lookups: Disabled while in this state. We drop current requests and don't search for new blocks. Block lookups are infrequent and I don't think it's worth the extra logic of keeping these and delaying processing. If we later see that this is required, we can add it. - Parent lookups: Disabled while in this state. We drop current requests and don't search for new parents. Parent lookups are even less frequent and I don't think it's worth the extra logic of keeping these and delaying processing. If we later see that this is required, we can add it. - Range: Chains don't send batches for processing to the beacon processor. This is easily done by guarding the channel to the beacon processor and giving it access only if the ee is responsive. I find this the simplest and most powerful approach since we don't need to deal with new sync states and chain segments that are added while the ee is offline will follow the same logic without needing to synchronize a shared state among those. 
Another advantage of passive pause vs active pause is that we can still keep track of active advertised chain segments so that on resume we don't need to re-evaluate all our peers. - Backfill: Not affected by ee states, we don't pause. #### Resume Sync - Block lookups: Enabled again. - Parent lookups: Enabled again. - Range: Active resume. Since the only real pause range does is not sending batches for processing, resume makes all chains that are holding ready-for-processing batches send them. - Backfill: Not affected by ee states, no need to resume. ## Additional Info QUESTION: Originally I made this to notify and change on synced state, but @pawanjay176 in talks with @paulhauner concluded we only need to check online/offline states. The upcheck function mentions extra checks to have a very up to date sync status to aid the networking stack. However, the only need the networking stack would have is this one. I added a TODO to review if the extra check can be removed. Next gen of #3094. Will work best with #3439. Co-authored-by: Pawan Dhananjay <pawandhananjay@gmail.com>	2022-08-24 23:34:56 +00:00
Michael Sproul	aab4a8d2f2	Update docs for mainnet merge release (#3494 ) ## Proposed Changes Update the merge migration docs to encourage updating mainnet configs _now_! The docs are also updated to recommend _against_ `--suggested-fee-recipient` on the beacon node (https://github.com/sigp/lighthouse/issues/3432). Additionally the `--help` for the CLI is updated to match with a few small semantic changes: - `--execution-jwt` is no longer allowed without `--execution-endpoint`. We've ended up without a default for `--execution-endpoint`, so I think that's fine. - The flags related to the JWT are only allowed if `--execution-jwt` is provided.	2022-08-23 03:50:58 +00:00
Paul Hauner	18c61a5e8b	v3.0.0 (#3464 ) ## Issue Addressed NA ## Proposed Changes Bump versions to v3.0.0 ## Additional Info - ~~Blocked on #3439~~ - ~~Blocked on #3459~~ - ~~Blocked on #3463~~ - ~~Blocked on #3462~~ - ~~Requires further testing~~ Co-authored-by: Michael Sproul <michael@sigmaprime.io>	2022-08-22 03:43:08 +00:00
Paul Hauner	931153885c	Run per-slot fork choice at a further distance from the head (#3487 ) ## Issue Addressed NA ## Proposed Changes Run fork choice when the head is 256 slots from the wall-clock slot, rather than 4. The reason we don't always run FC is so that it doesn't slow us down during sync. As the comments state, setting the value to 256 means that we'd only have one interrupting fork-choice call if we were syncing at 20 slots/sec. ## Additional Info NA	2022-08-19 04:27:24 +00:00
Paul Hauner	df358b864d	Add metrics for EE `PayloadStatus` returns (#3486 ) ## Issue Addressed NA ## Proposed Changes Adds some metrics so we can track payload status responses from the EE. I think this will be useful for troubleshooting and alerting. I also bumped the `BeaconChain::per_slot_task` to `debug` since it doesn't seem too noisy and would have helped us with some things we were debugging in the past. ## Additional Info NA	2022-08-19 04:27:23 +00:00
Paul Hauner	043fa2153e	Revise EE peer penalties (#3485 ) ## Issue Addressed NA ## Proposed Changes Don't penalize peers for errors that might be caused by an honest optimistic node. ## Additional Info NA	2022-08-19 04:27:22 +00:00
Michael Sproul	8255c8682e	Align engine API timeouts with spec (#3470 ) ## Proposed Changes Match the timeouts from the `execution-apis` spec. Our existing values were already quite close so I don't imagine this change to be very disruptive. The spec sets the timeout for `engine_getPayloadV1` to only 1 second, but we were already using a longer value of 2 seconds. I've kept the 2 second timeout as I don't think there's any need to fail faster when producing a payload. There's no timeout specified for `eth_syncing` so I've matched it to the shortest timeout from the spec (1 second). I think the previous value of 250ms was likely too low and could have been contributing to spurious timeouts, particularly for remote ELs. ## Additional Info The timeouts are defined on each endpoint in this document: https://github.com/ethereum/execution-apis/blob/main/src/engine/specification.md	2022-08-17 02:36:39 +00:00
Michael Sproul	e5fc9f26bc	Log if no execution endpoint is configured (#3467 ) ## Issue Addressed Fixes an issue whereby syncing a post-merge network without an execution endpoint would silently stall. Sync swallows the errors from block verification so previously there was no indication in the logs for why the node couldn't sync. ## Proposed Changes Add an error log to the merge-readiness notifier for the case where the merge has already completed but no execution endpoint is configured.	2022-08-15 01:31:02 +00:00
Michael Sproul	25e3dc9300	Fix block verification and checkpoint sync caches (#3466 ) ## Issue Addressed Closes https://github.com/sigp/lighthouse/issues/2962 ## Proposed Changes Build all caches on the checkpoint state before storing it in the database. Additionally, fix a bug in `signature_verify_chain_segment` which prevented block verification from succeeding unless the previous epoch cache was already built. The previous epoch cache is required to verify the signatures of attestations included from previous epochs, even when all the blocks in the segment are from the same epoch. The comments around `signature_verify_chain_segment` have also been updated to reflect the fact that it should only be used on a chain of blocks from a single epoch. I believe this restriction had already been added at some point in the past and that the current comments were just outdated (and I think because the proposer shuffling can change in the next epoch based on the blocks applied in the current epoch that this limitation is essential).	2022-08-15 01:31:00 +00:00
Paul Hauner	f03f9ba680	Increase merge-readiness lookahead (#3463 ) ## Issue Addressed NA ## Proposed Changes Start issuing merge-readiness logs 2 weeks before the Bellatrix fork epoch. Additionally, if the Bellatrix epoch is specified and the user has configured an EL, always log merge readiness; this should benefit pro-active users. ### Lookahead Reasoning - Bellatrix fork is: - epoch 144896 - slot 4636672 - Unix timestamp: `1606824023 + (4636672 * 12) = 1662464087` - GMT: Tue Sep 06 2022 11:34:47 GMT+0000 - Warning start time is: - Unix timestamp: `1662464087 - 604800 * 2 = 1661254487` - GMT: Tue Aug 23 2022 11:34:47 GMT+0000 The [current expectation](https://discord.com/channels/595666850260713488/745077610685661265/1007445305198911569) is that EL and CL clients will have releases out by Aug 22nd at the latest, then an EF announcement will go out on the 23rd. If all goes well, LH will start alerting users about merge-readiness just after the announcement. ## Additional Info NA	2022-08-15 01:30:59 +00:00
Michael Sproul	92d597ad23	Modularise slasher backend (#3443 ) ## Proposed Changes Enable multiple database backends for the slasher, either MDBX (default) or LMDB. The backend can be selected using `--slasher-backend={lmdb,mdbx}`. ## Additional Info In order to abstract over the two libraries' different handling of database lifetimes I've used `Box::leak` to give the `Environment` type a `'static` lifetime. This was the only way I could think of using 100% safe code to construct a self-referential struct `SlasherDB`, where the `OpenDatabases` refers to the `Environment`. I think this is OK, as the `Environment` is expected to live for the life of the program, and both database engines leave the database in a consistent state after each write. The memory claimed for memory-mapping will be freed by the OS and appropriately flushed regardless of whether the `Environment` is actually dropped. We are depending on two `sigp` forks of `libmdbx-rs` and `lmdb-rs`, to give us greater control over MDBX OS support and LMDB's version.	2022-08-15 01:30:56 +00:00
Pawan Dhananjay	71fd0b42f2	Fix lints for Rust 1.63 (#3459 ) ## Issue Addressed N/A ## Proposed Changes Fix clippy lints for latest rust version 1.63. I have allowed the [derive_partial_eq_without_eq](https://rust-lang.github.io/rust-clippy/master/index.html#derive_partial_eq_without_eq) lint as satisfying this lint would result in more code that we might not want and I feel it's not required. Happy to fix this lint across lighthouse if required though.	2022-08-12 00:56:39 +00:00
Divma	f4ffa9e0b4	Handle processing results of non faulty batches (#3439 ) ## Issue Addressed Solves #3390 So after checking some logs @pawanjay176 got, we conclude that this happened because we blacklisted a chain after trying it "too much". Now here, in all occurrences it seems that "too much" means we got too many download failures. This happened very slowly, exactly because the batch is allowed to stay alive for very long times after not counting penalties when the ee is offline. The error here then was not that the batch failed because of offline ee errors, but that we blacklisted a chain because of download errors, which we can't pin on the chain but on the peer. This PR fixes that. ## Proposed Changes Adds a missing piece of logic so that if a chain fails for errors that can't be attributed to an objectively bad behavior from the peer, it is not blacklisted. The issue at hand occurred when new peers arrived claiming a head that had been wrongfully blacklisted, even if the original peers participating in the chain were not penalized. Another notable change is that we need to consider a batch invalid if it processed correctly but its next non empty batch fails processing. Now since a batch can fail processing in non empty ways, there is no need to mark as invalid previous batches. Improves some logging as well. ## Additional Info We should do this regardless of pausing sync on ee offline/unsynced state. This is because I think it's almost impossible to ensure a processing result will reach in a predictable order with a synced notification from the ee. Doing this handles what I think are inevitable data races when we actually pause sync. This also fixes a return that reports which batch failed and caused us some confusion checking the logs.	2022-08-12 00:56:38 +00:00
Paul Hauner	4fc0cb121c	Remove some "wontfix" TODOs for the merge (#3449 ) ## Issue Addressed NA ## Proposed Changes Removes three types of TODOs: 1. `execution_layer/src/lib.rs`: It was [determined](https://github.com/ethereum/consensus-specs/issues/2636#issuecomment-988688742) that there is no action required here. 2. `beacon_processor/worker/gossip_methods.rs`: Removed TODOs relating to peer scoring that have already been addressed via `epe.penalize_peer()`. - It seems `cargo fmt` wanted to adjust some things here as well 🤷 3. `proto_array_fork_choice.rs`: it would be nice to remove that useless `bool` for cleanliness, but I don't think it's something we need to do and the TODO just makes things look messier IMO. ## Additional Info There should be no functional changes to the code in this PR. There are still some TODOs lingering, those ones require actual changes or more thought.	2022-08-10 13:06:46 +00:00
Michael Sproul	4e05f19fb5	Serve Bellatrix preset in BN API (#3425 ) ## Issue Addressed Resolves #3388 Resolves #2638 ## Proposed Changes - Return the `BellatrixPreset` on `/eth/v1/config/spec` by default. - Allow users to opt out of this by providing `--http-spec-fork=altair` (unless there's a Bellatrix fork epoch set). - Add the Altair constants from #2638 and make serving the constants non-optional (the `http-disable-legacy-spec` flag is deprecated). - Modify the VC to only read the `Config` and not to log extra fields. This prevents it from having to muck around parsing the `ConfigAndPreset` fields it doesn't need. ## Additional Info This change is backwards-compatible for the VC and the BN, but is marked as a breaking change for the removal of `--http-disable-legacy-spec`. I tried making `Config` a `superstruct` too, but getting the automatic decoding to work was a huge pain and was going to require a lot of hacks, so I gave up in favour of keeping the default-based approach we have now.	2022-08-10 07:52:59 +00:00
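
The "Lookahead Reasoning" arithmetic in f03f9ba680 (#3463) can be re-derived with a short sketch. The constant and function names below are hypothetical and are not taken from the Lighthouse codebase; the genesis time, slot duration, fork epoch and two-week lookahead are the values quoted in that commit message, and 32 slots per epoch is the mainnet value implied by epoch 144896 mapping to slot 4636672.

```rust
// Hypothetical names for illustration only; not Lighthouse's actual API.

/// Genesis time (Unix seconds) used in the commit message's calculation.
const GENESIS_TIME: u64 = 1_606_824_023;
/// Slot duration in seconds, as used in the commit message.
const SECONDS_PER_SLOT: u64 = 12;
/// Mainnet slots per epoch (assumption consistent with 144896 -> 4636672).
const SLOTS_PER_EPOCH: u64 = 32;
/// Bellatrix fork epoch quoted in the commit message.
const BELLATRIX_FORK_EPOCH: u64 = 144_896;
/// Seconds in one week.
const SECONDS_PER_WEEK: u64 = 604_800;
/// Lookahead of two weeks, per the commit message.
const LOOKAHEAD_WEEKS: u64 = 2;

/// First slot of the given epoch.
pub fn epoch_start_slot(epoch: u64) -> u64 {
    epoch * SLOTS_PER_EPOCH
}

/// Unix timestamp at which the given slot starts.
pub fn slot_start_time(slot: u64) -> u64 {
    GENESIS_TIME + slot * SECONDS_PER_SLOT
}

/// Unix timestamp at which readiness warnings would begin for a fork epoch.
pub fn warning_start_time(fork_epoch: u64) -> u64 {
    slot_start_time(epoch_start_slot(fork_epoch))
        .saturating_sub(LOOKAHEAD_WEEKS * SECONDS_PER_WEEK)
}
```

Calling `warning_start_time(BELLATRIX_FORK_EPOCH)` reproduces the warning start timestamp stated in the commit message, and `slot_start_time(epoch_start_slot(BELLATRIX_FORK_EPOCH))` reproduces the fork timestamp.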

2278 Commits