lighthouse

Author	SHA1	Message	Date
Michael Sproul	fa8b920dd8	Merge branch 'capella' into unstable	2023-02-22 10:25:45 +11:00
Paul Hauner	9c81be8ac4	Fix metric (#4020 )	2023-02-22 09:46:45 +11:00
Pawan Dhananjay	bb5285ac6d	Remove BeaconBlockAndBlobsSidecar from core topics (#4016 )	2023-02-22 09:45:38 +11:00
Mac L	3642efe76a	Cache validator balances and allow them to be served over the HTTP API (#3863 ) ## Issue Addressed #3804 ## Proposed Changes - Add `total_balance` to the validator monitor and adjust the number of historical epochs which are cached. - Allow certain values in the cache to be served out via the HTTP API without requiring a state read. ## Usage ``` curl -X POST "http://localhost:5052/lighthouse/ui/validator_info" -d '{"indices": [0]}' -H "Content-Type: application/json" | jq ``` ``` { "data": { "validators": { "0": { "info": [ { "epoch": 172981, "total_balance": 36566388519 }, ... { "epoch": 172990, "total_balance": 36566496513 } ] }, "1": { "info": [ { "epoch": 172981, "total_balance": 36355797968 }, ... { "epoch": 172990, "total_balance": 36355905962 } ] } } } } ``` ## Additional Info This requires no historical states to operate, which means it will still function on a freshly checkpoint-synced node; however, because of this, the values will populate each epoch (up to a maximum of 10 entries). Another benefit of this method is that we can easily cache any other values which would normally require a state read and serve them via the same endpoint. However, we would need to be cautious about not overly increasing block processing time by caching values from complex computations. This also caches some of the validator metrics directly, rather than pulling them from the Prometheus metrics when the API is called. This means when the validator count exceeds the individual monitor threshold, the cached values will still be available. Co-authored-by: Paul Hauner <paul@paulhauner.com>	2023-02-21 20:54:55 +00:00
Paul Hauner	eed7d65ce7	Allow for withdrawals in max block size (#4011 ) * Allow for withdrawals in max block size * Ensure payload size is counted	2023-02-21 18:03:10 +11:00
Paul Hauner	d53d43844c	Suggestions for Capella `beacon_chain` (#3999 ) * Remove CapellaReadiness::NotSynced Some EEs have a habit of flipping between synced/not-synced, which caused some spurious "Not ready for the merge" messages back before the merge. For the merge, if the EE wasn't synced the CE simply wouldn't go through the transition (due to optimistic sync stuff). However, we don't have that hard requirement for Capella; the CE will go through the fork and just wait for the EE to catch up. I think that removing `NotSynced` here will avoid false-positives on the "Not ready logs..". We'll be creating other WARN/ERRO logs if the EE isn't synced, anyway. * Change some Capella readiness logging There are two changes here: 1. Shorten the log messages, for readability. 2. Change the hints. Connecting a Capella-ready LH to a non-Capella-ready EE gives this log: ``` WARN Not ready for Capella info: The execution endpoint does not appear to support the required engine api methods for Capella: Required Methods Unsupported: engine_getPayloadV2 engine_forkchoiceUpdatedV2 engine_newPayloadV2, service: slot_notifier ``` This variant of error doesn't get a "try updating" style hint, when it's the one that needs it. This is because we detect the method-not-found response from the EE and return default capabilities, rather than indicating that the request fails. I think it's fair to say that an EE upgrade is required whenever it doesn't provide the required methods. I changed the `ExchangeCapabilitiesFailed` message since that can only happen when the EE fails to respond with anything other than success or not-found.	2023-02-21 11:05:36 +11:00
Michael Sproul	0b6850221e	Fix Capella schema downgrades (#4004 )	2023-02-20 17:50:42 +11:00
Michael Sproul	066c27750a	Merge remote-tracking branch 'origin/staging' into capella-update	2023-02-17 12:05:36 +11:00
Paul Hauner	4aa8a2ab12	Suggestions for Capella `execution_layer` (#3983 ) * Restrict Engine::request to FnOnce * Use `Into::into` * Impl IntoIterator for VariableList * Use Instant rather than SystemTime	2023-02-17 11:58:33 +11:00
Divma	ffeb8b6e05	blacklist tests in windows (#3961 ) ## Issue Addressed Windows tests for subscription and unsubscriptions fail in CI sporadically. We usually ignore these failures, so this PR aims to help reduce the failure noise. Associated issue is https://github.com/sigp/lighthouse/issues/3960	2023-02-16 23:34:30 +00:00
Michael Sproul	461bda6e85	Execution engine suggestions from code review Co-authored-by: Paul Hauner <paul@paulhauner.com>	2023-02-16 16:54:05 +11:00
realbigsean	55753f8bc8	bump recursion limit	2023-02-15 16:32:50 -05:00
realbigsean	ca8e341649	fix compilation after merge	2023-02-15 14:30:39 -05:00
realbigsean	8320b918ae	merge self limiter	2023-02-15 14:26:18 -05:00
realbigsean	4d0b0f681d	merge self limiter	2023-02-15 14:25:58 -05:00
realbigsean	b805fa6279	merge with upstream	2023-02-15 14:20:12 -05:00
Emilia Hane	aaf6404d4f	Remove unused generic	2023-02-15 17:45:22 +01:00
Emilia Hane	2672cf40bb	Better fix for debug tests	2023-02-15 11:47:56 +01:00
realbigsean	44dbccfeae	add v3 to capabilities	2023-02-15 09:23:59 +01:00
Emilia Hane	13efd47238	fixup! Disable use of system time in tests	2023-02-15 09:20:30 +01:00
Michael Sproul	918b688f72	Simplify payload traits and reduce cloning (#3976 ) * Simplify payload traits and reduce cloning * Fix self limiter	2023-02-15 14:17:56 +11:00
Emilia Hane	9e4abc79fb	Comment out tests that use system time	2023-02-14 14:12:50 +01:00
Emilia Hane	73c7ad73b8	Disable use of system time in tests	2023-02-14 13:33:38 +01:00
Emilia Hane	148385eb70	Remove unused error	2023-02-14 12:43:13 +01:00
Age Manning	8dd9249177	Enforce a timeout on peer disconnect (#3757 ) On heavily crowded networks, we are seeing many attempted connections to our node every second. Often these connections come from peers that have just been disconnected. This can be for a number of reasons including: - We have deemed them to be not as useful as other peers - They have performed poorly - They have dropped the connection with us - The connection was spontaneously lost - They were randomly removed because we have too many peers In all of these cases, if we have reached or exceeded our target peer limit, there is no desire to accept new connections immediately after the disconnect from these peers. In fact, it often costs us resources to handle the established connections and defeats some of the logic of dropping them in the first place. This PR adds a timeout that prevents recently disconnected peers from reconnecting to us. Technically we implement a ban at the swarm layer to prevent immediate reconnections for at least 10 minutes. I decided to keep this light, and use a time-based LRUCache which only gets updated during the peer manager heartbeat to prevent added stress of polling a delay map for what could be a large number of peers. This cache is bounded in time. An extra space bound could be added should people consider this a risk. Co-authored-by: Diva M <divma@protonmail.com>	2023-02-14 03:25:42 +00:00
Michael Sproul	f7bd4bf06e	Update block rewards API for Capella	2023-02-14 12:09:40 +11:00
Michael Sproul	d53ccf8fc7	Placeholder for BlobsByRange outbound rate limit	2023-02-14 12:08:14 +11:00
Michael Sproul	18c8cab4da	Merge remote-tracking branch 'origin/unstable' into capella-merge	2023-02-14 12:07:27 +11:00
realbigsean	d2ecbd942e	fix a couple new lints	2023-02-13 17:13:47 -05:00
realbigsean	cd8757de1c	Revert "make batch size check compile time panic" This reverts commit `68f2484efc`.	2023-02-13 16:51:55 -05:00
realbigsean	68f2484efc	make batch size check compile time panic	2023-02-13 16:51:46 -05:00
realbigsean	4c3561dcaf	make batch size check compile time panic	2023-02-13 16:50:33 -05:00
realbigsean	8f9c5cfca9	remove unused structs	2023-02-13 16:47:36 -05:00
realbigsean	ad9af6d8b1	complete match for `has_context_bytes`	2023-02-13 16:44:54 -05:00
realbigsean	fc2d07b4e3	allow unused	2023-02-13 16:36:38 -05:00
realbigsean	28702c9d5d	merge upstream, add back `get_blobs` logic	2023-02-13 16:29:21 -05:00
Michael Sproul	2f456ff9eb	Fix regression in DB write atomicity (#3931 ) ## Issue Addressed Fix a bug introduced by #3696. The bug is not expected to occur frequently, so releasing this PR is non-urgent. ## Proposed Changes * Add a variant to `StoreOp` that allows a raw KV operation to be passed around. * Return to using `self.store.do_atomically` rather than `self.store.hot_db.do_atomically`. This streamlines the write back into a single call and makes our auto-revert work again. * Prevent `import_block_update_shuffling_cache` from failing block import. This is an outstanding bug from before v3.4.0 which may have contributed to some random unexplained database corruption. ## Additional Info In #3696 I split the database write into two calls, one to convert the `StoreOp`s to `KeyValueStoreOp`s and one to write them. This had the unfortunate side-effect of damaging our atomicity guarantees in case of a write error. If the first call failed, we would be left with the block in fork choice but not on-disk (or the snapshot cache), which would prevent us from processing any descendant blocks. On `unstable` the first call is very unlikely to fail unless the disk is full, but on `tree-states` the conversion is more involved and a user reported database corruption after it failed in a way that should have been recoverable. Additionally, as @emhane observed, #3696 also inadvertently removed the import of the new block into the block cache. Although this seems like it could have negatively impacted performance, there are several mitigating factors: - For regular block processing we should almost always load the parent block (and state) from the snapshot cache. - We often load blinded blocks, which bypass the block cache anyway. - Metrics show no noticeable increase in the block cache miss rate with v3.4.0. However, I expect the block cache _will_ be useful again in `tree-states`, so it is restored to use by this PR.	2023-02-13 03:32:01 +00:00
Paul Hauner	84843d67d7	Reduce some EE and builder related ERRO logs to WARN (#3966 ) ## Issue Addressed NA ## Proposed Changes Our `ERRO` stream has been rather noisy since the merge due to some unexpected behaviours of builders and EEs. Now that we've been running post-merge for a while, I think we can drop some of these `ERRO` to `WARN` so we're not "crying wolf". The modified logs are: #### `ERRO Execution engine call failed` I'm seeing this quite frequently on Geth nodes. They seem to timeout when they're busy and it rarely indicates a serious issue. We also have logging across block import, fork choice updating and payload production that raise `ERRO` or `CRIT` when the EE times out, so I think we're not at risk of silencing actual issues. #### `ERRO "Builder failed to reveal payload"` In #3775 we reduced this log from `CRIT` to `ERRO` since it's common for builders to fail to reveal the block to the producer directly whilst still broadcasting it to the network. I think it's worth dropping this to `WARN` since it's rarely interesting. I elected to stay with `WARN` since I really do wish builders would fulfill their API promises by returning the block to us. Perhaps I'm just being pedantic here, I could be convinced otherwise. #### `ERRO "Relay error when registering validator(s)"` It seems like builders and/or mev-boost struggle to handle heavy loads of validator registrations. I haven't observed issues with validators not actually being registered, but I see timeouts on these endpoints many times a day. It doesn't seem like this `ERRO` is worth it. #### `ERRO Error fetching block for peer ExecutionLayerErrorPayloadReconstruction` This means we failed to respond to a peer on the P2P network with a block they requested because of an error in the `execution_layer`. It's very common to see timeouts or incomplete responses on this endpoint whilst the EE is busy and I don't think it's important enough for an `ERRO`. As long as the peer count stays high, I don't think the user needs to be actively concerned about how we're responding to peers. ## Additional Info NA	2023-02-12 23:14:08 +00:00
ethDreamer	e743d75c9b	Update Mock Builder for Post-Capella Tests (#3958 ) * Update Mock Builder for Post-Capella Tests * Add _mut Suffix to BidStuff Functions * Fix Setting Gas Limit	2023-02-10 13:30:14 -06:00
Emilia Hane	28e9f07746	Fix lint for prune blobs pr	2023-02-10 16:23:04 +01:00
ethDreamer	39f8327f73	Properly Deserialize ForkVersionedResponses (#3944 ) * Move ForkVersionedResponse to consensus/types * Properly Deserialize ForkVersionedResponses * Elide Types in from_value Calls * Added Tests for ForkVersionedResponse Deserialize * Address Sean's Comments & Make Less Restrictive * Utilize `map_fork_name!`	2023-02-10 08:49:25 -06:00
Emilia Hane	0104d6143c	fixup! Fix latest clippy lints	2023-02-10 15:35:01 +01:00
Emilia Hane	02cca3478b	Fix conflicts rebasing eip4844	2023-02-10 15:35:01 +01:00
Emilia Hane	615402abcf	fixup! Fix conflicts rebasing eip4844	2023-02-10 15:35:00 +01:00
Emilia Hane	db36eb978b	Fix latest clippy lints	2023-02-10 15:35:00 +01:00
Emilia Hane	2653f88b5f	Fix conflicts rebasing eip4844	2023-02-10 15:35:00 +01:00
Emilia Hane	43bf908e7a	Fix release tests	2023-02-10 15:34:59 +01:00
Emilia Hane	4d3ff347a3	Fixes after rebasing eip4844	2023-02-10 15:34:58 +01:00
Emilia Hane	5437dcae9c	Fix conflicts rebasing eip4844	2023-02-10 15:34:58 +01:00
Emilia Hane	7545ae9e9b	fixup! Fix block lookup debug tests	2023-02-10 15:34:46 +01:00
Emilia Hane	6beca6defc	Fix range sync tests	2023-02-10 09:41:24 +01:00
Emilia Hane	e9e198a2b6	Fix conflicts rebasing eip4844	2023-02-10 09:41:23 +01:00
Emilia Hane	d292a3a6a8	Fix conflicts rebasing eip4844	2023-02-10 09:41:23 +01:00
Emilia Hane	994990063a	Fix weak_subjectivity_sync test	2023-02-10 09:41:23 +01:00
Emilia Hane	09370e70d9	Fix rebase conflicts	2023-02-10 09:41:19 +01:00
Emilia Hane	8365d76277	fixup! Debug tests	2023-02-10 09:39:22 +01:00
Emilia Hane	16cb9cfca2	fixup! Debug tests	2023-02-10 09:39:22 +01:00
Emilia Hane	7220f35ff6	Debug tests	2023-02-10 09:39:21 +01:00
Emilia Hane	995b2715f2	Fix network block_lookups test	2023-02-10 09:39:21 +01:00
Emilia Hane	3676ce78b5	Fix rebase conflicts	2023-02-10 09:39:21 +01:00
Michael Sproul	c9354a9d25	Tweaks to reward APIs (#3957 ) ## Proposed Changes * Return the effective balance in gwei to align with the spec ([ideal attestation rewards](https://ethereum.github.io/beacon-APIs/?urls.primaryName=dev#/Rewards/getAttestationsRewards)). * Use quoted `i64`s for attestation and sync committee rewards.	2023-02-10 06:19:42 +00:00
Paul Hauner	5276dd0cb0	Fix edge-case when finding the finalized descendant (#3924 ) ## Issue Addressed NA ## Description We were missing an edge case when checking to see if a block is a descendant of the finalized checkpoint. This edge case is described for one of the tests in this PR: `a119edc739/consensus/proto_array/src/proto_array_fork_choice.rs (L1018-L1047)` This bug presented itself in the following mainnet log: ``` Jan 26 15:12:42.841 ERRO Unable to validate attestation error: MissingBeaconState(0x7c30cb80ec3d4ec624133abfa70e4c6cfecfca456bfbbbff3393e14e5b20bf25), peer_id: 16Uiu2HAm8RPRciXJYtYc5c3qtCRdrZwkHn2BXN3XP1nSi1gxHYit, type: "unaggregated", slot: Slot(5660161), beacon_block_root: 0x4a45e59da7cb9487f4836c83bdd1b741b4f31c67010c7ae343fa6771b3330489 ``` Here the BN is rejecting an attestation because of a "missing beacon state". Whilst it was correct to reject the attestation, it should have rejected it because it attests to a block that conflicts with finality rather than claiming that the database is inconsistent. The block that this attestation points to (`0x4a45`) is block `C` in the above diagram. It is a non-canonical block in the first slot of an epoch that conflicts with the finalized checkpoint. Due to our lazy pruning of proto array, `0x4a45` was still present in proto-array. Our missed edge-case in [`ForkChoice::is_descendant_of_finalized`](`38514c07f2/consensus/fork_choice/src/fork_choice.rs (L1375-L1379)`) would have indicated to us that the block is a descendant of the finalized block. Therefore, we would have accepted the attestation thinking that it attests to a descendant of the finalized checkpoint. Since we didn't have the shuffling for this erroneously processed block, we attempted to read its state from the database. This failed because we prune states from the database by keeping track of the tips of the chain and iterating back until we find a finalized block. This would have deleted `C` from the database, hence the `MissingBeaconState` error.	2023-02-09 23:51:18 +00:00
Emilia Hane	6a37e84399	fixup! Fix regression in DB write atomicity	2023-02-08 11:44:46 +01:00
Emilia Hane	bc468b4ce5	fixup! Improve use of whitespace	2023-02-08 11:44:45 +01:00
Michael Sproul	ac4b5b580c	Fix regression in DB write atomicity	2023-02-08 11:44:45 +01:00
Emilia Hane	9d919917f5	Removed unused code	2023-02-08 11:44:45 +01:00
Emilia Hane	d7eb9441cf	Reorder loading of db metadata from disk to allow for future changes to schema	2023-02-08 11:44:45 +01:00
Emilia Hane	d599e41f3d	Remove debug comment Co-authored-by: Michael Sproul <micsproul@gmail.com>	2023-02-08 11:44:44 +01:00
Emilia Hane	577262ccbf	Improve use of whitespace Co-authored-by: Michael Sproul <micsproul@gmail.com>	2023-02-08 11:44:44 +01:00
Emilia Hane	56c84178f2	Fix conflicts rebasing eip4844	2023-02-08 11:44:44 +01:00
Emilia Hane	b2abec5d35	Verify StoreConfig	2023-02-08 11:44:44 +01:00
Emilia Hane	00ca21e84c	Make implementation of BlobInfo more coder friendly	2023-02-08 11:44:43 +01:00
Emilia Hane	8f137df02e	fixup! Allow user to set an epoch margin for pruning	2023-02-08 11:44:43 +01:00
Emilia Hane	a2eda76291	Correct comment	2023-02-08 11:44:43 +01:00
Emilia Hane	9ee9b6df76	Remove unused stuff	2023-02-08 11:44:42 +01:00
Emilia Hane	6dff69bde9	Atomically update blob info with pruned blobs	2023-02-08 11:44:42 +01:00
Emilia Hane	5d2480c762	Improve naming	2023-02-08 11:44:42 +01:00
Emilia Hane	9c2e623555	Reflect use of prune margin epochs at import	2023-02-08 11:44:42 +01:00
Emilia Hane	d4795601f2	fixup! Prune from highest data availability boundary	2023-02-08 11:44:41 +01:00
Emilia Hane	43c3c74a48	fixup! Fix blobs store bug	2023-02-08 11:44:41 +01:00
Emilia Hane	63ca3bfb29	Prune from highest data availability boundary	2023-02-08 11:44:41 +01:00
Emilia Hane	c50f83116e	Fix wording Co-authored-by: Michael Sproul <micsproul@gmail.com>	2023-02-08 11:44:41 +01:00
Emilia Hane	f6346f89c1	Clarify comment Co-authored-by: Michael Sproul <micsproul@gmail.com>	2023-02-08 11:44:41 +01:00
Emilia Hane	e4b447395a	Clarify wording Co-authored-by: Michael Sproul <micsproul@gmail.com>	2023-02-08 11:44:40 +01:00
Emilia Hane	756c881857	Keep uniform size small keys Co-authored-by: Michael Sproul <micsproul@gmail.com>	2023-02-08 11:44:40 +01:00
Emilia Hane	4de523fb75	fixup! Allow user to set an epoch margin for pruning	2023-02-08 11:44:40 +01:00
Emilia Hane	1812301c9c	Allow user to set an epoch margin for pruning	2023-02-08 11:44:40 +01:00
Emilia Hane	d7fc24a9d5	Plug in running blob pruning in migrator, related bug fixes and add todos	2023-02-08 11:44:40 +01:00
Emilia Hane	0bdc291490	Only store non-empty orphaned blobs	2023-02-08 11:44:39 +01:00
Emilia Hane	caa04db58a	Run prune blobs on migrator thread	2023-02-08 11:44:39 +01:00
Emilia Hane	a875bec5f2	Fix blobs store bug	2023-02-08 11:44:39 +01:00
Emilia Hane	3bede06c9b	Fix typo	2023-02-08 11:44:38 +01:00
Emilia Hane	54699f808c	fixup! Clarify hybrid blob prune solution and fix error handling	2023-02-08 11:44:38 +01:00
Emilia Hane	83a9520761	Clarify hybrid blob prune solution and fix error handling	2023-02-08 11:44:38 +01:00
Emilia Hane	3d93dad0e2	Fix type bug Co-authored-by: realbigsean <seananderson33@GMAIL.com>	2023-02-08 11:44:37 +01:00
Emilia Hane	44ec331452	fixup! Simplify conceptual design	2023-02-08 11:44:37 +01:00
Emilia Hane	20567750c1	fixup! Simplify conceptual design	2023-02-08 11:44:37 +01:00
Emilia Hane	7103a257ce	Simplify conceptual design	2023-02-08 11:44:37 +01:00
Emilia Hane	0d13932663	Fix epoch constructor misconception	2023-02-08 11:44:37 +01:00
Emilia Hane	b5abfe620a	Convert epochs_per_blob_prune to Epoch once	2023-02-08 11:44:36 +01:00

2512 Commits