lighthouse

Author	SHA1	Message	Date
zhiqiangxu	f1ac12f23a	Fix some typos (#4565 )	2023-08-14 00:29:43 +00:00
Eitan Seri-Levi	1fcada8a32	Improve transport connection errors (#4540 ) ## Issue Addressed #4538 ## Proposed Changes add newtype wrapper around DialError that extracts error messages and logs them in a more readable format ## Additional Info I was able to test Transport Dial Errors in the situation where a libp2p instance attempts to ping a nonexistent peer. That error message should look something like `A transport level error has ocurred: Connection refused (os error 61)` AgeManning mentioned we should try fetching only the most inner error (in situations where theres a nested error). I took a stab at implementing that For non transport DialErrors, I wrote out the error messages explicitly (as per the docs). Could potentially clean things up here if thats not necessary Co-authored-by: Age Manning <Age@AgeManning.com>	2023-08-10 00:10:09 +00:00
Paul Hauner	b60304b19f	Use `BeaconProcessor` for API requests (#4462 ) ## Issue Addressed NA ## Proposed Changes Rather than spawning new tasks on the tokio executor to process each HTTP API request, send the tasks to the `BeaconProcessor`. This achieves: 1. Places a bound on how many concurrent requests are being served (i.e., how many we are actually trying to compute at one time). 1. Places a bound on how many requests can be awaiting a response at one time (i.e., starts dropping requests when we have too many queued). 1. Allows the BN prioritise HTTP requests with respect to messages coming from the P2P network (i.e., proiritise importing gossip blocks rather than serving API requests). Presently there are two levels of priorities: - `Priority::P0` - The beacon processor will prioritise these above everything other than importing new blocks. - Roughly all validator-sensitive endpoints. - `Priority::P1` - The beacon processor will prioritise practically all other P2P messages over these, except for historical backfill things. - Everything that's not `Priority::P0` The `--http-enable-beacon-processor false` flag can be supplied to revert back to the old behaviour of spawning new `tokio` tasks for each request: ``` --http-enable-beacon-processor <BOOLEAN> The beacon processor is a scheduler which provides quality-of-service and DoS protection. When set to "true", HTTP API requests will queued and scheduled alongside other tasks. When set to "false", HTTP API responses will be executed immediately. [default: true] ``` ## New CLI Flags I added some other new CLI flags: ``` --beacon-processor-aggregate-batch-size <INTEGER> Specifies the number of gossip aggregate attestations in a signature verification batch. Higher values may reduce CPU usage in a healthy network while lower values may increase CPU usage in an unhealthy or hostile network. [default: 64] --beacon-processor-attestation-batch-size <INTEGER> Specifies the number of gossip attestations in a signature verification batch. Higher values may reduce CPU usage in a healthy network whilst lower values may increase CPU usage in an unhealthy or hostile network. [default: 64] --beacon-processor-max-workers <INTEGER> Specifies the maximum concurrent tasks for the task scheduler. Increasing this value may increase resource consumption. Reducing the value may result in decreased resource usage and diminished performance. The default value is the number of logical CPU cores on the host. --beacon-processor-reprocess-queue-len <INTEGER> Specifies the length of the queue for messages requiring delayed processing. Higher values may prevent messages from being dropped while lower values may help protect the node from becoming overwhelmed. [default: 12288] ``` I needed to add the max-workers flag since the "simulator" flavor tests started failing with HTTP timeouts on the test assertions. I believe they were failing because the Github runners only have 2 cores and there just weren't enough workers available to process our requests in time. I added the other flags since they seem fun to fiddle with. ## Additional Info I bumped the timeouts on the "simulator" flavor test from 4s to 8s. The prioritisation of consensus messages seems to be causing slower responses, I guess this is what we signed up for 🤷 The `validator/register` validator has some special handling because the relays have a bad habit of timing out on these calls. It seems like a waste of a `BeaconProcessor` worker to just wait for the builder API HTTP response, so we spawn a new `tokio` task to wait for a builder response. I've added an optimisation for the `GET beacon/states/{state_id}/validators/{validator_id}` endpoint in [efbabe3](`efbabe3252`). That's the endpoint the VC uses to resolve pubkeys to validator indices, and it's the endpoint that was causing us grief. Perhaps I should move that into a new PR, not sure.	2023-08-08 23:30:15 +00:00
Eitan Seri-Levi	521432129d	Support SSZ request body for POST /beacon/blinded_blocks endpoints (v1 & v2) (#4504 ) ## Issue Addressed #4262 ## Proposed Changes add SSZ support in request body for POST /beacon/blinded_blocks endpoints (v1 & v2) ## Additional Info	2023-08-07 22:53:04 +00:00
Armağan Yıldırak	3397612160	Shift networking configuration (#4426 ) ## Issue Addressed Addresses [#4401](https://github.com/sigp/lighthouse/issues/4401) ## Proposed Changes Shift some constants into ```ChainSpec``` and remove the constant values from code space. ## Additional Info I mostly used ```MainnetEthSpec::default_spec()``` for getting ```ChainSpec```. I wonder Did I make a mistake about that. Co-authored-by: armaganyildirak <armaganyildirak@gmail.com> Co-authored-by: Paul Hauner <paul@paulhauner.com> Co-authored-by: Age Manning <Age@AgeManning.com> Co-authored-by: Diva M <divma@protonmail.com>	2023-08-03 01:51:47 +00:00
zhiqiangxu	fcf51d691e	fix typo (#4555 )	2023-08-02 23:50:41 +00:00
Divma	ff9b09d964	upgrade to libp2p 0.52 (#4431 ) ## Issue Addressed Upgrade libp2p to v0.52 ## Proposed Changes - Workflows: remove installation of `protoc` - Book: remove installation of `protoc` - `Dockerfile`s and `cross`: remove custom base `Dockerfile` for cross since it's no longer needed. Remove `protoc` from remaining `Dockerfiles`s - Upgrade `discv5` to `v0.3.1`: we have some cool stuff in there: no longer needs `protoc` and faster ip updates on cold start - Upgrade `prometheus` to `0.21.0`, now it no longer needs encoding checks - things that look like refactors: bunch of api types were renamed and need to be accessed in a different (clearer) way - Lighthouse network - connection limits is now a behaviour - banned peers no longer exist on the swarm level, but at the behaviour level - `connection_event_buffer_size` now is handled per connection with a buffer size of 4 - `mplex` is deprecated and was removed - rpc handler now logs the peer to which it belongs ## Additional Info Tried to keep as much behaviour unchanged as possible. However, there is a great deal of improvements we can do _after_ this upgrade: - Smart connection limits: Connection limits have been checked only based on numbers, we can now use information about the incoming peer to decide if we want it - More powerful peer management: Dial attempts from other behaviours can be rejected early - Incoming connections can be rejected early - Banning can be returned exclusively to the peer management: We should not get connections to banned peers anymore making use of this - TCP Nat updates: We might be able to take advantage of confirmed external addresses to check out tcp ports/ips Co-authored-by: Age Manning <Age@AgeManning.com> Co-authored-by: Akihito Nakano <sora.akatsuki@gmail.com>	2023-08-02 00:59:34 +00:00
Gua00va	73764d0dd2	Deprecate `exchangeTransitionConfiguration` functionality (#4517 ) ## Issue Addressed Solves #4442 ## Proposed Changes EL clients log errors if we don't query this endpoint, but they are making releases that remove this error logging. After those are out we can stop calling it, after which point EL teams will remove the endpoint entirely. Refer https://hackmd.io/@n0ble/deprecate-exchgTC	2023-07-31 23:51:39 +00:00
Eitan Seri-Levi	e8c411c288	add ssz support in request body for /beacon/blocks endpoints (v1 & v2) (#4479 ) ## Issue Addressed [#4457](https://github.com/sigp/lighthouse/issues/4457) ## Proposed Changes add ssz support in request body for /beacon/blocks endpoints (v1 & v2) ## Additional Info	2023-07-31 23:51:37 +00:00
Age Manning	8654f20028	Development feature flag - Disable backfill (#4537 ) Often when testing I have to create a hack which is annoying to maintain. I think it might be handy to add a custom compile-time flag that developers can use if they want to test things locally without having to backfill a bunch of blocks. There is probably an argument to have a feature called "backfill" which is enabled by default and can be disabled. I didn't go this route because I think it's counter-intuitive to have a feature that enables a core and necessary behaviour.	2023-07-31 01:53:08 +00:00
Gua00va	117802cef1	Add Eth Version Header (#4528 ) ## Issue Addressed Closes #4525 ## Proposed Changes `GET /eth/v1/validator/blinded_blocks` endpoint and `GET /eth/v1/validator/blocks` now send `Eth-Version` header. Co-authored-by: Gua00va <105484243+Gua00va@users.noreply.github.com>	2023-07-31 01:53:07 +00:00
Jimmy Chen	b5337c0ea5	Fix incorrect ideal rewards calculation (#4520 ) ## Issue Addressed The PR fixes a bug where the the ideal rewards for source and head were incorrectly set. Output from testing a validator that performed optimally in a Phase 0 epoch , note the `source` and `target` under ideal rewards is incorrect (compared to the actual `total_rewards` below): ```json { "ideal_rewards": [ ... { "effective_balance": "32000000000", "head": "18771", "target": "18770", "source": "18729", "inclusion_delay": "17083", "inactivity": "0" } ], "total_rewards": [ { "validator_index": "0", "head": "18729", "target": "18770", "source": "18771", "inclusion_delay": "17083", "inactivity": "0" } ] ```	2023-07-31 01:53:06 +00:00
Aoi Kurokawa	85a3340d0e	Implement liveness BeaconAPI (#4343 ) ## Issue Addressed #4243 ## Proposed Changes - create a new endpoint for liveness/{endpoint} ## Additional Info This is my first PR.	2023-07-31 01:53:03 +00:00
Jimmy Chen	fc7f1ba6b9	Phase 0 attestation rewards via Beacon API (#4474 ) ## Issue Addressed Addresses #4026. Beacon-API spec [here](https://ethereum.github.io/beacon-APIs/?urls.primaryName=dev#/Beacon/getAttestationsRewards). Endpoint: `POST /eth/v1/beacon/rewards/attestations/{epoch}` This endpoint already supports post-Altair epochs. This PR adds support for phase 0 rewards calculation. ## Proposed Changes - [x] Attestation rewards API to support phase 0 rewards calculation, re-using logic from `state_processing`. Refactored `get_attestation_deltas` slightly to support computing deltas for a subset of validators. - [x] Add `inclusion_delay` to `ideal_rewards` (`beacon-API` spec update to follow) - [x] Add `inactivity` penalties to both `ideal_rewards` and `total_rewards` (`beacon-API` spec update to follow) - [x] Add tests to compute attestation rewards and compare results with beacon states ## Additional Notes - The extra penalty for missing attestations or being slashed during an inactivity leak is currently not included in the API response (for both phase 0 and Altair) in the spec. - I went with adding `inactivity` as a separate component rather than combining them with the 4 rewards, because this is how it was grouped in [the phase 0 spec](https://github.com/ethereum/consensus-specs/blob/dev/specs/phase0/beacon-chain.md#get_attestation_deltas). During inactivity leak, all rewards include the optimal reward, and inactivity penalties are calculated separately (see below code snippet from the spec), so it would be quite confusing if we merge them. This would also work better with Altair, because there's no "cancelling" of rewards and inactivity penalties are more separate. - Altair calculation logic (to include inactivity penalties) to be updated in a follow-up PR. ```python def get_attestation_deltas(state: BeaconState) -> Tuple[Sequence[Gwei], Sequence[Gwei]]: """ Return attestation reward/penalty deltas for each validator. """ source_rewards, source_penalties = get_source_deltas(state) target_rewards, target_penalties = get_target_deltas(state) head_rewards, head_penalties = get_head_deltas(state) inclusion_delay_rewards, _ = get_inclusion_delay_deltas(state) _, inactivity_penalties = get_inactivity_penalty_deltas(state) rewards = [ source_rewards[i] + target_rewards[i] + head_rewards[i] + inclusion_delay_rewards[i] for i in range(len(state.validators)) ] penalties = [ source_penalties[i] + target_penalties[i] + head_penalties[i] + inactivity_penalties[i] for i in range(len(state.validators)) ] return rewards, penalties ``` ## Example API Response <details> <summary>Click me</summary> ```json { "ideal_rewards": [ { "effective_balance": "1000000000", "head": "6638", "target": "6638", "source": "6638", "inclusion_delay": "9783", "inactivity": "0" }, { "effective_balance": "2000000000", "head": "13276", "target": "13276", "source": "13276", "inclusion_delay": "19565", "inactivity": "0" }, { "effective_balance": "3000000000", "head": "19914", "target": "19914", "source": "19914", "inclusion_delay": "29349", "inactivity": "0" }, { "effective_balance": "4000000000", "head": "26553", "target": "26553", "source": "26553", "inclusion_delay": "39131", "inactivity": "0" }, { "effective_balance": "5000000000", "head": "33191", "target": "33191", "source": "33191", "inclusion_delay": "48914", "inactivity": "0" }, { "effective_balance": "6000000000", "head": "39829", "target": "39829", "source": "39829", "inclusion_delay": "58697", "inactivity": "0" }, { "effective_balance": "7000000000", "head": "46468", "target": "46468", "source": "46468", "inclusion_delay": "68480", "inactivity": "0" }, { "effective_balance": "8000000000", "head": "53106", "target": "53106", "source": "53106", "inclusion_delay": "78262", "inactivity": "0" }, { "effective_balance": "9000000000", "head": "59744", "target": "59744", "source": "59744", "inclusion_delay": "88046", "inactivity": "0" }, { "effective_balance": "10000000000", "head": "66383", "target": "66383", "source": "66383", "inclusion_delay": "97828", "inactivity": "0" }, { "effective_balance": "11000000000", "head": "73021", "target": "73021", "source": "73021", "inclusion_delay": "107611", "inactivity": "0" }, { "effective_balance": "12000000000", "head": "79659", "target": "79659", "source": "79659", "inclusion_delay": "117394", "inactivity": "0" }, { "effective_balance": "13000000000", "head": "86298", "target": "86298", "source": "86298", "inclusion_delay": "127176", "inactivity": "0" }, { "effective_balance": "14000000000", "head": "92936", "target": "92936", "source": "92936", "inclusion_delay": "136959", "inactivity": "0" }, { "effective_balance": "15000000000", "head": "99574", "target": "99574", "source": "99574", "inclusion_delay": "146742", "inactivity": "0" }, { "effective_balance": "16000000000", "head": "106212", "target": "106212", "source": "106212", "inclusion_delay": "156525", "inactivity": "0" }, { "effective_balance": "17000000000", "head": "112851", "target": "112851", "source": "112851", "inclusion_delay": "166307", "inactivity": "0" }, { "effective_balance": "18000000000", "head": "119489", "target": "119489", "source": "119489", "inclusion_delay": "176091", "inactivity": "0" }, { "effective_balance": "19000000000", "head": "126127", "target": "126127", "source": "126127", "inclusion_delay": "185873", "inactivity": "0" }, { "effective_balance": "20000000000", "head": "132766", "target": "132766", "source": "132766", "inclusion_delay": "195656", "inactivity": "0" }, { "effective_balance": "21000000000", "head": "139404", "target": "139404", "source": "139404", "inclusion_delay": "205439", "inactivity": "0" }, { "effective_balance": "22000000000", "head": "146042", "target": "146042", "source": "146042", "inclusion_delay": "215222", "inactivity": "0" }, { "effective_balance": "23000000000", "head": "152681", "target": "152681", "source": "152681", "inclusion_delay": "225004", "inactivity": "0" }, { "effective_balance": "24000000000", "head": "159319", "target": "159319", "source": "159319", "inclusion_delay": "234787", "inactivity": "0" }, { "effective_balance": "25000000000", "head": "165957", "target": "165957", "source": "165957", "inclusion_delay": "244570", "inactivity": "0" }, { "effective_balance": "26000000000", "head": "172596", "target": "172596", "source": "172596", "inclusion_delay": "254352", "inactivity": "0" }, { "effective_balance": "27000000000", "head": "179234", "target": "179234", "source": "179234", "inclusion_delay": "264136", "inactivity": "0" }, { "effective_balance": "28000000000", "head": "185872", "target": "185872", "source": "185872", "inclusion_delay": "273918", "inactivity": "0" }, { "effective_balance": "29000000000", "head": "192510", "target": "192510", "source": "192510", "inclusion_delay": "283701", "inactivity": "0" }, { "effective_balance": "30000000000", "head": "199149", "target": "199149", "source": "199149", "inclusion_delay": "293484", "inactivity": "0" }, { "effective_balance": "31000000000", "head": "205787", "target": "205787", "source": "205787", "inclusion_delay": "303267", "inactivity": "0" }, { "effective_balance": "32000000000", "head": "212426", "target": "212426", "source": "212426", "inclusion_delay": "313050", "inactivity": "0" } ], "total_rewards": [ { "validator_index": "0", "head": "212426", "target": "212426", "source": "212426", "inclusion_delay": "313050", "inactivity": "0" }, { "validator_index": "32", "head": "212426", "target": "212426", "source": "212426", "inclusion_delay": "313050", "inactivity": "0" }, { "validator_index": "63", "head": "-357771", "target": "-357771", "source": "-357771", "inclusion_delay": "0", "inactivity": "0" } ] } ``` </details>	2023-07-18 01:48:40 +00:00
Divma	4435a22221	Cleanup unreachable code in `lcli::generate_bootnode_enr` and some tests (#4485 ) ## Issue Addressed n/a Noticed this while working on something else ## Proposed Changes - leverage the appropriate types to avoid a bunch of `unwrap` and errors ## Additional Info n/a	2023-07-17 05:31:53 +00:00
Pawan Dhananjay	f2223feb21	Rust 1.71 lints (#4503 ) ## Issue Addressed N/A ## Proposed Changes Add lints for rust 1.71 [3789134](`3789134ae2`) is probably the one that needs most attention as it changes beacon state code. I changed the `is_in_inactivity_leak ` function to return a `ArithError` as not all consumers of that function work well with a `BeaconState::Error`.	2023-07-17 00:14:19 +00:00
Michael Sproul	03674c7199	Update mev-rs and remove patches (#4496 ) ## Issue Addressed Fixes occasional compilation errors with mev-rs (see #4456). ## Proposed Changes - Update `mev-rs` to the latest version, which allows us to remove hacky `[patch]` sections - Update the `axum` version used in `watch` so LH only uses a single version	2023-07-17 00:14:15 +00:00
Michael Sproul	6c375205fb	Fix HTTP state API bug and add `--epochs-per-migration` (#4236 ) ## Issue Addressed Fix an issue observed by `@zlan` on Discord where Lighthouse would sometimes return this error when looking up states via the API: > {"code":500,"message":"UNHANDLED_ERROR: ForkChoiceError(MissingProtoArrayBlock(0xc9cf1495421b6ef3215d82253b388d77321176a1dcef0db0e71a0cd0ffc8cdb7))","stacktraces":[]} ## Proposed Changes The error stems from a faulty assumption in the HTTP API logic: that any state in the hot database must have its block in fork choice. This isn't true because the state's hot database may update much less frequently than the fork choice store, e.g. if reconstructing states (where freezer migration pauses), or if the freezer migration runs slowly. There could also be a race between loading the hot state and checking fork choice, e.g. even if the finalization migration of DB+fork choice were atomic, the update could happen between the 1st and 2nd calls. To address this I've changed the HTTP API logic to use the finalized block's execution status as a fallback where it is safe to do so. In the case where a block is non-canonical and prior to finalization (permanently orphaned) we default `execution_optimistic` to `true`. ## Additional Info I've also added a new CLI flag to reduce the frequency of the finalization migration as this is useful for several purposes: - Spacing out database writes (less frequent, larger batches) - Keeping a limited chain history with high availability, e.g. the last month in the hot database. This new flag made it _substantially_ easier to test this change. It was extracted from `tree-states` (where it's called `--db-migration-period`), which is why this PR also carries the `tree-states` label.	2023-07-17 00:14:12 +00:00
Jack McPherson	62c9170755	Remove hidden re-exports to appease Rust 1.73 (#4495 ) ## Issue Addressed #4494 ## Proposed Changes - Remove explicit re-exports of various types to appease the new compiler lint ## Additional Info It seems `warn(hidden_glob_reexports)` is the main culprit.	2023-07-12 07:06:00 +00:00
Paul Hauner	c25825a539	Move the `BeaconProcessor` into a new crate (#4435 ) Replaces #4434. It is identical, but this PR has a smaller diff due to a curated commit history. ## Issue Addressed NA ## Proposed Changes This PR moves the scheduling logic for the `BeaconProcessor` into a new crate in `beacon_node/beacon_processor`. Previously it existed in the `beacon_node/network` crate. This addresses a circular-dependency problem where it's not possible to use the `BeaconProcessor` from the `beacon_chain` crate. The `network` crate depends on the `beacon_chain` crate (`network -> beacon_chain`), but importing the `BeaconProcessor` into the `beacon_chain` crate would create a circular dependancy of `beacon_chain -> network`. The `BeaconProcessor` was designed to provide queuing and prioritized scheduling for messages from the network. It has proven to be quite valuable and I believe we'd make Lighthouse more stable and effective by using it elsewhere. In particular, I think we should use the `BeaconProcessor` for: 1. HTTP API requests. 1. Scheduled tasks in the `BeaconChain` (e.g., state advance). Using the `BeaconProcessor` for these tasks would help prevent the BN from becoming overwhelmed and would also help it to prioritize operations (e.g., choosing to process blocks from gossip before responding to low-priority HTTP API requests). ## Additional Info This PR is intended to have zero impact on runtime behaviour. It aims to simply separate the scheduling code (i.e., the `BeaconProcessor`) from the business logic in the `network` crate (i.e., the `Worker` impls). Future PRs (see #4462) can build upon these works to actually use the `BeaconProcessor` for more operations. I've gone to some effort to use `git mv` to make the diff look more like "file was moved and modified" rather than "file was deleted and a new one added". This should reduce review burden and help maintain commit attribution.	2023-07-10 07:45:54 +00:00
Michael Sproul	ea2420d193	Bump default checkpoint sync timeout to 3 minutes (#4466 ) ## Issue Addressed [Users on Twitter](https://twitter.com/ashekhirin/status/1676334843192397824) are getting checkpoint sync URL timeouts with the default of 60s, so this PR increases the default timeout to 3 minutes. I've also added a short section to the book about adjusting the timeout with `--checkpoint-sync-url-timeout`.	2023-07-08 13:16:06 +00:00
Jack McPherson	a6d5c7d7e0	Correct checks for backfill completeness (#4465 ) ## Issue Addressed #4331 ## Proposed Changes - Use comparison rather than strict equality between the earliest epoch we know about and the backfill target (which will be the most recent WSP by default or genesis) - Add helper function `BackFillSync<T>::would_complete` to achieve this in one location ## Additional Info - There's an ad hoc test for this in #4461 Co-authored-by: Age Manning <Age@AgeManning.com>	2023-07-06 07:35:31 +00:00
Paul Hauner	dfcb3363c7	Release v4.3.0 (#4452 ) ## Issue Addressed NA ## Proposed Changes Bump versions ## Additional Info NA	2023-07-04 13:29:55 +00:00
Jimmy Chen	46be05f728	Cache target attester balances for unrealized FFG progression calculation (#4362 ) ## Issue Addressed #4118 ## Proposed Changes This PR introduces a "progressive balances" cache on the `BeaconState`, which keeps track of the accumulated target attestation balance for the current & previous epochs. The cached values are utilised by fork choice to calculate unrealized justification and finalization (instead of converting epoch participation arrays to balances for each block we receive). This optimization will be rolled out gradually to allow for more testing. A new `--progressive-balances disabled\|checked\|strict\|fast` flag is introduced to support this: - `checked`: enabled with checks against participation cache, and falls back to the existing epoch processing calculation if there is a total target attester balance mismatch. There is no performance gain from this as the participation cache still needs to be computed. This is the default mode for now. - `strict`: enabled with checks against participation cache, returns error if there is a mismatch. Used for testing only. - `fast`: enabled with no comparative checks and without computing the participation cache. This mode gives us the performance gains from the optimization. This is still experimental and not currently recommended for production usage, but will become the default mode in a future release. - `disabled`: disable the usage of progressive cache, and use the existing method for FFG progression calculation. This mode may be useful if we find a bug and want to stop the frequent error logs. ### Tasks - [x] Initial cache implementation in `BeaconState` - [x] Perform checks in fork choice to compare the progressive balances cache against results from `ParticipationCache` - [x] Add CLI flag, and disable the optimization by default - [x] Testing on Goerli & Benchmarking - [x] Move caching logic from state processing to the `ProgressiveBalancesCache` (see [this comment](https://github.com/sigp/lighthouse/pull/4362#discussion_r1230877001)) - [x] Add attesting balance metrics Co-authored-by: Jimmy Chen <jimmy@sigmaprime.io>	2023-06-30 01:13:06 +00:00
Eitan Seri-Levi	826e090f50	Update node health endpoint (#4310 ) ## Issue Addressed [#4292](https://github.com/sigp/lighthouse/issues/4292) ## Proposed Changes Updated the node health endpoint will return a 200 status code if `!syncing && !el_offline && !optimistic` wil return a 206 if `(syncing \|\| optimistic) && !el_offline` will return a 503 if `el_offline` ## Additional Info	2023-06-30 01:13:04 +00:00
Eitan Seri-Levi	edd093293a	added debounce to log (#4269 ) ## Issue Addressed [#4259](https://github.com/sigp/lighthouse/issues/4259) ## Proposed Changes debounce spammy `Unable to send message to the beacon processor` log messages ## Additional Info We could potentially debounce other logs that have the potential to be "spammy". After some feedback we decided to additionally add the following change: create a newtype wrapper around `mpsc::Sender<BeaconWorkEvent<T>>`. When there is an error on the try_send method on the wrapper, we increase a counter metric with one label per work type.	2023-06-30 01:13:03 +00:00
Jack McPherson	1aff082eea	Add broadcast validation routes to Beacon Node HTTP API (#4316 ) ## Issue Addressed - #4293 - #4264 ## Proposed Changes Changes largely follow those suggested in the main issue. - Add new routes to HTTP API - `post_beacon_blocks_v2` - `post_blinded_beacon_blocks_v2` - Add new routes to `BeaconNodeHttpClient` - `post_beacon_blocks_v2` - `post_blinded_beacon_blocks_v2` - Define new Eth2 common types - `BroadcastValidation`, enum representing the level of validation to apply to blocks prior to broadcast - `BroadcastValidationQuery`, the corresponding HTTP query string type for the above type - ~~Define `_checked` variants of both `publish_block` and `publish_blinded_block` that enforce a validation level at a type level~~ - Add interactive tests to the `bn_http_api_tests` test target covering each validation level (to their own test module, `broadcast_validation_tests`) - `beacon/blocks` - `broadcast_validation=gossip` - Invalid (400) - Full Pass (200) - Partial Pass (202) - `broadcast_validation=consensus` - Invalid (400) - Only gossip (400) - Only consensus pass (i.e., equivocates) (200) - Full pass (200) - `broadcast_validation=consensus_and_equivocation` - Invalid (400) - Invalid due to early equivocation (400) - Only gossip (400) - Only consensus (400) - Pass (200) - `beacon/blinded_blocks` - `broadcast_validation=gossip` - Invalid (400) - Full Pass (200) - Partial Pass (202) - `broadcast_validation=consensus` - Invalid (400) - Only gossip (400) - ~~Only consensus pass (i.e., equivocates) (200)~~ - Full pass (200) - `broadcast_validation=consensus_and_equivocation` - Invalid (400) - Invalid due to early equivocation (400) - Only gossip (400) - Only consensus (400) - Pass (200) - Add a new trait, `IntoGossipVerifiedBlock`, which allows type-level guarantees to be made as to gossip validity - Modify the structure of the `ObservedBlockProducers` cache from a `(slot, validator_index)` mapping to a `((slot, validator_index), block_root)` mapping - Modify `ObservedBlockProducers::proposer_has_been_observed` to return a `SeenBlock` rather than a boolean on success - Punish gossip peer (low) for submitting equivocating blocks - Rename `BlockError::SlashablePublish` to `BlockError::SlashableProposal` ## Additional Info This PR contains changes that directly modify how blocks are verified within the client. For more context, consult [comments in-thread](https://github.com/sigp/lighthouse/pull/4316#discussion_r1234724202). Co-authored-by: Michael Sproul <michael@sigmaprime.io>	2023-06-29 12:02:38 +00:00
int88	23b06aa51e	avoid relocking head during builder health check (#4323 ) ## Issue Addressed #4314 ## Proposed Changes avoid relocking head during builder health check ## Additional Info NA	2023-06-29 09:39:15 +00:00
Paul Hauner	9072acbfa6	Tidy formatting of `Reqwest` errors (#4336 ) ## Issue Addressed NA ## Proposed Changes Implements the `PrettyReqwestError` to wrap a `reqwest::Error` and give nicer `Debug` formatting. It also wraps the `Url` component in a `SensitiveUrl` to avoid leaking sensitive info in logs. ### Before ``` Reqwest(reqwest::Error { kind: Request, url: Url { scheme: "http", cannot_be_a_base: false, username: "", password: None, host: Some(Domain("localhost")), port: Some(9999), path: "/eth/v1/node/version", query: None, fragment: None }, source: hyper::Error(Connect, ConnectError("tcp connect error", Os { code: 61, kind: ConnectionRefused, message: "Connection refused" })) }) ``` ### After ``` HttpClient(url: http://localhost:9999/, kind: request, detail: error trying to connect: tcp connect error: Connection refused (os error 61)) ``` ## Additional Info I've also renamed the `Reqwest` error enum variants to `HttpClient`, to give people a better chance at knowing what's going on. Reqwest is pretty odd and looks like a typo. I've implemented it in the `eth2` and `execution_layer` crates. This should affect most logs in the VC and EE-related ones in the BN. I think the last crate that could benefit from the is the `beacon_node/eth1` crate. I haven't updated it in this PR since its error type is not so amenable to it (everything goes into a `String`). I don't have a whole lot of time to jig around with that at the moment and I feel that this PR as it stands is a significant enough improvement to merge on its own. Leaving it as-is is fine for the time being and we can always come back for it later (or implement in-protocol deposits!).	2023-06-27 01:06:50 +00:00
Pawan Dhananjay	448d3ec9b3	Aggregate subsets (#3493 ) ## Issue Addressed Resolves #3238 ## Proposed Changes Please list or describe the changes introduced by this PR. ## Additional Info Please provide any additional information. For example, future considerations or information useful for reviewers.	2023-06-27 01:06:49 +00:00
Mac L	c76afc6630	Remove legacy `max-skip-slots` checks (#4403 ) ## Proposed Changes Remove `max-skip-slots` checks when processing blocks. This was legacy code which was previously used in the Medalla testnet to sync to the correct fork. With the addition of checkpoint sync which allows us to sync to any arbitrary fork, this is no longer a necessary feature, so it has been removed for simplicity. ## Additional Notes The CLI flag and checks for attestation processing have been retained as it still may have uses in DoS protection.	2023-06-20 05:20:36 +00:00
Age Manning	6621e1d0c5	Improve ENR logic for ipv6 (#4395 ) Currently, the ENR of the node may not be correctly updated when specifying ipv6 fields through the CLI if an ENR exists on disk. This remedies a bug where we were not checking for ipv6 fields when comparing whether to use an on-disk ENR or updating based on CLI configuration parameters.	2023-06-19 23:53:25 +00:00
Michael Sproul	affea585f4	Remove `CountUnrealized` (#4357 ) ## Issue Addressed Closes #4332 ## Proposed Changes Remove the `CountUnrealized` type, defaulting unrealized justification to _on_. This fixes the #4332 issue by ensuring that importing the same block to fork choice always results in the same outcome. Finalized sync speed may be slightly impacted by this change, but that is deemed an acceptable trade-off until the optimisation from #4118 is implemented. TODO: - [x] Also check that the block isn't a duplicate before importing	2023-06-16 06:44:31 +00:00
Pawan Dhananjay	0ecca1dcb0	Rework internal rpc protocol handling (#4290 ) ## Issue Addressed Resolves #3980. Builds on work by @GeemoCandama in #4084 ## Proposed Changes Extends the `SupportedProtocol` abstraction added in Geemo's PR and attempts to fix internal versioning of requests that are mentioned in this comment https://github.com/sigp/lighthouse/pull/4084#issuecomment-1496380033 Co-authored-by: geemo <geemo@tutanota.com>	2023-06-14 05:08:50 +00:00
Divma	2639e67e90	Update discv5 to expand ipv6 support (#4319 ) Done in different PRs so that they can reviewed independently, as it's likely this won't be merged before I leave Includes resolution for #4080 - [ ] #4299 - [ ] #4318 - [ ] #4320 Co-authored-by: Diva M <divma@protonmail.com> Co-authored-by: Age Manning <Age@AgeManning.com>	2023-06-13 01:25:05 +00:00
Gua00va	62a2413ade	Enable slasher broadcast by default (#4368 ) ## Issue Addressed This PR addresses issue https://github.com/sigp/lighthouse/issues/4350 ## Proposed Changes This change will enable slasher broadcast in the following cases: No flag is passed, `--slasher-broadcast` is passed and, `--slasher-broadcast=true` is passed. Only when an explicit false value is passed the slasher does not broadcast.(`--slasher-broadcast=false`). ## Additional Info TODO - [x] Modify CLI parsing logic - [x] Write test Refer to #4353 Co-authored-by: Rahul Dogra <rahulcooldogra@gmail.com> Co-authored-by: Gua00va <105484243+Gua00va@users.noreply.github.com>	2023-06-08 13:47:56 +00:00
Michael Sproul	299cfe1fe6	Switch default slasher backend to LMDB (#4360 ) ## Issue Addressed Closes #4354 Closes #3987 Replaces #4305, #4283 ## Proposed Changes This switches the default slasher backend _back_ to LMDB. If an MDBX database exists and the MDBX backend is enabled then MDBX will continue to be used. Our release binaries and Docker images will continue to include MDBX for as long as it is practical, so users of these should not notice any difference. The main benefit is to users compiling from source and devs running tests. These users no longer have to struggle to compile MDBX and deal with the compatibility issues that arises. Similarly, devs don't need to worry about toggling feature flags in tests or risk forgetting to run the slasher tests due to backend issues.	2023-06-07 01:50:33 +00:00
Peter	b14d1493cc	Always log the value of relay and local blocks for comparison (#4352 ) ## Issue Addressed N/A ## Proposed Changes This change will log the value of the relay block and the local block when the relay block is more profitable. ## Additional Info This change will help validators understand the block selection (as it looks like the execution reward sometimes is higher that the MEV-reward). The rationale for this change is to aid operators to better understand why a relay-block was chosen over a local block. Looking at produced blocks (at beaconcha.in for example) it sometimes looks like the builder is making a profit just from the execution reward vs the MEV-reward, and creates the nagging question: "Could i have built this block and made that extra profit?"... The answer is probably "No, not without the extra transactions included by the relay", but by logging the value of the local block-candidate, this will no longer be an issue.. ### Example (Mainnet) https://beaconcha.in/block/17370329 MEV Block Reward: 0.17122 Ether to 0xE35bBaFa0266089f95d745d348b468622805D82B Execution Reward: 0.17528 Ether to 0x1f9090aaE28b8a3dCeaDf281B0F12828e676c326 Difference: 0.00406 Ether ### Examples (Goerli) https://goerli.beaconcha.in/block/9040065 MEV Block Reward: 0.56423 Ether to 0xF5794543CF6055Ae710E9c8E99E31343Cea004a8 Execution Reward: 0.56488 Ether to 0xfC0157aA4F5DB7177830ACddB3D5a9BB5BE9cc5e Difference: 0.00065 Ether https://goerli.beaconcha.in/block/9019921 MEV Block Reward: 1.39440 Ether to 0xF5794543CF6055Ae710E9c8E99E31343Cea004a8 Execution Reward: 1.39469 Ether to 0xfC0157aA4F5DB7177830ACddB3D5a9BB5BE9cc5e Difference: 0.00029 Ether https://goerli.beaconcha.in/block/9015583 MEV Block Reward: 1.04356 Ether to 0xF5794543CF6055Ae710E9c8E99E31343Cea004a8 Execution Reward: 1.04896 Ether to 0xfC0157aA4F5DB7177830ACddB3D5a9BB5BE9cc5e Difference: 0.0054 Ether	2023-06-07 01:50:31 +00:00
Paul Hauner	d07c78bccf	Appease clippy in Rust 1.70 (#4365 ) ## Issue Addressed NA ## Proposed Changes Fixes some new clippy lints raised after updating to Rust 1.70. ## Additional Info NA	2023-06-02 03:17:40 +00:00
Pawan Dhananjay	d399961e6e	Add an option to disable inbound rate limiter (#4327 ) ## Issue Addressed On deneb devnetv5, lighthouse keeps rate limiting peers which makes it harder to bootstrap new nodes as there are very few peers in the network. This PR adds an option to disable the inbound rate limiter for testnets. Added an option to configure inbound rate limits as well. Co-authored-by: Diva M <divma@protonmail.com>	2023-06-02 03:17:38 +00:00
Michael Sproul	04386cfabb	Expose execution block hash calculation (#4326 ) ## Proposed Changes This is a light refactor of the execution layer's block hash calculation logic making it easier to use externally. e.g. in `eleel` (https://github.com/sigp/eleel/pull/18). A static method is preferable to a method because the calculation doesn't actually need any data from `self`, and callers may want to compute block hashes without constructing an `ExecutionLayer` (`eleel` only constructs a simpler `Engine` struct).	2023-06-02 03:17:37 +00:00
Age Manning	fdea8f2b27	Shift subnet backbone structure (attnets revamp) (#4304 ) This PR address the following spec change: https://github.com/ethereum/consensus-specs/pull/3312 Instead of subscribing to a long-lived subnet for every attached validator to a beacon node, all beacon nodes will subscribe to `SUBNETS_PER_NODE` long-lived subnets. This is currently set to 2 for mainnet. This PR does not include any scoring or advanced discovery mechanisms. A future PR will improve discovery and we can implement scoring after the next hard fork when we expect all client teams and all implementations to respect this spec change. This will be a significant change in the subnet network structure for consensus clients and we will likely have to monitor and tweak our peer management logic.	2023-05-30 06:15:56 +00:00
Michael Sproul	baad729fa7	Fix Rust 1.71.0 warnings (#4348 ) ## Issue Addressed The Rust 1.70 release is imminent, so CI is using 1.71 for the Beta compiler, which is failing with a warning.	2023-05-30 01:38:51 +00:00
Eitan Seri-Levi	2a7e54d8bd	swap unnecessary write lock to read lock in block_verification (#4340 ) ## Issue Addressed [#4334](https://github.com/sigp/lighthouse/issues/4334) ## Proposed Changes swap unnecessary write lock to read lock ## Additional Info N/A Co-authored-by: Michael Sproul <micsproul@gmail.com>	2023-05-30 01:38:49 +00:00
Eitan Seri-Levi	744b1950e5	Keep payload cache idempotent (#4256 ) ## Issue Addressed [#4239](https://github.com/sigp/lighthouse/issues/4239) ## Proposed Changes keep the payload cache entry intact after fetching it ## Additional Info	2023-05-30 01:38:45 +00:00
Paul Hauner	c547a11b0d	v4.2.0 (#4309 ) ## Issue Addressed NA ## Proposed Changes Bump versions ## Additional Info NA	2023-05-23 00:17:10 +00:00
Age Manning	aa1ed787e9	Logging via the HTTP API (#4074 ) This PR adds the ability to read the Lighthouse logs from the HTTP API for both the BN and the VC. This is done in such a way to as minimize any kind of performance hit by adding this feature. The current design creates a tokio broadcast channel and mixes is into a form of slog drain that combines with our main global logger drain, only if the http api is enabled. The drain gets the logs, checks the log level and drops them if they are below INFO. If they are INFO or higher, it sends them via a broadcast channel only if there are users subscribed to the HTTP API channel. If not, it drops the logs. If there are more than one subscriber, the channel clones the log records and converts them to json in their independent HTTP API tasks. Co-authored-by: Michael Sproul <micsproul@gmail.com>	2023-05-22 05:57:08 +00:00
Paul Hauner	01ae37ac37	Add more metrics for tracking sync messages (#4308 ) ## Issue Addressed NA ## Proposed Changes Adds metrics to track validators that are submitting equivocating (but not slashable) sync messages. This follows on from some research we've been doing in a separate fork of LH. ## Additional Info @jimmygchen and @michaelsproul have already run their eyes over this so it should be easy to get into v4.2.0, IMO.	2023-05-19 05:13:07 +00:00
Jimmy Chen	75aea7054c	Enshrine head state shuffling in the `shuffling_cache` (#4296 ) ## Issue Addressed #4281 ## Proposed Changes - Change `ShufflingCache` implementation from using `LruCache` to a custom cache that removes entry with lowest epoch instead of oldest insertion time. - Protect the "enshrined" head shufflings when inserting new committee cache entries. The shuffling ids matching the head's previous, current, and future epochs will never be ejected from the cache during `Self::insert_cache_item`. ## Additional Info There is a bonus point on shuffling preferences in the issue description that hasn't been implemented yet, as I haven't figured out a good way to do this: > However I'm not convinced since there are some complexities around tie-breaking when two entries have the same epoch. Perhaps preferring entries in the canonical chain is best? We should be able to check if a block is on the canonical chain by: ```rust canonical_head .fork_choice_read_lock() .contains_block(root) ``` However we need to interleave the shuffling and fork choice locks, which may cause deadlocks if we're not careful (mentioned by @paulhauner). Alternatively, we could use the `state.block_roots` field of the `chain.canonical_head.snapshot.beacon_state`, which avoids deadlock but requires more work. I'd like to get some feedback on review & testing before I dig deeper into the preferences stuff, as having the canonical head preference may already be quite useful in preventing the issue raised. Co-authored-by: Jimmy Chen <jimmy@sigmaprime.io>	2023-05-19 05:13:05 +00:00
Michael Sproul	3052db29fe	Implement `el_offline` and use it in the VC (#4295 ) ## Issue Addressed Closes https://github.com/sigp/lighthouse/issues/4291, part of #3613. ## Proposed Changes - Implement the `el_offline` field on `/eth/v1/node/syncing`. We set `el_offline=true` if: - The EL's internal status is `Offline` or `AuthFailed`, _or_ - The most recent call to `newPayload` resulted in an error (more on this in a moment). - Use the `el_offline` field in the VC to mark nodes with offline ELs as _unsynced_. These nodes will still be used, but only after synced nodes. - Overhaul the usage of `RequireSynced` so that `::No` is used almost everywhere. The `--allow-unsynced` flag was broken and had the opposite effect to intended, so it has been deprecated. - Add tests for the EL being offline on the upcheck call, and being offline due to the newPayload check. ## Why track `newPayload` errors? Tracking the EL's online/offline status is too coarse-grained to be useful in practice, because: - If the EL is timing out to some calls, it's unlikely to timeout on the `upcheck` call, which is _just_ `eth_syncing`. Every failed call is followed by an upcheck [here](`693886b941/beacon_node/execution_layer/src/engines.rs (L372-L380)`), which would have the effect of masking the failure and keeping the status _online_. - The `newPayload` call is the most likely to time out. It's the call in which ELs tend to do most of their work (often 1-2 seconds), with `forkchoiceUpdated` usually returning much faster (<50ms). - If `newPayload` is failing consistently (e.g. timing out) then this is a good indication that either the node's EL is in trouble, or the network as a whole is. In the first case validator clients _should_ prefer other BNs if they have one available. In the second case, all of their BNs will likely report `el_offline` and they'll just have to proceed with trying to use them. ## Additional Changes - Add utility method `ForkName::latest` which is quite convenient for test writing, but probably other things too. - Delete some stale comments from when we used to support multiple execution nodes.	2023-05-17 05:51:56 +00:00

1 2 3 4 5 ...

2296 Commits