lighthouse

Author	SHA1	Message	Date
Michael Sproul	5a35278aea	Add more checks and logging before genesis (#4730 ) ## Proposed Changes This PR adds more logging prior to genesis, particularly on networks that start with execution enabled. There are new checks using `eth_getBlockByHash/Number` to verify that the genesis state's `latest_execution_payload_header` matches the execution node's genesis block. The first commit also runs the merge-readiness/Capella-readiness checks prior to genesis. This has two effects: - Give more information on the execution node's status and its readiness for genesis. - Prevent the `el_offline` status from being set on `/eth/v1/node/syncing`, which previously caused the VC to complain loudly. I would like to include this for the Holesky reboot. It would have caught the misconfig that doomed the first Holesky. ## Additional Info - Geth doesn't serve payload bodies prior to genesis, which is why we use the legacy methods. I haven't checked with other ELs yet. - Currently this is logging errors with _Capella_ genesis states generated by `ethereum-genesis-generator` because the `withdrawals_root` is not set correctly (it is 0x0). This is not a blocker for Holesky, as it starts from Bellatrix (Pari is investigating).	2023-09-21 00:26:53 +00:00
Jimmy Chen	1e9925435e	Reuse fork choice read lock instead of re-acquiring it immediately (#4688 ) ## Issue Addressed I went through the code base and looked for places where we acquire fork choice locks (after the deadlock bug was found and fixed in #4687), and discovered an instance where we re-acquire a lock immediately after dropping it. This shouldn't cause a deadlock like the other issue, but is slightly less efficient.	2023-09-21 00:26:52 +00:00
Michael Sproul	4b6cb3db2c	Prevent port re-use in HTTP API tests (#4745 ) ## Issue Addressed CI is plagued by `AddrAlreadyInUse` failures, which are caused by race conditions in allocating free ports. This PR removes all usages of the `unused_port` crate for Lighthouse's HTTP API, in favour of passing `:0` as the listen address. As a result, the listen address isn't known ahead of time and must be read from the listening socket after it binds. This requires tying some self-referential knots, which is a little disruptive, but hopefully doesn't clash too much with Deneb 🤞 There are still a few usages of `unused_tcp4_port` left in cases where we start external processes, like the `watch` Postgres DB, Anvil, Geth, Nethermind, etc. Removing these usages is non-trivial because it's hard to read the port back from an external process after starting it with `--port 0`. We might be able to do something on Linux where we read from `/proc/`, but I'll leave that for future work.	2023-09-20 01:19:03 +00:00
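The port-0 approach described in #4745 can be sketched with the standard library alone. This is a minimal illustration, not the Lighthouse HTTP API code: `bind_ephemeral` is a hypothetical helper name, and a plain `TcpListener` stands in for the real server.

```rust
use std::net::{SocketAddr, TcpListener};

/// Hypothetical helper: bind to port 0 so the OS picks a free port, then read
/// the real address back from the bound socket.
fn bind_ephemeral() -> std::io::Result<(TcpListener, SocketAddr)> {
    let listener = TcpListener::bind("127.0.0.1:0")?;
    // The port is only known after the bind succeeds.
    let addr = listener.local_addr()?;
    Ok((listener, addr))
}

fn main() -> std::io::Result<()> {
    let (_listener, addr) = bind_ephemeral()?;
    println!("listening on {}", addr);
    Ok(())
}
```

Because the OS allocates the port at bind time, there is no window between "find a free port" and "bind to it" in which another test can grab the same port.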
Age Manning	e4ed317b76	Add Experimental QUIC support (#4577 ) ## Issue Addressed #4402 ## Proposed Changes This PR adds QUIC support to Lighthouse. As this is not officially spec'd, it will only work for lighthouse <-> lighthouse connections. We attempt a QUIC connection (if the node advertises it) and if it fails we fall back to TCP. This should be a backwards-compatible modification. We want to test this functionality on live networks to observe any improvements in bandwidth/latency. NOTE: This also removes the websockets transport as I believe no one is really using it. It should be mentioned in our release notes, however. Co-authored-by: João Oliveira <hello@jxs.pt>	2023-09-15 03:07:24 +00:00
Jack McPherson	35f47f454f	Await listening address from libp2p in RPC tests setup (#4705 ) ## Issue Addressed #4704 ## Proposed Changes - Receive multiaddr from libp2p by awaiting listener setup ## Additional Info See also: #4675	2023-09-11 06:14:56 +00:00
Michael Sproul	2841f60686	Release v4.4.1 (#4690 ) ## Proposed Changes New release to replace the cancelled v4.4.0 release. This release includes the bugfix #4687 which avoids a deadlock that was present in v4.4.0. ## Additional Info Pending testing over the weekend, this will be merged on Monday September 4th.	2023-09-04 02:56:52 +00:00
Michael Sproul	74eb267643	Remove double-locking deadlock from HTTP API (#4687 ) ## Issue Addressed Fix a deadlock introduced in #4236 which was caught during the v4.4.0 release testing cycle (with thanks to @paulhauner and `gdb`). ## Proposed Changes Avoid re-locking the fork choice read lock when querying a state by root in the HTTP API. This avoids a deadlock due to the lock already being held. ## Additional Info The [RwLock docs](https://docs.rs/lock_api/latest/lock_api/struct.RwLock.html#method.read) explicitly advise against re-locking: > Note that attempts to recursively acquire a read lock on a RwLock when the current thread already holds one may result in a deadlock.	2023-08-31 11:18:00 +00:00
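The fix pattern in #4687 (lock once, then pass the already-locked data down instead of locking again) can be sketched as follows. This is an illustration under stated assumptions: `ForkChoice`, `state_by_root` and `query` are hypothetical stand-ins, and `std::sync::RwLock` is used here rather than the `lock_api` lock referenced in the commit message.

```rust
use std::collections::HashMap;
use std::sync::RwLock;

/// Hypothetical stand-in for data guarded by the fork choice lock.
struct ForkChoice {
    states: HashMap<u64, String>,
}

/// Takes a borrowed view of already-locked data, so it never locks itself.
fn state_by_root(fc: &ForkChoice, root: u64) -> Option<String> {
    fc.states.get(&root).cloned()
}

/// Acquires the read lock exactly once and passes the guard's contents down.
fn query(lock: &RwLock<ForkChoice>, root: u64) -> Option<String> {
    let guard = lock.read().expect("lock poisoned");
    // Calling `lock.read()` again here, while `guard` is alive, is the
    // recursive acquisition that the RwLock docs warn may deadlock.
    state_by_root(&guard, root)
}

fn main() {
    let mut states = HashMap::new();
    states.insert(1, "state-1".to_string());
    let lock = RwLock::new(ForkChoice { states });
    println!("{:?}", query(&lock, 1));
}
```

Making the inner function take `&ForkChoice` rather than the lock itself means a second acquisition cannot be introduced by accident.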
Paul Hauner	e99ba3a14e	Release v4.4.0 (#4673 ) ## Issue Addressed NA ## Proposed Changes Bump versions from `v4.3.0` to `v4.4.0`. ## Additional Info NA	2023-08-31 02:12:35 +00:00
Michael Sproul	f284e0e264	Fix bug in block root storage (#4663 ) ## Issue Addressed Fix a bug in the storage of the linear block roots array in the freezer DB. Previously this array was always written as part of state storage (or block backfill). With state pruning enabled by #4610, these states were no longer being written and as a result neither were the block roots. The impact is quite low, we would just log an error when trying to forwards-iterate the block roots, which for validating nodes only happens when they try to look up blocks for peers: > Aug 25 03:42:36.980 ERRO Missing chunk in forwards iterator chunk index: 49726, service: freezer_db Any node checkpoint synced off `unstable` is affected and has a corrupt database. If you see the log above, you need to re-sync with the fix. Nodes that haven't checkpoint synced recently should _not_ be corrupted, even if they ran the buggy version. ## Proposed Changes - Use a `ChunkWriter` to write the block roots when states are not being stored. - Tweak the usage of `get_latest_restore_point` so that it doesn't return a nonsense value when state pruning is enabled. - Tweak the guarantee on the block roots array so that block roots are assumed available up to the split slot (exclusive). This is a bit nicer than relying on anything to do with the latest restore point, which is a nonsensical concept when there aren't any restore points. ## Additional Info I'm looking forward to deleting the chunked vector code for good when we merge tree-states 😁	2023-08-28 05:34:28 +00:00
Paul Hauner	d61f507184	Add Holesky (#4653 ) ## Issue Addressed NA ## Proposed Changes Add the Holesky network config as per `36e4ff2d51/custom_config_data`. Since the genesis state is ~190MB, I've opted to not include it in the binary and instead download it at runtime (see #4564 for context). To download this file we have: - A hard-coded URL for a SigP-hosted S3 bucket with the Holesky genesis state. Assuming this download works correctly, users will be none the wiser that the state wasn't included in the binary (apart from some additional logs) - If the user provides a `--checkpoint-sync-url` flag, then LH will download the genesis state from that server rather than our S3 bucket. - If the user provides a `--genesis-state-url` flag, then LH will download the genesis state from that server regardless of the S3 bucket or `--checkpoint-sync-url` flag. - Whenever a genesis state is downloaded it is checked against a checksum baked into the binary. - A genesis state will never be downloaded if it's already included in the binary. - There is a `--genesis-state-url-timeout` flag to tweak the timeout for downloading the genesis state file. 
## Log Output Example of log output when a state is downloaded: ```bash Aug 23 05:40:13.424 INFO Logging to file path: "/Users/paul/.lighthouse/holesky/beacon/logs/beacon.log" Aug 23 05:40:13.425 INFO Lighthouse started version: Lighthouse/v4.3.0-bd9931f+ Aug 23 05:40:13.425 INFO Configured for network name: holesky Aug 23 05:40:13.426 INFO Data directory initialised datadir: /Users/paul/.lighthouse/holesky Aug 23 05:40:13.427 INFO Deposit contract address: 0x4242424242424242424242424242424242424242, deploy_block: 0 Aug 23 05:40:13.427 INFO Downloading genesis state info: this may take some time on testnets with large validator counts, timeout: 60s, server: https://sigp-public-genesis-states.s3.ap-southeast-2.amazonaws.com/ Aug 23 05:40:29.895 INFO Starting from known genesis state service: beacon ``` Example of log output when there are no URLs specified: ``` Aug 23 06:29:51.645 INFO Logging to file path: "/Users/paul/.lighthouse/goerli/beacon/logs/beacon.log" Aug 23 06:29:51.646 INFO Lighthouse started version: Lighthouse/v4.3.0-666a39c+ Aug 23 06:29:51.646 INFO Configured for network name: goerli Aug 23 06:29:51.647 INFO Data directory initialised datadir: /Users/paul/.lighthouse/goerli Aug 23 06:29:51.647 INFO Deposit contract address: 0xff50ed3d0ec03ac01d4c79aad74928bff48a7b2b, deploy_block: 4367322 The genesis state is not present in the binary and there are no known download URLs. Please use --checkpoint-sync-url or --genesis-state-url. ``` ## Additional Info I tested the `--genesis-state-url` flag with all 9 Goerli checkpoint sync servers on https://eth-clients.github.io/checkpoint-sync-endpoints/ and they all worked 🎉 My IDE eagerly formatted some `Cargo.toml`. I've disabled it but I don't see the value in spending time reverting the changes that are already there. I also added the `GenesisStateBytes` enum to avoid an unnecessary clone on the genesis state bytes baked into the binary. 
This is not a huge deal on Mainnet, but will become more relevant when testing with big genesis states. When we do a fresh checkpoint sync we're downloading the genesis state to check the `genesis_validators_root` against the finalised state we receive. This is not entirely pointless, since we verify the checksum when we download the genesis state so we are actually guaranteeing that the finalised state is on the same network. There might be a smarter/less-download-y way to go about this, but I've run out of cycles to figure that out. Perhaps we can grab it in the next release?	2023-08-28 05:34:27 +00:00
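The download precedence listed in #4653 can be sketched as a single selection function. All names here (`GenesisSource`, `choose_genesis_source`, the parameter names) are hypothetical and are not taken from the Lighthouse code; the checksum verification that follows a download is not shown.

```rust
/// Hypothetical description of where the genesis state comes from.
#[derive(Debug, PartialEq)]
enum GenesisSource {
    /// The state is included in the binary, so nothing is downloaded.
    Bundled,
    /// The state is downloaded from this server.
    Url(String),
    /// Not bundled and no known download URL.
    Unavailable,
}

/// Precedence as described above: a bundled state always wins, then
/// `--genesis-state-url`, then `--checkpoint-sync-url`, then the hard-coded URL.
fn choose_genesis_source(
    bundled: bool,
    genesis_state_url: Option<&str>,
    checkpoint_sync_url: Option<&str>,
    built_in_url: Option<&str>,
) -> GenesisSource {
    if bundled {
        return GenesisSource::Bundled;
    }
    genesis_state_url
        .or(checkpoint_sync_url)
        .or(built_in_url)
        .map(|url| GenesisSource::Url(url.to_string()))
        .unwrap_or(GenesisSource::Unavailable)
}

fn main() {
    let source = choose_genesis_source(false, None, Some("http://localhost:5052"), None);
    println!("{:?}", source);
}
```

The `Unavailable` case corresponds to the second log example, where the user is asked to supply `--checkpoint-sync-url` or `--genesis-state-url`.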
Mac L	55e02e7c3f	Show `--gui` flag in help text (#4660 ) ## Issue Addressed N/A ## Proposed Changes Remove the `hidden(true)` modifier on the `--gui` flag so it shows up when running `lighthouse bn --help` ## Additional Info We need to include this now that Siren has had its first stable release.	2023-08-28 00:55:33 +00:00
Jimmy Chen	9c24cd4ad4	Do not log slot clock error prior to genesis (#4657 ) ## Issue Addressed #4654 ## Proposed Changes Only log an error if we're unable to read the slot clock after genesis. I thought about simply downgrading the `error` to a `warn`, but feel like it's still unnecessary noise before genesis, and it would be good to retain the error log if we're past genesis. But I'd be ok with just downgrading the log level, too.	2023-08-28 00:55:32 +00:00
Michael Sproul	8e95b69a1a	Send success code for duplicate blocks on HTTP (#4655 ) ## Issue Addressed Closes #4473 (take 3) ## Proposed Changes - Send a 202 status code by default for duplicate blocks, instead of 400. This conveys to the caller that the block was published, but makes no guarantees about its validity. Block relays can count this as a success or a failure as they wish. - For users wanting finer-grained control over which status is returned for duplicates, a flag `--http-duplicate-block-status` can be used to adjust the behaviour. A 400 status can be supplied to restore the old (spec-compliant) behaviour, or a 200 status can be used to silence VCs that warn loudly for non-200 codes (e.g. Lighthouse prior to v4.4.0). - Update the Lighthouse VC to gracefully handle success codes other than 200. The info message isn't the nicest thing to read, but it covers all bases and isn't a nasty `ERRO`/`CRIT` that will wake anyone up. ## Additional Info I'm planning to raise a PR to `beacon-APIs` to specify that clients may return 202 for duplicate blocks. Really it would be nice to use some 2xx code that _isn't_ the same as the code for "published but invalid". I think unfortunately there aren't any suitable codes, and maybe the best fit is `409 CONFLICT`. Given that we need to fix this promptly for our release, I think using the 202 code temporarily with configuration strikes a nice compromise.	2023-08-28 00:55:31 +00:00
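The two halves of #4655 (a configurable status for duplicate blocks on the BN side, and a VC that accepts any success code) can be sketched as below. The function names are hypothetical, and the `Option<u16>` parameter is an assumed model of the value supplied via `--http-duplicate-block-status`.

```rust
/// Status returned for a duplicate block: the configured override if one was
/// supplied, otherwise the 202 default described above.
fn duplicate_block_status(configured: Option<u16>) -> u16 {
    configured.unwrap_or(202)
}

/// Client-side check: treat any 2xx status as success rather than only 200.
fn is_success(status: u16) -> bool {
    (200..300).contains(&status)
}

fn main() {
    let status = duplicate_block_status(None);
    println!("status {} success: {}", status, is_success(status));
}
```

With this shape, supplying 400 restores the old behaviour and supplying 200 silences clients that only accept exactly 200.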
João Oliveira	c258270d6a	update dependencies (#4639 ) ## Issue Addressed updates underlying dependencies and removes the ignored `RUSTSEC`'s for `cargo audit`. Also switches `procinfo` to `procfs` on `eth2` to remove the `nom` warning, `procinfo` is unmaintained see [here](https://github.com/danburkert/procinfo-rs/issues/46).	2023-08-28 00:55:28 +00:00
realbigsean	14924dbc95	rust 1.72 lints (#4659 )	2023-08-24 14:33:24 -04:00
Pawan Dhananjay	ea43b6a53c	Revive mplex (#4619 ) ## Issue Addressed N/A ## Proposed Changes In #4431 , we seem to have removed support for mplex as it is being deprecated in libp2p. See https://github.com/libp2p/specs/issues/553 . Related rust-libp2p PR https://github.com/libp2p/rust-libp2p/pull/3920 However, since this isn't part of the official [consensus specs](https://github.com/ethereum/consensus-specs/blob/dev/specs/phase0/p2p-interface.md#multiplexing), we still need to support mplex. > Clients MUST support [mplex](https://github.com/libp2p/specs/tree/master/mplex) and MAY support [yamux](https://github.com/hashicorp/yamux/blob/master/spec.md). This PR adds back mplex support as before.	2023-08-24 05:54:37 +00:00
Eitan Seri-Levi	661779f08e	Implement expected withdrawals endpoint (#4390 ) ## Issue Addressed [#4029](https://github.com/sigp/lighthouse/issues/4029) ## Proposed Changes implement expected_withdrawals HTTP API per the spec https://github.com/ethereum/beacon-APIs/pull/304 ## Additional Info	2023-08-24 05:54:36 +00:00
Michael Sproul	524d9af288	Fix beacon-processor-max-workers (#4636 ) ## Issue Addressed Fixes a bug in the handling of `--beacon-processor-max-workers` which caused it to have no effect. ## Proposed Changes For this PR I channeled @ethDreamer and saw deep into the faulty CLI config -- this bug is almost identical to the one Mark found and fixed in #4622.	2023-08-21 05:02:34 +00:00
Michael Sproul	20067b9465	Remove checkpoint alignment requirements and enable historic state pruning (#4610 ) ## Issue Addressed Closes #3210 Closes #3211 ## Proposed Changes - Checkpoint sync from the latest finalized state regardless of its alignment. - Add the `block_root` to the database's split point. This is _only_ added to the in-memory split in order to avoid a schema migration. See `load_split`. - Add a new method to the DB called `get_advanced_state`, which looks up a state _by block root_, with a `state_root` as fallback. Using this method prevents accidental accesses of the split's unadvanced state, which does not exist in the hot DB and is not guaranteed to exist in the freezer DB at all. Previously Lighthouse would look up this state _from the freezer DB_, even if it was required for block/attestation processing, which was suboptimal. - Replace several state look-ups in block and attestation processing with `get_advanced_state` so that they can't hit the split block's unadvanced state. - Do not store any states in the freezer database by default. All states will be deleted upon being evicted from the hot database unless `--reconstruct-historic-states` is set. The anchor info which was previously used for checkpoint sync is used to implement this, including when syncing from genesis. ## Additional Info Needs further testing. I want to stress-test the pruned database under Hydra. The `get_advanced_state` method is intended to become more relevant over time: `tree-states` includes an identically named method that returns advanced states from its in-memory cache. Co-authored-by: realbigsean <seananderson33@gmail.com>	2023-08-21 05:02:32 +00:00
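The lookup order described for `get_advanced_state` in #4610 (by block root first, with the state root as fallback) can be sketched with two maps. This is only an illustration of the fallback order: the `Store` type, its fields, the `u64` roots and the `String` states are all hypothetical simplifications, not the Lighthouse database types.

```rust
use std::collections::HashMap;

/// Hypothetical store: advanced states keyed by block root, plus a
/// conventional index keyed by state root.
struct Store {
    advanced_by_block_root: HashMap<u64, String>,
    by_state_root: HashMap<u64, String>,
}

impl Store {
    /// Look up by block root first and fall back to the state root.
    fn get_advanced_state(&self, block_root: u64, state_root: u64) -> Option<&String> {
        self.advanced_by_block_root
            .get(&block_root)
            .or_else(|| self.by_state_root.get(&state_root))
    }
}

fn main() {
    let mut store = Store {
        advanced_by_block_root: HashMap::new(),
        by_state_root: HashMap::new(),
    };
    store.advanced_by_block_root.insert(10, "advanced".to_string());
    store.by_state_root.insert(20, "unadvanced".to_string());
    println!("{:?}", store.get_advanced_state(10, 20));
}
```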
ethDreamer	687c58fde0	Fix Prefer Builder Flag (#4622 )	2023-08-18 03:22:27 +00:00
zhiqiangxu	609819bb4d	`attester_duties`: remove unnecessary case (#4614 ) Since `tolerant_current_epoch` is expected to be either `current_epoch` or `current_epoch+1`, we can eliminate a case here. And added a comment about `compute_historic_attester_duties` , since `RelativeEpoch::from_epoch` will only allow `request_epoch == current_epoch-1` when `request_epoch < current_epoch`.	2023-08-17 02:37:30 +00:00
Michael Sproul	7251a93c5e	Don't kill SSE stream if channel fills up (#4500 ) ## Issue Addressed Closes #4245 ## Proposed Changes - If an SSE channel fills up, send a comment instead of terminating the stream. - Add a CLI flag for scaling up the SSE buffer: `--http-sse-capacity-multiplier N`. ## Additional Info ~~Blocked on #4462. I haven't rebased on that PR yet for initial testing, because it still needs some more work to handle long-running HTTP threads.~~ - [x] Add CLI flag tests.	2023-08-17 02:37:29 +00:00
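The behaviour change in #4500 can be sketched as a mapping from receive outcomes to SSE frames. `RecvOutcome`, `to_sse_frame` and `scaled_capacity` are hypothetical names, the comment text is invented for illustration, and the payload is assumed to be a single line. The only fixed facts used are the SSE wire format (a line starting with `:` is a comment that clients ignore) and the existence of the `--http-sse-capacity-multiplier N` flag.

```rust
/// Hypothetical outcomes of reading from a bounded event channel.
enum RecvOutcome {
    /// A normal event with a single-line payload.
    Event(String),
    /// The channel filled up and this many events were skipped.
    Lagged(u64),
    /// The sender is gone, so the stream really is over.
    Closed,
}

/// Map an outcome to an SSE frame. A full channel yields a comment frame
/// instead of ending the stream; `None` means the stream should end.
fn to_sse_frame(outcome: &RecvOutcome) -> Option<String> {
    match outcome {
        RecvOutcome::Event(payload) => Some(format!("data: {}\n\n", payload)),
        RecvOutcome::Lagged(missed) => Some(format!(": skipped {} events\n\n", missed)),
        RecvOutcome::Closed => None,
    }
}

/// Scale an assumed base buffer size by the CLI multiplier.
fn scaled_capacity(base: usize, multiplier: usize) -> usize {
    base.saturating_mul(multiplier)
}

fn main() {
    let outcomes = [
        RecvOutcome::Event("head".to_string()),
        RecvOutcome::Lagged(3),
        RecvOutcome::Closed,
    ];
    for outcome in &outcomes {
        println!("{:?}", to_sse_frame(outcome));
    }
    println!("capacity: {}", scaled_capacity(16, 4));
}
```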
Jimmy Chen	59c24bcd2d	Fix disable backfill flag not working correctly (#4615 ) ## Issue Addressed The feature flag used to control this feature is `disable_backfill` instead of `disable-backfill`. kudos to @michaelsproul for discovering this bug!	2023-08-14 06:08:34 +00:00
Michael Sproul	249f85f1d9	Improve HTTP API error messages + tweaks (#4595 ) ## Issue Addressed Closes #3404 (mostly) ## Proposed Changes - Remove all uses of Warp's `and_then` (which backtracks) in favour of `then` (which doesn't). - Bump the priority of the `POST` method for `v2/blocks` to `P0`. Publishing a block needs to happen quickly. - Run the new SSZ POST endpoints on the beacon processor. I think this was missed in between merging #4462 and #4504/#4479. - Fix a minor issue in the validator registrations endpoint whereby an error from spawning the task on the beacon processor would be dropped. ## Additional Info I've tested this manually and can confirm that we no longer get the dreaded `Unsupported endpoint version` errors for queries like: ``` $ curl -X POST -H "Content-Type: application/json" --data @block.json "http://localhost:5052/eth/v2/beacon/blocks" \| jq { "code": 400, "message": "BAD_REQUEST: WeakSubjectivityConflict", "stacktraces": [] } ``` ``` $ curl -X POST -H "Content-Type: application/octet-stream" --data @block.json "http://localhost:5052/eth/v2/beacon/blocks" \| jq { "code": 400, "message": "BAD_REQUEST: invalid SSZ: OffsetOutOfBounds(572530811)", "stacktraces": [] } ``` ``` $ curl "http://localhost:5052/eth/v2/validator/blocks/7067595" {"code":400,"message":"BAD_REQUEST: invalid query: Invalid query string","stacktraces":[]} ``` However, I can still trigger it by leaving off the `Content-Type`. We can re-test this aspect with #4575.	2023-08-14 04:06:37 +00:00
zhiqiangxu	f1ac12f23a	Fix some typos (#4565 )	2023-08-14 00:29:43 +00:00
Eitan Seri-Levi	1fcada8a32	Improve transport connection errors (#4540 ) ## Issue Addressed #4538 ## Proposed Changes add newtype wrapper around DialError that extracts error messages and logs them in a more readable format ## Additional Info I was able to test Transport Dial Errors in the situation where a libp2p instance attempts to ping a nonexistent peer. That error message should look something like `A transport level error has ocurred: Connection refused (os error 61)`. AgeManning mentioned we should try fetching only the innermost error (in situations where there's a nested error). I took a stab at implementing that. For non-transport DialErrors, I wrote out the error messages explicitly (as per the docs). Could potentially clean things up here if that's not necessary. Co-authored-by: Age Manning <Age@AgeManning.com>	2023-08-10 00:10:09 +00:00
Paul Hauner	b60304b19f	Use `BeaconProcessor` for API requests (#4462 ) ## Issue Addressed NA ## Proposed Changes Rather than spawning new tasks on the tokio executor to process each HTTP API request, send the tasks to the `BeaconProcessor`. This achieves: 1. Places a bound on how many concurrent requests are being served (i.e., how many we are actually trying to compute at one time). 1. Places a bound on how many requests can be awaiting a response at one time (i.e., starts dropping requests when we have too many queued). 1. Allows the BN to prioritise HTTP requests with respect to messages coming from the P2P network (i.e., prioritise importing gossip blocks rather than serving API requests). Presently there are two levels of priorities: - `Priority::P0` - The beacon processor will prioritise these above everything other than importing new blocks. - Roughly all validator-sensitive endpoints. - `Priority::P1` - The beacon processor will prioritise practically all other P2P messages over these, except for historical backfill things. - Everything that's not `Priority::P0` The `--http-enable-beacon-processor false` flag can be supplied to revert back to the old behaviour of spawning new `tokio` tasks for each request: ``` --http-enable-beacon-processor <BOOLEAN> The beacon processor is a scheduler which provides quality-of-service and DoS protection. When set to "true", HTTP API requests will queued and scheduled alongside other tasks. When set to "false", HTTP API responses will be executed immediately. [default: true] ``` ## New CLI Flags I added some other new CLI flags: ``` --beacon-processor-aggregate-batch-size <INTEGER> Specifies the number of gossip aggregate attestations in a signature verification batch. Higher values may reduce CPU usage in a healthy network while lower values may increase CPU usage in an unhealthy or hostile network. 
[default: 64] --beacon-processor-attestation-batch-size <INTEGER> Specifies the number of gossip attestations in a signature verification batch. Higher values may reduce CPU usage in a healthy network whilst lower values may increase CPU usage in an unhealthy or hostile network. [default: 64] --beacon-processor-max-workers <INTEGER> Specifies the maximum concurrent tasks for the task scheduler. Increasing this value may increase resource consumption. Reducing the value may result in decreased resource usage and diminished performance. The default value is the number of logical CPU cores on the host. --beacon-processor-reprocess-queue-len <INTEGER> Specifies the length of the queue for messages requiring delayed processing. Higher values may prevent messages from being dropped while lower values may help protect the node from becoming overwhelmed. [default: 12288] ``` I needed to add the max-workers flag since the "simulator" flavor tests started failing with HTTP timeouts on the test assertions. I believe they were failing because the Github runners only have 2 cores and there just weren't enough workers available to process our requests in time. I added the other flags since they seem fun to fiddle with. ## Additional Info I bumped the timeouts on the "simulator" flavor test from 4s to 8s. The prioritisation of consensus messages seems to be causing slower responses, I guess this is what we signed up for 🤷 The `validator/register` endpoint has some special handling because the relays have a bad habit of timing out on these calls. It seems like a waste of a `BeaconProcessor` worker to just wait for the builder API HTTP response, so we spawn a new `tokio` task to wait for a builder response. I've added an optimisation for the `GET beacon/states/{state_id}/validators/{validator_id}` endpoint in [efbabe3](`efbabe3252`). That's the endpoint the VC uses to resolve pubkeys to validator indices, and it's the endpoint that was causing us grief. 
Perhaps I should move that into a new PR, not sure.	2023-08-08 23:30:15 +00:00
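The two bounds and the two priority levels described in #4462 can be sketched as a pair of bounded queues. This is a simplified illustration, not the `BeaconProcessor` itself: `Scheduler`, `submit` and `pop_next` are hypothetical names, tasks are plain strings, and the interleaving with P2P work is left out entirely.

```rust
use std::collections::VecDeque;

/// The two HTTP priority levels named in the commit message.
#[derive(Debug, Clone, Copy)]
enum Priority {
    P0,
    P1,
}

/// Hypothetical scheduler with one bounded queue per priority level.
struct Scheduler {
    p0: VecDeque<String>,
    p1: VecDeque<String>,
    max_len: usize,
}

impl Scheduler {
    fn new(max_len: usize) -> Self {
        Scheduler { p0: VecDeque::new(), p1: VecDeque::new(), max_len }
    }

    /// Queue a task. Returns false when the queue for that priority is full,
    /// which models dropping requests once too many are waiting.
    fn submit(&mut self, priority: Priority, task: &str) -> bool {
        let queue = match priority {
            Priority::P0 => &mut self.p0,
            Priority::P1 => &mut self.p1,
        };
        if queue.len() >= self.max_len {
            return false;
        }
        queue.push_back(task.to_string());
        true
    }

    /// Hand the next task to a free worker: P0 always goes before P1.
    fn pop_next(&mut self) -> Option<String> {
        self.p0.pop_front().or_else(|| self.p1.pop_front())
    }
}

fn main() {
    let mut scheduler = Scheduler::new(2);
    scheduler.submit(Priority::P1, "get_state");
    scheduler.submit(Priority::P0, "publish_block");
    while let Some(task) = scheduler.pop_next() {
        println!("running {}", task);
    }
}
```

Bounding the queues caps how many requests can be waiting, while the number of workers calling `pop_next` caps how many are computed at once.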
Eitan Seri-Levi	521432129d	Support SSZ request body for POST /beacon/blinded_blocks endpoints (v1 & v2) (#4504 ) ## Issue Addressed #4262 ## Proposed Changes add SSZ support in request body for POST /beacon/blinded_blocks endpoints (v1 & v2) ## Additional Info	2023-08-07 22:53:04 +00:00
Armağan Yıldırak	3397612160	Shift networking configuration (#4426 ) ## Issue Addressed Addresses [#4401](https://github.com/sigp/lighthouse/issues/4401) ## Proposed Changes Shift some constants into ```ChainSpec``` and remove the constant values from code space. ## Additional Info I mostly used ```MainnetEthSpec::default_spec()``` to get the ```ChainSpec```. I'm not sure whether that was a mistake. Co-authored-by: armaganyildirak <armaganyildirak@gmail.com> Co-authored-by: Paul Hauner <paul@paulhauner.com> Co-authored-by: Age Manning <Age@AgeManning.com> Co-authored-by: Diva M <divma@protonmail.com>	2023-08-03 01:51:47 +00:00
zhiqiangxu	fcf51d691e	fix typo (#4555 )	2023-08-02 23:50:41 +00:00
Divma	ff9b09d964	upgrade to libp2p 0.52 (#4431 ) ## Issue Addressed Upgrade libp2p to v0.52 ## Proposed Changes - Workflows: remove installation of `protoc` - Book: remove installation of `protoc` - `Dockerfile`s and `cross`: remove custom base `Dockerfile` for cross since it's no longer needed. Remove `protoc` from remaining `Dockerfile`s - Upgrade `discv5` to `v0.3.1`: we have some cool stuff in there: no longer needs `protoc` and faster ip updates on cold start - Upgrade `prometheus` to `0.21.0`, now it no longer needs encoding checks - things that look like refactors: bunch of api types were renamed and need to be accessed in a different (clearer) way - Lighthouse network - connection limits is now a behaviour - banned peers no longer exist on the swarm level, but at the behaviour level - `connection_event_buffer_size` now is handled per connection with a buffer size of 4 - `mplex` is deprecated and was removed - rpc handler now logs the peer to which it belongs ## Additional Info Tried to keep as much behaviour unchanged as possible. However, there are many improvements we can make _after_ this upgrade: - Smart connection limits: Connection limits have been checked only based on numbers, we can now use information about the incoming peer to decide if we want it - More powerful peer management: Dial attempts from other behaviours can be rejected early - Incoming connections can be rejected early - Banning can be returned exclusively to the peer management: We should not get connections to banned peers anymore making use of this - TCP Nat updates: We might be able to take advantage of confirmed external addresses to check out tcp ports/ips Co-authored-by: Age Manning <Age@AgeManning.com> Co-authored-by: Akihito Nakano <sora.akatsuki@gmail.com>	2023-08-02 00:59:34 +00:00
Gua00va	73764d0dd2	Deprecate `exchangeTransitionConfiguration` functionality (#4517 ) ## Issue Addressed Solves #4442 ## Proposed Changes EL clients log errors if we don't query this endpoint, but they are making releases that remove this error logging. After those are out we can stop calling it, after which point EL teams will remove the endpoint entirely. Refer https://hackmd.io/@n0ble/deprecate-exchgTC	2023-07-31 23:51:39 +00:00
Eitan Seri-Levi	e8c411c288	add ssz support in request body for /beacon/blocks endpoints (v1 & v2) (#4479 ) ## Issue Addressed [#4457](https://github.com/sigp/lighthouse/issues/4457) ## Proposed Changes add ssz support in request body for /beacon/blocks endpoints (v1 & v2) ## Additional Info	2023-07-31 23:51:37 +00:00
Age Manning	8654f20028	Development feature flag - Disable backfill (#4537 ) Often when testing I have to create a hack which is annoying to maintain. I think it might be handy to add a custom compile-time flag that developers can use if they want to test things locally without having to backfill a bunch of blocks. There is probably an argument to have a feature called "backfill" which is enabled by default and can be disabled. I didn't go this route because I think it's counter-intuitive to have a feature that enables a core and necessary behaviour.	2023-07-31 01:53:08 +00:00
Gua00va	117802cef1	Add Eth Version Header (#4528 ) ## Issue Addressed Closes #4525 ## Proposed Changes `GET /eth/v1/validator/blinded_blocks` endpoint and `GET /eth/v1/validator/blocks` now send `Eth-Version` header. Co-authored-by: Gua00va <105484243+Gua00va@users.noreply.github.com>	2023-07-31 01:53:07 +00:00
Jimmy Chen	b5337c0ea5	Fix incorrect ideal rewards calculation (#4520 ) ## Issue Addressed The PR fixes a bug where the ideal rewards for source and head were incorrectly set. Output from testing a validator that performed optimally in a Phase 0 epoch; note that the `source` and `head` values under ideal rewards are swapped (compared to the actual `total_rewards` below): ```json { "ideal_rewards": [ ... { "effective_balance": "32000000000", "head": "18771", "target": "18770", "source": "18729", "inclusion_delay": "17083", "inactivity": "0" } ], "total_rewards": [ { "validator_index": "0", "head": "18729", "target": "18770", "source": "18771", "inclusion_delay": "17083", "inactivity": "0" } ] } ```	2023-07-31 01:53:06 +00:00
Aoi Kurokawa	85a3340d0e	Implement liveness BeaconAPI (#4343 ) ## Issue Addressed #4243 ## Proposed Changes - create a new endpoint for liveness/{epoch} ## Additional Info This is my first PR.	2023-07-31 01:53:03 +00:00
Jimmy Chen	fc7f1ba6b9	Phase 0 attestation rewards via Beacon API (#4474 ) ## Issue Addressed Addresses #4026. Beacon-API spec [here](https://ethereum.github.io/beacon-APIs/?urls.primaryName=dev#/Beacon/getAttestationsRewards). Endpoint: `POST /eth/v1/beacon/rewards/attestations/{epoch}` This endpoint already supports post-Altair epochs. This PR adds support for phase 0 rewards calculation. ## Proposed Changes - [x] Attestation rewards API to support phase 0 rewards calculation, re-using logic from `state_processing`. Refactored `get_attestation_deltas` slightly to support computing deltas for a subset of validators. - [x] Add `inclusion_delay` to `ideal_rewards` (`beacon-API` spec update to follow) - [x] Add `inactivity` penalties to both `ideal_rewards` and `total_rewards` (`beacon-API` spec update to follow) - [x] Add tests to compute attestation rewards and compare results with beacon states ## Additional Notes - The extra penalty for missing attestations or being slashed during an inactivity leak is currently not included in the API response (for both phase 0 and Altair) in the spec. - I went with adding `inactivity` as a separate component rather than combining them with the 4 rewards, because this is how it was grouped in [the phase 0 spec](https://github.com/ethereum/consensus-specs/blob/dev/specs/phase0/beacon-chain.md#get_attestation_deltas). During inactivity leak, all rewards include the optimal reward, and inactivity penalties are calculated separately (see below code snippet from the spec), so it would be quite confusing if we merge them. This would also work better with Altair, because there's no "cancelling" of rewards and inactivity penalties are more separate. - Altair calculation logic (to include inactivity penalties) to be updated in a follow-up PR. ```python def get_attestation_deltas(state: BeaconState) -> Tuple[Sequence[Gwei], Sequence[Gwei]]: """ Return attestation reward/penalty deltas for each validator. 
""" source_rewards, source_penalties = get_source_deltas(state) target_rewards, target_penalties = get_target_deltas(state) head_rewards, head_penalties = get_head_deltas(state) inclusion_delay_rewards, _ = get_inclusion_delay_deltas(state) _, inactivity_penalties = get_inactivity_penalty_deltas(state) rewards = [ source_rewards[i] + target_rewards[i] + head_rewards[i] + inclusion_delay_rewards[i] for i in range(len(state.validators)) ] penalties = [ source_penalties[i] + target_penalties[i] + head_penalties[i] + inactivity_penalties[i] for i in range(len(state.validators)) ] return rewards, penalties ``` ## Example API Response <details> <summary>Click me</summary> ```json { "ideal_rewards": [ { "effective_balance": "1000000000", "head": "6638", "target": "6638", "source": "6638", "inclusion_delay": "9783", "inactivity": "0" }, { "effective_balance": "2000000000", "head": "13276", "target": "13276", "source": "13276", "inclusion_delay": "19565", "inactivity": "0" }, { "effective_balance": "3000000000", "head": "19914", "target": "19914", "source": "19914", "inclusion_delay": "29349", "inactivity": "0" }, { "effective_balance": "4000000000", "head": "26553", "target": "26553", "source": "26553", "inclusion_delay": "39131", "inactivity": "0" }, { "effective_balance": "5000000000", "head": "33191", "target": "33191", "source": "33191", "inclusion_delay": "48914", "inactivity": "0" }, { "effective_balance": "6000000000", "head": "39829", "target": "39829", "source": "39829", "inclusion_delay": "58697", "inactivity": "0" }, { "effective_balance": "7000000000", "head": "46468", "target": "46468", "source": "46468", "inclusion_delay": "68480", "inactivity": "0" }, { "effective_balance": "8000000000", "head": "53106", "target": "53106", "source": "53106", "inclusion_delay": "78262", "inactivity": "0" }, { "effective_balance": "9000000000", "head": "59744", "target": "59744", "source": "59744", "inclusion_delay": "88046", "inactivity": "0" }, { "effective_balance": 
"10000000000", "head": "66383", "target": "66383", "source": "66383", "inclusion_delay": "97828", "inactivity": "0" }, { "effective_balance": "11000000000", "head": "73021", "target": "73021", "source": "73021", "inclusion_delay": "107611", "inactivity": "0" }, { "effective_balance": "12000000000", "head": "79659", "target": "79659", "source": "79659", "inclusion_delay": "117394", "inactivity": "0" }, { "effective_balance": "13000000000", "head": "86298", "target": "86298", "source": "86298", "inclusion_delay": "127176", "inactivity": "0" }, { "effective_balance": "14000000000", "head": "92936", "target": "92936", "source": "92936", "inclusion_delay": "136959", "inactivity": "0" }, { "effective_balance": "15000000000", "head": "99574", "target": "99574", "source": "99574", "inclusion_delay": "146742", "inactivity": "0" }, { "effective_balance": "16000000000", "head": "106212", "target": "106212", "source": "106212", "inclusion_delay": "156525", "inactivity": "0" }, { "effective_balance": "17000000000", "head": "112851", "target": "112851", "source": "112851", "inclusion_delay": "166307", "inactivity": "0" }, { "effective_balance": "18000000000", "head": "119489", "target": "119489", "source": "119489", "inclusion_delay": "176091", "inactivity": "0" }, { "effective_balance": "19000000000", "head": "126127", "target": "126127", "source": "126127", "inclusion_delay": "185873", "inactivity": "0" }, { "effective_balance": "20000000000", "head": "132766", "target": "132766", "source": "132766", "inclusion_delay": "195656", "inactivity": "0" }, { "effective_balance": "21000000000", "head": "139404", "target": "139404", "source": "139404", "inclusion_delay": "205439", "inactivity": "0" }, { "effective_balance": "22000000000", "head": "146042", "target": "146042", "source": "146042", "inclusion_delay": "215222", "inactivity": "0" }, { "effective_balance": "23000000000", "head": "152681", "target": "152681", "source": "152681", "inclusion_delay": "225004", "inactivity": "0" 
}, { "effective_balance": "24000000000", "head": "159319", "target": "159319", "source": "159319", "inclusion_delay": "234787", "inactivity": "0" }, { "effective_balance": "25000000000", "head": "165957", "target": "165957", "source": "165957", "inclusion_delay": "244570", "inactivity": "0" }, { "effective_balance": "26000000000", "head": "172596", "target": "172596", "source": "172596", "inclusion_delay": "254352", "inactivity": "0" }, { "effective_balance": "27000000000", "head": "179234", "target": "179234", "source": "179234", "inclusion_delay": "264136", "inactivity": "0" }, { "effective_balance": "28000000000", "head": "185872", "target": "185872", "source": "185872", "inclusion_delay": "273918", "inactivity": "0" }, { "effective_balance": "29000000000", "head": "192510", "target": "192510", "source": "192510", "inclusion_delay": "283701", "inactivity": "0" }, { "effective_balance": "30000000000", "head": "199149", "target": "199149", "source": "199149", "inclusion_delay": "293484", "inactivity": "0" }, { "effective_balance": "31000000000", "head": "205787", "target": "205787", "source": "205787", "inclusion_delay": "303267", "inactivity": "0" }, { "effective_balance": "32000000000", "head": "212426", "target": "212426", "source": "212426", "inclusion_delay": "313050", "inactivity": "0" } ], "total_rewards": [ { "validator_index": "0", "head": "212426", "target": "212426", "source": "212426", "inclusion_delay": "313050", "inactivity": "0" }, { "validator_index": "32", "head": "212426", "target": "212426", "source": "212426", "inclusion_delay": "313050", "inactivity": "0" }, { "validator_index": "63", "head": "-357771", "target": "-357771", "source": "-357771", "inclusion_delay": "0", "inactivity": "0" } ] } ``` </details>	2023-07-18 01:48:40 +00:00
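A minimal sketch of how a client might combine the components of one `total_rewards` entry from the example response above into a single net figure. The `TotalReward` struct and `net_reward` function are hypothetical names introduced for illustration, not part of Lighthouse or the Beacon API, and the sketch assumes every component (including `inactivity`) is a signed decimal string that can simply be summed.

```rust
/// Hypothetical mirror of one `total_rewards` entry; values stay as the
/// decimal strings used in the JSON response.
pub struct TotalReward<'a> {
    pub head: &'a str,
    pub target: &'a str,
    pub source: &'a str,
    pub inclusion_delay: &'a str,
    pub inactivity: &'a str,
}

/// Sum all components into one signed Gwei amount.
/// Assumption: each component is already signed, so penalties are negative.
pub fn net_reward(r: &TotalReward) -> Result<i64, std::num::ParseIntError> {
    let parts = [r.head, r.target, r.source, r.inclusion_delay, r.inactivity];
    let mut total = 0i64;
    for p in parts.iter() {
        total += p.parse::<i64>()?;
    }
    Ok(total)
}
```

With the sample values, validator 0 nets 950328 Gwei and validator 63 nets -1073313 Gwei.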
Divma	4435a22221	Cleanup unreachable code in `lcli::generate_bootnode_enr` and some tests (#4485 ) ## Issue Addressed n/a Noticed this while working on something else ## Proposed Changes - leverage the appropriate types to avoid a bunch of `unwrap` and errors ## Additional Info n/a	2023-07-17 05:31:53 +00:00
Pawan Dhananjay	f2223feb21	Rust 1.71 lints (#4503 ) ## Issue Addressed N/A ## Proposed Changes Add lints for Rust 1.71 [3789134](`3789134ae2`) is probably the one that needs most attention as it changes beacon state code. I changed the `is_in_inactivity_leak` function to return an `ArithError` as not all consumers of that function work well with a `BeaconState::Error`.	2023-07-17 00:14:19 +00:00
Michael Sproul	03674c7199	Update mev-rs and remove patches (#4496 ) ## Issue Addressed Fixes occasional compilation errors with mev-rs (see #4456). ## Proposed Changes - Update `mev-rs` to the latest version, which allows us to remove hacky `[patch]` sections - Update the `axum` version used in `watch` so LH only uses a single version	2023-07-17 00:14:15 +00:00
Michael Sproul	6c375205fb	Fix HTTP state API bug and add `--epochs-per-migration` (#4236 ) ## Issue Addressed Fix an issue observed by `@zlan` on Discord where Lighthouse would sometimes return this error when looking up states via the API: > {"code":500,"message":"UNHANDLED_ERROR: ForkChoiceError(MissingProtoArrayBlock(0xc9cf1495421b6ef3215d82253b388d77321176a1dcef0db0e71a0cd0ffc8cdb7))","stacktraces":[]} ## Proposed Changes The error stems from a faulty assumption in the HTTP API logic: that any state in the hot database must have its block in fork choice. This isn't true because the state's hot database may update much less frequently than the fork choice store, e.g. if reconstructing states (where freezer migration pauses), or if the freezer migration runs slowly. There could also be a race between loading the hot state and checking fork choice, e.g. even if the finalization migration of DB+fork choice were atomic, the update could happen between the 1st and 2nd calls. To address this I've changed the HTTP API logic to use the finalized block's execution status as a fallback where it is safe to do so. In the case where a block is non-canonical and prior to finalization (permanently orphaned) we default `execution_optimistic` to `true`. ## Additional Info I've also added a new CLI flag to reduce the frequency of the finalization migration as this is useful for several purposes: - Spacing out database writes (less frequent, larger batches) - Keeping a limited chain history with high availability, e.g. the last month in the hot database. This new flag made it _substantially_ easier to test this change. It was extracted from `tree-states` (where it's called `--db-migration-period`), which is why this PR also carries the `tree-states` label.	2023-07-17 00:14:12 +00:00
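The fallback described in the commit above could be sketched roughly as follows. This is an illustration only: the function name and its parameters are hypothetical and do not reflect Lighthouse's actual HTTP API code, and it assumes a block missing from fork choice is at or before finalization.

```rust
/// Decide the `execution_optimistic` flag for a state's block.
///
/// `fork_choice_status`: `Some(is_optimistic)` if the block is still in
/// fork choice, `None` if it is missing (assumed to be pre-finalization).
/// `is_canonical`: whether the block is on the canonical chain.
/// `finalized_is_optimistic`: execution status of the finalized block.
pub fn execution_optimistic(
    fork_choice_status: Option<bool>,
    is_canonical: bool,
    finalized_is_optimistic: bool,
) -> bool {
    match fork_choice_status {
        // Normal path: fork choice knows the block.
        Some(is_optimistic) => is_optimistic,
        // Canonical ancestor of the finalized block: reuse its status.
        None if is_canonical => finalized_is_optimistic,
        // Permanently orphaned block: default to optimistic.
        None => true,
    }
}
```

The point of the design is that a missing fork choice entry becomes an ordinary case with a conservative answer rather than a 500 error.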
Jack McPherson	62c9170755	Remove hidden re-exports to appease Rust 1.73 (#4495 ) ## Issue Addressed #4494 ## Proposed Changes - Remove explicit re-exports of various types to appease the new compiler lint ## Additional Info It seems `warn(hidden_glob_reexports)` is the main culprit.	2023-07-12 07:06:00 +00:00
Paul Hauner	c25825a539	Move the `BeaconProcessor` into a new crate (#4435 ) Replaces #4434. It is identical, but this PR has a smaller diff due to a curated commit history. ## Issue Addressed NA ## Proposed Changes This PR moves the scheduling logic for the `BeaconProcessor` into a new crate in `beacon_node/beacon_processor`. Previously it existed in the `beacon_node/network` crate. This addresses a circular-dependency problem where it's not possible to use the `BeaconProcessor` from the `beacon_chain` crate. The `network` crate depends on the `beacon_chain` crate (`network -> beacon_chain`), but importing the `BeaconProcessor` into the `beacon_chain` crate would create a circular dependency of `beacon_chain -> network`. The `BeaconProcessor` was designed to provide queuing and prioritized scheduling for messages from the network. It has proven to be quite valuable and I believe we'd make Lighthouse more stable and effective by using it elsewhere. In particular, I think we should use the `BeaconProcessor` for: 1. HTTP API requests. 1. Scheduled tasks in the `BeaconChain` (e.g., state advance). Using the `BeaconProcessor` for these tasks would help prevent the BN from becoming overwhelmed and would also help it to prioritize operations (e.g., choosing to process blocks from gossip before responding to low-priority HTTP API requests). ## Additional Info This PR is intended to have zero impact on runtime behaviour. It aims to simply separate the scheduling code (i.e., the `BeaconProcessor`) from the business logic in the `network` crate (i.e., the `Worker` impls). Future PRs (see #4462) can build upon these works to actually use the `BeaconProcessor` for more operations. I've gone to some effort to use `git mv` to make the diff look more like "file was moved and modified" rather than "file was deleted and a new one added". This should reduce review burden and help maintain commit attribution.	2023-07-10 07:45:54 +00:00
Michael Sproul	ea2420d193	Bump default checkpoint sync timeout to 3 minutes (#4466 ) ## Issue Addressed [Users on Twitter](https://twitter.com/ashekhirin/status/1676334843192397824) are getting checkpoint sync URL timeouts with the default of 60s, so this PR increases the default timeout to 3 minutes. I've also added a short section to the book about adjusting the timeout with `--checkpoint-sync-url-timeout`.	2023-07-08 13:16:06 +00:00
Jack McPherson	a6d5c7d7e0	Correct checks for backfill completeness (#4465 ) ## Issue Addressed #4331 ## Proposed Changes - Use comparison rather than strict equality between the earliest epoch we know about and the backfill target (which will be the most recent WSP by default or genesis) - Add helper function `BackFillSync<T>::would_complete` to achieve this in one location ## Additional Info - There's an ad hoc test for this in #4461 Co-authored-by: Age Manning <Age@AgeManning.com>	2023-07-06 07:35:31 +00:00
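The change from strict equality to a comparison can be illustrated with a small sketch. The free function below is a hypothetical stand-in for the `BackFillSync<T>::would_complete` helper named in the commit, using plain `u64` epochs; the real signature and types are not shown in this log and are assumed here.

```rust
/// Backfill walks backwards from the checkpoint towards `backfill_target`
/// (genesis or the most recent weak subjectivity point). It is complete
/// once the earliest known epoch is at or before the target.
pub fn would_complete(earliest_known_epoch: u64, backfill_target: u64) -> bool {
    // `<=` rather than `==`: a batch boundary may land earlier than the
    // target, which a strict equality check would never recognise.
    earliest_known_epoch <= backfill_target
}
```

For example, with a target of epoch 100, an earliest known epoch of 98 counts as complete, whereas an equality check would report it as incomplete.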
Paul Hauner	dfcb3363c7	Release v4.3.0 (#4452 ) ## Issue Addressed NA ## Proposed Changes Bump versions ## Additional Info NA	2023-07-04 13:29:55 +00:00
Jimmy Chen	46be05f728	Cache target attester balances for unrealized FFG progression calculation (#4362 ) ## Issue Addressed #4118 ## Proposed Changes This PR introduces a "progressive balances" cache on the `BeaconState`, which keeps track of the accumulated target attestation balance for the current & previous epochs. The cached values are utilised by fork choice to calculate unrealized justification and finalization (instead of converting epoch participation arrays to balances for each block we receive). This optimization will be rolled out gradually to allow for more testing. A new `--progressive-balances disabled|checked|strict|fast` flag is introduced to support this: - `checked`: enabled with checks against participation cache, and falls back to the existing epoch processing calculation if there is a total target attester balance mismatch. There is no performance gain from this as the participation cache still needs to be computed. This is the default mode for now. - `strict`: enabled with checks against participation cache, returns error if there is a mismatch. Used for testing only. - `fast`: enabled with no comparative checks and without computing the participation cache. This mode gives us the performance gains from the optimization. This is still experimental and not currently recommended for production usage, but will become the default mode in a future release. - `disabled`: disable the usage of progressive cache, and use the existing method for FFG progression calculation. This mode may be useful if we find a bug and want to stop the frequent error logs. 
### Tasks - [x] Initial cache implementation in `BeaconState` - [x] Perform checks in fork choice to compare the progressive balances cache against results from `ParticipationCache` - [x] Add CLI flag, and disable the optimization by default - [x] Testing on Goerli & Benchmarking - [x] Move caching logic from state processing to the `ProgressiveBalancesCache` (see [this comment](https://github.com/sigp/lighthouse/pull/4362#discussion_r1230877001)) - [x] Add attesting balance metrics Co-authored-by: Jimmy Chen <jimmy@sigmaprime.io>	2023-06-30 01:13:06 +00:00
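The four modes described in the commit above could be modelled roughly as below. This is a sketch under stated assumptions: the enum and function names are illustrative rather than taken from the Lighthouse source, the balances are plain `u64` values, and the "existing calculation" is passed in as a closure so that it only runs in the modes that need it.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ProgressiveBalancesMode {
    Disabled,
    Checked,
    Strict,
    Fast,
}

/// Pick the total target attester balance according to `mode`.
/// `cached` is the progressive-cache value; `reference` stands in for the
/// existing (slower) participation-based calculation.
pub fn target_balance(
    mode: ProgressiveBalancesMode,
    cached: u64,
    reference: impl FnOnce() -> u64,
) -> Result<u64, String> {
    match mode {
        // Ignore the cache entirely.
        ProgressiveBalancesMode::Disabled => Ok(reference()),
        // Trust the cache; the reference value is never computed.
        ProgressiveBalancesMode::Fast => Ok(cached),
        // Compare, and fall back to the reference value on mismatch.
        ProgressiveBalancesMode::Checked => Ok(reference()),
        // Compare, and fail loudly on mismatch.
        ProgressiveBalancesMode::Strict => {
            let expected = reference();
            if expected == cached {
                Ok(cached)
            } else {
                Err(format!("mismatch: cached {cached}, expected {expected}"))
            }
        }
    }
}
```

In this simplified model `Checked` always returns the reference value, since on a match the two are equal and on a mismatch the reference wins. That mirrors the commit's note that `checked` brings no performance gain.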
Eitan Seri-Levi	826e090f50	Update node health endpoint (#4310 ) ## Issue Addressed [#4292](https://github.com/sigp/lighthouse/issues/4292) ## Proposed Changes The updated node health endpoint will return a 200 status code if `!syncing && !el_offline && !optimistic`, a 206 if `(syncing || optimistic) && !el_offline`, and a 503 if `el_offline`. ## Additional Info	2023-06-30 01:13:04 +00:00
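The status-code rules in the commit above reduce to a small decision function. The sketch below uses a hypothetical function name and plain booleans; it is not the actual Lighthouse handler.

```rust
/// Map node state to the HTTP status code of the health endpoint.
pub fn node_health_status(syncing: bool, optimistic: bool, el_offline: bool) -> u16 {
    if el_offline {
        // Execution layer unreachable: service unavailable.
        503
    } else if syncing || optimistic {
        // Node is up but still syncing or optimistically synced.
        206
    } else {
        // Fully synced with a verified head.
        200
    }
}
```

Checking `el_offline` first makes the three cases mutually exclusive and covers every combination of the flags.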
Eitan Seri-Levi	edd093293a	added debounce to log (#4269 ) ## Issue Addressed [#4259](https://github.com/sigp/lighthouse/issues/4259) ## Proposed Changes debounce spammy `Unable to send message to the beacon processor` log messages ## Additional Info We could potentially debounce other logs that have the potential to be "spammy". After some feedback we decided to additionally add the following change: create a newtype wrapper around `mpsc::Sender<BeaconWorkEvent<T>>`. When there is an error on the try_send method on the wrapper, we increase a counter metric with one label per work type.	2023-06-30 01:13:03 +00:00

2320 Commits