lighthouse

Author	SHA1	Message	Date
Paul Hauner	3128b5b430	v3.1.1 (#3585 ) ## Issue Addressed NA ## Proposed Changes Bump versions ## Additional Info - ~~Requires additional testing~~ - ~~Blocked on:~~ - ~~#3589~~ - ~~#3540~~ - ~~#3587~~	2022-09-22 06:08:52 +00:00
Paul Hauner	96692b8e43	Impl `oneshot_broadcast` for committee promises (#3595 ) ## Issue Addressed NA ## Proposed Changes Fixes an issue introduced in #3574 where I erroneously assumed that a `crossbeam_channel` multiple receiver queue was a broadcast queue. This is incorrect, each message will be received by only one receiver. The effect of this mistake is these logs: ``` Sep 20 06:56:17.001 INFO Synced slot: 4736079, block: 0xaa8a…180d, epoch: 148002, finalized_epoch: 148000, finalized_root: 0x2775…47f2, exec_hash: 0x2ca5…ffde (verified), peers: 6, service: slot_notifier Sep 20 06:56:23.237 ERRO Unable to validate attestation error: CommitteeCacheWait(RecvError), peer_id: 16Uiu2HAm2Jnnj8868tb7hCta1rmkXUf5YjqUH1YPj35DCwNyeEzs, type: "aggregated", slot: Slot(4736047), beacon_block_root: 0x88d318534b1010e0ebd79aed60b6b6da1d70357d72b271c01adf55c2b46206c1 ``` ## Additional Info NA	2022-09-21 01:01:50 +00:00
Paul Hauner	2cd3e3a768	Avoid duplicate committee cache loads (#3574 ) ## Issue Addressed NA ## Proposed Changes I have observed scenarios on Goerli where Lighthouse was receiving attestations which reference the same, un-cached shuffling on multiple threads at the same time. Lighthouse was then loading the same state from database and determining the shuffling on multiple threads at the same time. This is unnecessary load on the disk and RAM. This PR modifies the shuffling cache so that each entry can be either: - A committee - A promise for a committee (i.e., a `crossbeam_channel::Receiver`) Now, in the scenario where we have thread A and thread B simultaneously requesting the same un-cached shuffling, we will have the following: 1. Thread A will take the write-lock on the shuffling cache, find that there's no cached committee and then create a "promise" (a `crossbeam_channel::Sender`) for a committee before dropping the write-lock. 1. Thread B will then be allowed to take the write-lock for the shuffling cache and find the promise created by thread A. It will block the current thread waiting for thread A to fulfill that promise. 1. Thread A will load the state from disk, obtain the shuffling, send it down the channel, insert the entry into the cache and then continue to verify the attestation. 1. Thread B will then receive the shuffling from the receiver, be un-blocked and then continue to verify the attestation. In the case where thread A fails to generate the shuffling and drops the sender, the next time that specific shuffling is requested we will detect that the channel is disconnected and return a `None` entry for that shuffling. This will cause the shuffling to be re-calculated. ## Additional Info NA	2022-09-16 08:54:03 +00:00
Michael Sproul	cd31e54b99	Bump `axum` deps (#3570 ) ## Issue Addressed Fix a `cargo-audit` failure. We don't use `axum` for anything besides tests, but `cargo-audit` is failing due to this vulnerability in `axum-core`: https://rustsec.org/advisories/RUSTSEC-2022-0055	2022-09-13 01:57:47 +00:00
realbigsean	d1a8d6cf91	Pin mev rs deps (#3557 ) ## Issue Addressed We were unable to update lighthouse by running `cargo update` because some of the `mev-build-rs` deps weren't pinned. But `mev-build-rs` is now pinned here and includes it's own pinned commits for `ssz-rs` and `etheruem-consensus` Co-authored-by: realbigsean <sean@sigmaprime.io>	2022-09-08 23:46:03 +00:00
Divma	473abc14ca	Subscribe to subnets only when needed (#3419 ) ## Issue Addressed We currently subscribe to attestation subnets as soon as the subscription arrives (one epoch in advance), this makes it so that subscriptions for future slots are scheduled instead of done immediately. ## Proposed Changes - Schedule subscriptions to subnets for future slots. - Finish removing hashmap_delay, in favor of [delay_map](https://github.com/AgeManning/delay_map). This was the only remaining service to do this. - Subscriptions for past slots are rejected, before we would subscribe for one slot. - Add a new test for subscriptions that are not consecutive. ## Additional Info This is also an effort in making the code easier to understand	2022-09-05 00:22:48 +00:00
Paul Hauner	aa022f4685	v3.1.0 (#3525 ) ## Issue Addressed NA ## Proposed Changes - Bump versions ## Additional Info - ~~Blocked on #3508~~ - ~~Blocked on #3526~~ - ~~Requires additional testing.~~ - Expected release date is 2022-09-01	2022-08-31 22:21:55 +00:00
Paul Hauner	8609cced0e	Reset payload statuses when resuming fork choice (#3498 ) ## Issue Addressed NA ## Proposed Changes This PR is motivated by a recent consensus failure in Geth where it returned `INVALID` for an `VALID` block. Without this PR, the only way to recover is by re-syncing Lighthouse. Whilst ELs "shouldn't have consensus failures", in reality it's something that we can expect from time to time due to the complex nature of Ethereum. Being able to recover easily will help the network recover and EL devs to troubleshoot. The risk introduced with this PR is that genuinely INVALID payloads get a "second chance" at being imported. I believe the DoS risk here is negligible since LH needs to be restarted in order to re-process the payload. Furthermore, there's no reason to think that a well-performing EL will accept a truly invalid payload the second-time-around. ## Additional Info This implementation has the following intricacies: 1. Instead of just resetting invalid payloads to optimistic, we'll also reset valid payloads. This is an artifact of our existing implementation. 1. We will only reset payload statuses when we detect an invalid payload present in `proto_array` - This helps save us from forgetting that all our blocks are valid in the "best case scenario" where there are no invalid blocks. 1. If we fail to revert the payload statuses we'll log a `CRIT` and just continue with a `proto_array` that does not have reverted payload statuses. - The code to revert statuses needs to deal with balances and proposer-boost, so it's a failure point. This is a defensive measure to avoid introducing new show-stopping bugs to LH.	2022-08-29 14:34:41 +00:00
Michael Sproul	66eca1a882	Refactor op pool for speed and correctness (#3312 ) ## Proposed Changes This PR has two aims: to speed up attestation packing in the op pool, and to fix bugs in the verification of attester slashings, proposer slashings and voluntary exits. The changes are bundled into a single database schema upgrade (v12). Attestation packing is sped up by removing several inefficiencies: - No more recalculation of `attesting_indices` during packing. - No (unnecessary) examination of the `ParticipationFlags`: a bitfield suffices. See `RewardCache`. - No re-checking of attestation validity during packing: the `AttestationMap` provides attestations which are "correct by construction" (I have checked this using Hydra). - No SSZ re-serialization for the clunky `AttestationId` type (it can be removed in a future release). So far the speed-up seems to be roughly 2-10x, from 500ms down to 50-100ms. Verification of attester slashings, proposer slashings and voluntary exits is fixed by: - Tracking the `ForkVersion`s that were used to verify each message inside the `SigVerifiedOp`. This allows us to quickly re-verify that they match the head state's opinion of what the `ForkVersion` should be at the epoch(s) relevant to the message. - Storing the `SigVerifiedOp` on disk rather than the raw operation. This allows us to continue track the fork versions after a reboot. This is mostly contained in this commit 52bb1840ae5c4356a8fc3a51e5df23ed65ed2c7f. ## Additional Info The schema upgrade uses the justified state to re-verify attestations and compute `attesting_indices` for them. It will drop any attestations that fail to verify, by the logic that attestations are most valuable in the few slots after they're observed, and are probably stale and useless by the time a node restarts. Exits and proposer slashings and similarly re-verified to obtain `SigVerifiedOp`s. This PR contains a runtime killswitch `--paranoid-block-proposal` which opts out of all the optimisations in favour of closely verifying every included message. Although I'm quite sure that the optimisations are correct this flag could be useful in the event of an unforeseen emergency. Finally, you might notice that the `RewardCache` appears quite useless in its current form because it is only updated on the hot-path immediately before proposal. My hope is that in future we can shift calls to `RewardCache::update` into the background, e.g. while performing the state advance. It is also forward-looking to `tree-states` compatibility, where iterating and indexing `state.{previous,current}_epoch_participation` is expensive and needs to be minimised.	2022-08-29 09:10:26 +00:00
Divma	8c69d57c2c	Pause sync when EE is offline (#3428 ) ## Issue Addressed #3032 ## Proposed Changes Pause sync when ee is offline. Changes include three main parts: - Online/offline notification system - Pause sync - Resume sync #### Online/offline notification system - The engine state is now guarded behind a new struct `State` that ensures every change is correctly notified. Notifications are only sent if the state changes. The new `State` is behind a `RwLock` (as before) as the synchronization mechanism. - The actual notification channel is a [tokio::sync::watch](https://docs.rs/tokio/latest/tokio/sync/watch/index.html) which ensures only the last value is in the receiver channel. This way we don't need to worry about message order etc. - Sync waits for state changes concurrently with normal messages. #### Pause Sync Sync has four components, pausing is done differently in each: - Block lookups: Disabled while in this state. We drop current requests and don't search for new blocks. Block lookups are infrequent and I don't think it's worth the extra logic of keeping these and delaying processing. If we later see that this is required, we can add it. - Parent lookups: Disabled while in this state. We drop current requests and don't search for new parents. Parent lookups are even less frequent and I don't think it's worth the extra logic of keeping these and delaying processing. If we later see that this is required, we can add it. - Range: Chains don't send batches for processing to the beacon processor. This is easily done by guarding the channel to the beacon processor and giving it access only if the ee is responsive. I find this the simplest and most powerful approach since we don't need to deal with new sync states and chain segments that are added while the ee is offline will follow the same logic without needing to synchronize a shared state among those. Another advantage of passive pause vs active pause is that we can still keep track of active advertised chain segments so that on resume we don't need to re-evaluate all our peers. - Backfill: Not affected by ee states, we don't pause. #### Resume Sync - Block lookups: Enabled again. - Parent lookups: Enabled again. - Range: Active resume. Since the only real pause range does is not sending batches for processing, resume makes all chains that are holding read-for-processing batches send them. - Backfill: Not affected by ee states, no need to resume. ## Additional Info QUESTION: Originally I made this to notify and change on synced state, but @pawanjay176 on talks with @paulhauner concluded we only need to check online/offline states. The upcheck function mentions extra checks to have a very up to date sync status to aid the networking stack. However, the only need the networking stack would have is this one. I added a TODO to review if the extra check can be removed Next gen of #3094 Will work best with #3439 Co-authored-by: Pawan Dhananjay <pawandhananjay@gmail.com>	2022-08-24 23:34:56 +00:00
Paul Hauner	18c61a5e8b	v3.0.0 (#3464 ) ## Issue Addressed NA ## Proposed Changes Bump versions to v3.0.0 ## Additional Info - ~~Blocked on #3439~~ - ~~Blocked on #3459~~ - ~~Blocked on #3463~~ - ~~Blocked on #3462~~ - ~~Requires further testing~~ Co-authored-by: Michael Sproul <michael@sigmaprime.io>	2022-08-22 03:43:08 +00:00
Paul Hauner	df358b864d	Add metrics for EE `PayloadStatus` returns (#3486 ) ## Issue Addressed NA ## Proposed Changes Adds some metrics so we can track payload status responses from the EE. I think this will be useful for troubleshooting and alerting. I also bumped the `BecaonChain::per_slot_task` to `debug` since it doesn't seem too noisy and would have helped us with some things we were debugging in the past. ## Additional Info NA	2022-08-19 04:27:23 +00:00
Michael Sproul	92d597ad23	Modularise slasher backend (#3443 ) ## Proposed Changes Enable multiple database backends for the slasher, either MDBX (default) or LMDB. The backend can be selected using `--slasher-backend={lmdb,mdbx}`. ## Additional Info In order to abstract over the two library's different handling of database lifetimes I've used `Box::leak` to give the `Environment` type a `'static` lifetime. This was the only way I could think of using 100% safe code to construct a self-referential struct `SlasherDB`, where the `OpenDatabases` refers to the `Environment`. I think this is OK, as the `Environment` is expected to live for the life of the program, and both database engines leave the database in a consistent state after each write. The memory claimed for memory-mapping will be freed by the OS and appropriately flushed regardless of whether the `Environment` is actually dropped. We are depending on two `sigp` forks of `libmdbx-rs` and `lmdb-rs`, to give us greater control over MDBX OS support and LMDB's version.	2022-08-15 01:30:56 +00:00
Michael Sproul	4e05f19fb5	Serve Bellatrix preset in BN API (#3425 ) ## Issue Addressed Resolves #3388 Resolves #2638 ## Proposed Changes - Return the `BellatrixPreset` on `/eth/v1/config/spec` by default. - Allow users to opt out of this by providing `--http-spec-fork=altair` (unless there's a Bellatrix fork epoch set). - Add the Altair constants from #2638 and make serving the constants non-optional (the `http-disable-legacy-spec` flag is deprecated). - Modify the VC to only read the `Config` and not to log extra fields. This prevents it from having to muck around parsing the `ConfigAndPreset` fields it doesn't need. ## Additional Info This change is backwards-compatible for the VC and the BN, but is marked as a breaking change for the removal of `--http-disable-legacy-spec`. I tried making `Config` a `superstruct` too, but getting the automatic decoding to work was a huge pain and was going to require a lot of hacks, so I gave up in favour of keeping the default-based approach we have now.	2022-08-10 07:52:59 +00:00
Paul Hauner	a688621919	Add support for beaconAPI in `lcli` functions (#3252 ) ## Issue Addressed NA ## Proposed Changes Modifies `lcli skip-slots` and `lcli transition-blocks` allow them to source blocks/states from a beaconAPI and also gives them some more features to assist with benchmarking. ## Additional Info Breaks the current `lcli skip-slots` and `lcli transition-blocks` APIs by changing some flag names. It should be simple enough to figure out the changes via `--help`. Currently blocked on #3263.	2022-08-09 06:05:13 +00:00
Michael Sproul	df51a73272	Release v2.5.1 (#3406 ) ## Issue Addressed Patch release to address fork choice issues in the presence of clock drift: https://github.com/sigp/lighthouse/pull/3402	2022-08-03 04:23:09 +00:00
Paul Hauner	2983235650	v2.5.0 (#3392 ) ## Issue Addressed NA ## Proposed Changes Bump versions. ## Additional Info - ~~Blocked on #3383~~ - ~~Awaiting further testing.~~	2022-08-01 03:41:08 +00:00
realbigsean	6c2d8b2262	Builder Specs v0.2.0 (#3134 ) ## Issue Addressed https://github.com/sigp/lighthouse/issues/3091 Extends https://github.com/sigp/lighthouse/pull/3062, adding pre-bellatrix block support on blinded endpoints and allowing the normal proposal flow (local payload construction) on blinded endpoints. This resulted in better fallback logic because the VC will not have to switch endpoints on failure in the BN <> Builder API, the BN can just fallback immediately and without repeating block processing that it shouldn't need to. We can also keep VC fallback from the VC<>BN API's blinded endpoint to full endpoint. ## Proposed Changes - Pre-bellatrix blocks on blinded endpoints - Add a new `PayloadCache` to the execution layer - Better fallback-from-builder logic ## Todos - [x] Remove VC transition logic - [x] Add logic to only enable builder flow after Merge transition finalization - [x] Tests - [x] Fix metrics - [x] Rustdocs Co-authored-by: Mac L <mjladson@pm.me> Co-authored-by: realbigsean <sean@sigmaprime.io>	2022-07-30 00:22:37 +00:00
Michael Sproul	d04fde3ba9	Remove equivocating validators from fork choice (#3371 ) ## Issue Addressed Closes https://github.com/sigp/lighthouse/issues/3241 Closes https://github.com/sigp/lighthouse/issues/3242 ## Proposed Changes * [x] Implement logic to remove equivocating validators from fork choice per https://github.com/ethereum/consensus-specs/pull/2845 * [x] Update tests to v1.2.0-rc.1. The new test which exercises `equivocating_indices` is passing. * [x] Pull in some SSZ abstractions from the `tree-states` branch that make implementing Vec-compatible encoding for types like `BTreeSet` and `BTreeMap`. * [x] Implement schema upgrades and downgrades for the database (new schema version is V11). * [x] Apply attester slashings from blocks to fork choice ## Additional Info * This PR doesn't need the `BTreeMap` impl, but `tree-states` does, and I don't think there's any harm in keeping it. But I could also be convinced to drop it. Blocked on #3322.	2022-07-28 09:43:41 +00:00
realbigsean	20ebf1f3c1	Realized unrealized experimentation (#3322 ) ## Issue Addressed Add a flag that optionally enables unrealized vote tracking. Would like to test out on testnets and benchmark differences in methods of vote tracking. This PR includes a DB schema upgrade to enable to new vote tracking style. Co-authored-by: realbigsean <sean@sigmaprime.io> Co-authored-by: Paul Hauner <paul@paulhauner.com> Co-authored-by: sean <seananderson33@gmail.com> Co-authored-by: Mac L <mjladson@pm.me>	2022-07-25 23:53:26 +00:00
Mac L	bb5a6d2cca	Add `execution_optimistic` flag to HTTP responses (#3070 ) ## Issue Addressed #3031 ## Proposed Changes Updates the following API endpoints to conform with https://github.com/ethereum/beacon-APIs/pull/190 and https://github.com/ethereum/beacon-APIs/pull/196 - [x] `beacon/states/{state_id}/root` - [x] `beacon/states/{state_id}/fork` - [x] `beacon/states/{state_id}/finality_checkpoints` - [x] `beacon/states/{state_id}/validators` - [x] `beacon/states/{state_id}/validators/{validator_id}` - [x] `beacon/states/{state_id}/validator_balances` - [x] `beacon/states/{state_id}/committees` - [x] `beacon/states/{state_id}/sync_committees` - [x] `beacon/headers` - [x] `beacon/headers/{block_id}` - [x] `beacon/blocks/{block_id}` - [x] `beacon/blocks/{block_id}/root` - [x] `beacon/blocks/{block_id}/attestations` - [x] `debug/beacon/states/{state_id}` - [x] `debug/beacon/heads` - [x] `validator/duties/attester/{epoch}` - [x] `validator/duties/proposer/{epoch}` - [x] `validator/duties/sync/{epoch}` Updates the following Server-Sent Events: - [x] `events?topics=head` - [x] `events?topics=block` - [x] `events?topics=finalized_checkpoint` - [x] `events?topics=chain_reorg` ## Backwards Incompatible There is a very minor breaking change with the way the API now handles requests to `beacon/blocks/{block_id}/root` and `beacon/states/{state_id}/root` when `block_id` or `state_id` is the `Root` variant of `BlockId` and `StateId` respectively. Previously a request to a non-existent root would simply echo the root back to the requester: ``` curl "http://localhost:5052/eth/v1/beacon/states/0xaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa/root" {"data":{"root":"0xaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"}} ``` Now it will return a `404`: ``` curl "http://localhost:5052/eth/v1/beacon/blocks/0xaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa/root" {"code":404,"message":"NOT_FOUND: beacon block with root 0xaaaa…aaaa","stacktraces":[]} ``` In addition to this is the block root `0x0000000000000000000000000000000000000000000000000000000000000000` previously would return the genesis block. It will now return a `404`: ``` curl "http://localhost:5052/eth/v1/beacon/blocks/0x0000000000000000000000000000000000000000000000000000000000000000" {"code":404,"message":"NOT_FOUND: beacon block with root 0x0000…0000","stacktraces":[]} ``` ## Additional Info - `execution_optimistic` is always set, and will return `false` pre-Bellatrix. I am also open to the idea of doing something like `#[serde(skip_serializing_if = "Option::is_none")]`. - The value of `execution_optimistic` is set to `false` where possible. Any computation that is reliant on the `head` will simply use the `ExecutionStatus` of the head (unless the head block is pre-Bellatrix). Co-authored-by: Paul Hauner <paul@paulhauner.com>	2022-07-25 08:23:00 +00:00
Paul Hauner	21dec6f603	v2.4.0 (#3360 ) ## Issue Addressed NA ## Proposed Changes Bump versions to v2.4.0 ## Additional Info Blocked on: - ~~#3349~~ - ~~#3347~~	2022-07-21 22:02:36 +00:00
Michael Sproul	e32868458f	Set safe block hash to justified (#3347 ) ## Issue Addressed Closes https://github.com/sigp/lighthouse/issues/3189. ## Proposed Changes - Always supply the justified block hash as the `safe_block_hash` when calling `forkchoiceUpdated` on the execution engine. - Refactor the `get_payload` routine to use the new `ForkchoiceUpdateParameters` struct rather than just the `finalized_block_hash`. I think this is a nice simplification and that the old way of computing the `finalized_block_hash` was unnecessary, but if anyone sees reason to keep that approach LMK.	2022-07-21 05:45:37 +00:00
Pawan Dhananjay	f9b9658711	Add merge support to simulator (#3292 ) ## Issue Addressed N/A ## Proposed Changes Make simulator merge compatible. Adds a `--post_merge` flag to the eth1 simulator that enables a ttd and simulates the merge transition. Uses the `MockServer` in the execution layer test utils to simulate a dummy execution node. Adds the merge transition simulation to CI.	2022-07-18 23:15:40 +00:00
Pawan Dhananjay	da7b7a0f60	Make transactions in execution layer integration tests (#3320 ) ## Issue Addressed Resolves #3159 ## Proposed Changes Sends transactions to the EE before requesting for a payload in the `execution_integration_tests`. Made some changes to the integration tests in order to be able to sign and publish transactions to the EE: 1. `genesis.json` for both geth and nethermind was modified to include pre-funded accounts that we know private keys for 2. Using the unauthenticated port again in order to make `eth_sendTransaction` and calls from the `personal` namespace to import keys Also added a `fcu` call with `PayloadAttributes` before calling `getPayload` in order to give EEs sufficient time to pack transactions into the payload.	2022-07-18 01:51:36 +00:00
Divma	3dc323b035	Fix RUSTSEC-2022-0032 (#3311 ) ## Issue Addressed Failure of cargo audit for [RUSTSEC-2022-0032](https://rustsec.org/advisories/RUSTSEC-2022-0032) ## Proposed Changes Cargo update does the trick again ## Additional Info na	2022-07-05 23:36:42 +00:00
Paul Hauner	be4e261e74	Use async code when interacting with EL (#3244 ) ## Overview This rather extensive PR achieves two primary goals: 1. Uses the finalized/justified checkpoints of fork choice (FC), rather than that of the head state. 2. Refactors fork choice, block production and block processing to `async` functions. Additionally, it achieves: - Concurrent forkchoice updates to the EL and cache pruning after a new head is selected. - Concurrent "block packing" (attestations, etc) and execution payload retrieval during block production. - Concurrent per-block-processing and execution payload verification during block processing. - The `Arc`-ification of `SignedBeaconBlock` during block processing (it's never mutated, so why not?): - I had to do this to deal with sending blocks into spawned tasks. - Previously we were cloning the beacon block at least 2 times during each block processing, these clones are either removed or turned into cheaper `Arc` clones. - We were also `Box`-ing and un-`Box`-ing beacon blocks as they moved throughout the networking crate. This is not a big deal, but it's nice to avoid shifting things between the stack and heap. - Avoids cloning all the blocks in every chain segment during sync. - It also has the potential to clean up our code where we need to pass an owned block around so we can send it back in the case of an error (I didn't do much of this, my PR is already big enough 😅) - The `BeaconChain::HeadSafetyStatus` struct was removed. It was an old relic from prior merge specs. For motivation for this change, see https://github.com/sigp/lighthouse/pull/3244#issuecomment-1160963273 ## Changes to `canonical_head` and `fork_choice` Previously, the `BeaconChain` had two separate fields: ``` canonical_head: RwLock<Snapshot>, fork_choice: RwLock<BeaconForkChoice> ``` Now, we have grouped these values under a single struct: ``` canonical_head: CanonicalHead { cached_head: RwLock<Arc<Snapshot>>, fork_choice: RwLock<BeaconForkChoice> } ``` Apart from ergonomics, the only actual change here is wrapping the canonical head snapshot in an `Arc`. This means that we no longer need to hold the `cached_head` (`canonical_head`, in old terms) lock when we want to pull some values from it. This was done to avoid deadlock risks by preventing functions from acquiring (and holding) the `cached_head` and `fork_choice` locks simultaneously. ## Breaking Changes ### The `state` (root) field in the `finalized_checkpoint` SSE event Consider the scenario where epoch `n` is just finalized, but `start_slot(n)` is skipped. There are two state roots we might in the `finalized_checkpoint` SSE event: 1. The state root of the finalized block, which is `get_block(finalized_checkpoint.root).state_root`. 4. The state root at slot of `start_slot(n)`, which would be the state from (1), but "skipped forward" through any skip slots. Previously, Lighthouse would choose (2). However, we can see that when [Teku generates that event](`de2b2801c8/data/beaconrestapi/src/main/java/tech/pegasys/teku/beaconrestapi/handlers/v1/events/EventSubscriptionManager.java (L171-L182)`) it uses [`getStateRootFromBlockRoot`](`de2b2801c8/data/provider/src/main/java/tech/pegasys/teku/api/ChainDataProvider.java (L336-L341)`) which uses (1). I have switched Lighthouse from (2) to (1). I think it's a somewhat arbitrary choice between the two, where (1) is easier to compute and is consistent with Teku. ## Notes for Reviewers I've renamed `BeaconChain::fork_choice` to `BeaconChain::recompute_head`. Doing this helped ensure I broke all previous uses of fork choice and I also find it more descriptive. It describes an action and can't be confused with trying to get a reference to the `ForkChoice` struct. I've changed the ordering of SSE events when a block is received. It used to be `[block, finalized, head]` and now it's `[block, head, finalized]`. It was easier this way and I don't think we were making any promises about SSE event ordering so it's not "breaking". I've made it so fork choice will run when it's first constructed. I did this because I wanted to have a cached version of the last call to `get_head`. Ensuring `get_head` has been run at least once means that the cached values doesn't need to wrapped in an `Option`. This was fairly simple, it just involved passing a `slot` to the constructor so it knows when it's being run. When loading a fork choice from the store and a slot clock isn't handy I've just used the `slot` that was saved in the `fork_choice_store`. That seems like it would be a faithful representation of the slot when we saved it. I added the `genesis_time: u64` to the `BeaconChain`. It's small, constant and nice to have around. Since we're using FC for the fin/just checkpoints, we no longer get the `0x00..00` roots at genesis. You can see I had to remove a work-around in `ef-tests` here: b56be3bc2. I can't find any reason why this would be an issue, if anything I think it'll be better since the genesis-alias has caught us out a few times (0x00..00 isn't actually a real root). Edit: I did find a case where the `network` expected the 0x00..00 alias and patched it here: 3f26ac3e2. You'll notice a lot of changes in tests. Generally, tests should be functionally equivalent. Here are the things creating the most diff-noise in tests: - Changing tests to be `tokio::async` tests. - Adding `.await` to fork choice, block processing and block production functions. - Refactor of the `canonical_head` "API" provided by the `BeaconChain`. E.g., `chain.canonical_head.cached_head()` instead of `chain.canonical_head.read()`. - Wrapping `SignedBeaconBlock` in an `Arc`. - In the `beacon_chain/tests/block_verification`, we can't use the `lazy_static` `CHAIN_SEGMENT` variable anymore since it's generated with an async function. We just generate it in each test, not so efficient but hopefully insignificant. I had to disable `rayon` concurrent tests in the `fork_choice` tests. This is because the use of `rayon` and `block_on` was causing a panic. Co-authored-by: Mac L <mjladson@pm.me>	2022-07-03 05:36:50 +00:00
realbigsean	a7da0677d5	Remove builder redundancy (#3294 ) ## Issue Addressed This PR is a subset of the changes in #3134. Unstable will still not function correctly with the new builder spec once this is merged, #3134 should be used on testnets ## Proposed Changes - Removes redundancy in "builders" (servers implementing the builder spec) - Renames `payload-builder` flag to `builder` - Moves from old builder RPC API to new HTTP API, but does not implement the validator registration API (implemented in https://github.com/sigp/lighthouse/pull/3194) Co-authored-by: sean <seananderson33@gmail.com> Co-authored-by: realbigsean <sean@sigmaprime.io>	2022-07-01 01:15:19 +00:00
realbigsean	f6ec44f0dd	Register validator api (#3194 ) ## Issue Addressed Lays the groundwork for builder API changes by implementing the beacon-API's new `register_validator` endpoint ## Proposed Changes - Add a routine in the VC that runs on startup (re-try until success), once per epoch or whenever `suggested_fee_recipient` is updated, signing `ValidatorRegistrationData` and sending it to the BN. - TODO: `gas_limit` config options https://github.com/ethereum/builder-specs/issues/17 - BN only sends VC registration data to builders on demand, but VC registration data does update the BN's prepare proposer cache and send an updated fcU to a local EE. This is necessary for fee recipient consistency between the blinded and full block flow in the event of fallback. Having the BN only send registration data to builders on demand gives feedback directly to the VC about relay status. Also, since the BN has no ability to sign these messages anyways (so couldn't refresh them if it wanted), and validator registration is independent of the BN head, I think this approach makes sense. - Adds upcoming consensus spec changes for this PR https://github.com/ethereum/consensus-specs/pull/2884 - I initially applied the bit mask based on a configured application domain.. but I ended up just hard coding it here instead because that's how it's spec'd in the builder repo. - Should application mask appear in the api? Co-authored-by: realbigsean <sean@sigmaprime.io>	2022-06-30 00:49:21 +00:00
Pawan Dhananjay	5de00b7ee8	Unify execution layer endpoints (#3214 ) ## Issue Addressed Resolves #3069 ## Proposed Changes Unify the `eth1-endpoints` and `execution-endpoints` flags in a backwards compatible way as described in https://github.com/sigp/lighthouse/issues/3069#issuecomment-1134219221 Users have 2 options: 1. Use multiple non auth execution endpoints for deposit processing pre-merge 2. Use a single jwt authenticated execution endpoint for both execution layer and deposit processing post merge Related https://github.com/sigp/lighthouse/issues/3118 To enable jwt authenticated deposit processing, this PR removes the calls to `net_version` as the `net` namespace is not exposed in the auth server in execution clients. Moving away from using `networkId` is a good step in my opinion as it doesn't provide us with any added guarantees over `chainId`. See https://github.com/ethereum/consensus-specs/issues/2163 and https://github.com/sigp/lighthouse/issues/2115 Co-authored-by: Paul Hauner <paul@paulhauner.com>	2022-06-29 09:07:09 +00:00
Michael Sproul	53b2b500db	Extend block reward APIs (#3290 ) ## Proposed Changes Add a new HTTP endpoint `POST /lighthouse/analysis/block_rewards` which takes a vec of `BeaconBlock`s as input and outputs the `BlockReward`s for them. Augment the `BlockReward` struct with the attestation data for attestations in the block, which simplifies access to this information from blockprint. Using attestation data I've been able to make blockprint up to 95% accurate across Prysm/Lighthouse/Teku/Nimbus. I hope to go even higher using a bunch of synthetic blocks produced for Prysm/Nimbus/Lodestar, which are underrepresented in the current training data.	2022-06-29 04:50:37 +00:00
Paul Hauner	45b2eb18bc	v2.3.2-rc.0 (#3289 ) ## Issue Addressed NA ## Proposed Changes Bump versions ## Additional Info NA	2022-06-28 03:03:30 +00:00
Akihito Nakano	082ed35bdc	Test the pruning of excess peers using randomly generated input (#3248 ) ## Issue Addressed https://github.com/sigp/lighthouse/issues/3092 ## Proposed Changes Added property-based tests for the pruning implementation. A randomly generated input for the test contains connection direction, subnets, and scores. ## Additional Info I left some comments on this PR, what I have tried, and [a question](https://github.com/sigp/lighthouse/pull/3248#discussion_r891981969). Co-authored-by: Diva M <divma@protonmail.com>	2022-06-25 22:22:34 +00:00
Michael Sproul	8faaa35b58	Enable malloc metrics for the VC (#3279 ) ## Issue Addressed Following up from https://github.com/sigp/lighthouse/pull/3223#issuecomment-1158718102, it has been observed that the validator client uses vastly more memory in some compilation configurations than others. Compiling with Cross and then putting the binary into an Ubuntu 22.04 image seems to use 3x more memory than compiling with Cargo directly on Debian bullseye. ## Proposed Changes Enable malloc metrics for the validator client. This will hopefully allow us to see the difference between the two compilation configs and compare heap fragmentation. This PR doesn't enable malloc tuning for the VC because it was found to perform significantly worse. The `--disable-malloc-tuning` flag is repurposed to just disable the metrics.	2022-06-20 23:20:30 +00:00
Divma	21b3425a12	Update cargo lockfile to fix RUSTSEC-2022-0025, RUSTSEC-2022-0026 and RUSTSEC-2022-0027 (#3278 ) ## Issue Addressed Fixes [RUSTSEC-2022-0025](https://rustsec.org/advisories/RUSTSEC-2022-0025), [RUSTSEC-2022-0026](https://rustsec.org/advisories/RUSTSEC-2022-0026) and [RUSTSEC-2022-0027](https://rustsec.org/advisories/RUSTSEC-2022-0027) raised in [this test run](https://github.com/sigp/lighthouse/runs/6943343852?check_suite_focus=true) ## Proposed Changes a `cargo update` was enough ## Additional Info	2022-06-18 23:59:43 +00:00
Paul Hauner	564d7da656	v2.3.1 (#3262 ) ## Issue Addressed NA ## Proposed Changes Bump versions ## Additional Info NA	2022-06-14 05:25:38 +00:00
Divma	56b4cd88ca	minor libp2p upgrade (#3259 ) ## Issue Addressed Upgrades libp2p	2022-06-09 23:48:51 +00:00
Divma	58e223e429	update libp2p (#3233 ) ## Issue Addressed na ## Proposed Changes Updates libp2p to https://github.com/libp2p/rust-libp2p/pull/2662 ## Additional Info From comments on the relevant PRs listed, we should pay attention at peer management consistency, but I don't think anything weird will happen. This is running in prater tok and sin	2022-06-07 02:35:55 +00:00
Paul Hauner	6f732986f1	v2.3.0 (#3222 ) ## Issue Addressed NA ## Proposed Changes Please list or describe the changes introduced by this PR. ## Additional Info - Pending testing on our infra. Please do not merge	2022-05-30 01:35:10 +00:00
Paul Hauner	f4aa17ef85	v2.3.0-rc.0 (#3218 ) ## Issue Addressed NA ## Proposed Changes Bump versions ## Additional Info NA	2022-05-25 05:29:26 +00:00
Michael Sproul	6eaeaa542f	Fix Rust 1.61 clippy lints (#3192 ) ## Issue Addressed This fixes the low-hanging Clippy lints introduced in Rust 1.61 (due any hour now). It _ignores_ one lint, because fixing it requires a structural refactor of the validator client that needs to be done delicately. I've started on that refactor and will create another PR that can be reviewed in more depth in the coming days. I think we should merge this PR in the meantime to unblock CI.	2022-05-20 05:02:13 +00:00
Michael Sproul	8fa032c8ae	Run fork choice before block proposal (#3168 ) ## Issue Addressed Upcoming spec change https://github.com/ethereum/consensus-specs/pull/2878 ## Proposed Changes 1. Run fork choice at the start of every slot, and wait for this run to complete before proposing a block. 2. As an optimisation, also run fork choice 3/4 of the way through the slot (at 9s), _dequeueing attestations for the next slot_. 3. Remove the fork choice run from the state advance timer that occurred before advancing the state. ## Additional Info ### Block Proposal Accuracy This change makes us more likely to propose on top of the correct head in the presence of re-orgs with proposer boost in play. The main scenario that this change is designed to address is described in the linked spec issue. ### Attestation Accuracy This change _also_ makes us more likely to attest to the correct head. Currently in the case of a skipped slot at `slot` we only run fork choice 9s into `slot - 1`. This means the attestations from `slot - 1` aren't taken into consideration, and any boost applied to the block from `slot - 1` is not removed (it should be). In the language of the linked spec issue, this means we are liable to attest to C, even when the majority voting weight has already caused a re-org to B. ### Why remove the call before the state advance? If we've run fork choice at the start of the slot then it has already dequeued all the attestations from the previous slot, which are the only ones eligible to influence the head in the current slot. Running fork choice again is unnecessary (unless we run it for the next slot and try to pre-empt a re-org, but I don't currently think this is a great idea). ### Performance Based on Prater testing this adds about 5-25ms of runtime to block proposal times, which are 500-1000ms on average (and spike to 5s+ sometimes due to state handling issues 😢 ). I believe this is a small enough penalty to enable it by default, with the option to disable it via the new flag `--fork-choice-before-proposal-timeout 0`. Upcoming work on block packing and state representation will also reduce block production times in general, while removing the spikes. ### Implementation Fork choice gets invoked at the start of the slot via the `per_slot_task` function called from the slot timer. It then uses a condition variable to signal to block production that fork choice has been updated. This is a bit funky, but it seems to work. One downside of the timer-based approach is that it doesn't happen automatically in most of the tests. The test added by this PR has to trigger the run manually.	2022-05-20 05:02:11 +00:00
will	0428018cc1	Fix http header accept parsing problem (#3185 ) ## Issue Addressed Which issue # does this PR address? #3114 ## Proposed Changes 1. introduce `mime` package 2. Parse `Accept` field in the header with `mime` ## Additional Info Please provide any additional information. For example, future considerations or information useful for reviewers.	2022-05-18 06:50:50 +00:00
Paul Hauner	38050fa460	Allow `TaskExecutor` to be used in `async` tests (#3178 ) # Description Since the `TaskExecutor` currently requires a `Weak<Runtime>`, it's impossible to use it in an async test where the `Runtime` is created outside our scope. Whilst we could create a new `Runtime` instance inside the async test, dropping that `Runtime` would cause a panic (you can't drop a `Runtime` in an async context). To address this issue, this PR creates the `enum Handle`, which supports either: - A `Weak<Runtime>` (for use in our production code) - A `Handle` to a runtime (for use in testing) In theory, there should be no change to the behaviour of our production code (beyond some slightly different descriptions in HTTP 500 errors), or even our tests. If there is no change, you might ask "why bother?". There are two PRs (#3070 and #3175) that are waiting on these fixes to introduce some new tests. Since we've added the EL to the `BeaconChain` (for the merge), we are now doing more async stuff in tests. I've also added a `RuntimeExecutor` to the `BeaconChainTestHarness`. Whilst that's not immediately useful, it will become useful in the near future with all the new async testing.	2022-05-16 08:35:59 +00:00
Michael Sproul	bcdd960ab1	Separate execution payloads in the DB (#3157 ) ## Proposed Changes Reduce post-merge disk usage by not storing finalized execution payloads in Lighthouse's database. ⚠️ This is achieved in a backwards-incompatible way for networks that have already merged ⚠️. Kiln users and shadow fork enjoyers will be unable to downgrade after running the code from this PR. The upgrade migration may take several minutes to run, and can't be aborted after it begins. The main changes are: - New column in the database called `ExecPayload`, keyed by beacon block root. - The `BeaconBlock` column now stores blinded blocks only. - Lots of places that previously used full blocks now use blinded blocks, e.g. analytics APIs, block replay in the DB, etc. - On finalization: - `prune_abanonded_forks` deletes non-canonical payloads whilst deleting non-canonical blocks. - `migrate_db` deletes finalized canonical payloads whilst deleting finalized states. - Conversions between blinded and full blocks are implemented in a compositional way, duplicating some work from Sean's PR #3134. - The execution layer has a new `get_payload_by_block_hash` method that reconstructs a payload using the EE's `eth_getBlockByHash` call. - I've tested manually that it works on Kiln, using Geth and Nethermind. - This isn't necessarily the most efficient method, and new engine APIs are being discussed to improve this: https://github.com/ethereum/execution-apis/pull/146. - We're depending on the `ethers` master branch, due to lots of recent changes. We're also using a workaround for https://github.com/gakonst/ethers-rs/issues/1134. - Payload reconstruction is used in the HTTP API via `BeaconChain::get_block`, which is now `async`. Due to the `async` fn, the `blocking_json` wrapper has been removed. - Payload reconstruction is used in network RPC to serve blocks-by-{root,range} responses. Here the `async` adjustment is messier, although I think I've managed to come up with a reasonable compromise: the handlers take the `SendOnDrop` by value so that they can drop it on _task completion_ (after the `fn` returns). Still, this is introducing disk reads onto core executor threads, which may have a negative performance impact (thoughts appreciated). ## Additional Info - [x] For performance it would be great to remove the cloning of full blocks when converting them to blinded blocks to write to disk. I'm going to experiment with a `put_block` API that takes the block by value, breaks it into a blinded block and a payload, stores the blinded block, and then re-assembles the full block for the caller. - [x] We should measure the latency of blocks-by-root and blocks-by-range responses. - [x] We should add integration tests that stress the payload reconstruction (basic tests done, issue for more extensive tests: https://github.com/sigp/lighthouse/issues/3159) - [x] We should (manually) test the schema v9 migration from several prior versions, particularly as blocks have changed on disk and some migrations rely on being able to load blocks. Co-authored-by: Paul Hauner <paul@paulhauner.com>	2022-05-12 00:42:17 +00:00
Michael Sproul	aa72088f8f	v2.2.1 (#3149 ) ## Issue Addressed Addresses sync stalls on v2.2.0 (i.e. https://github.com/sigp/lighthouse/issues/3147). ## Additional Info I've avoided doing a full `cargo update` because I noticed there's a new patch version of libp2p and thought it could do with some more testing. Co-authored-by: Paul Hauner <paul@paulhauner.com>	2022-04-12 02:52:12 +00:00
Michael Sproul	bac7c3fa54	v2.2.0 (#3139 ) ## Proposed Changes Cut release v2.2.0 including proposer boost. ## Additional Info I also updated the clippy lints for the imminent release of Rust 1.60, although LH v2.2.0 will continue to compile using Rust 1.58 (our MSRV).	2022-04-05 02:53:09 +00:00
Michael Sproul	4d0122444b	Update and consolidate dependencies (#3136 ) ## Proposed Changes I did some gardening 🌳 in our dependency tree: - Remove duplicate versions of `warp` (git vs patch) - Remove duplicate versions of lots of small deps: `cpufeatures`, `ethabi`, `ethereum-types`, `bitvec`, `nix`, `libsecp256k1`. - Update MDBX (should resolve #3028). I tested and Lighthouse compiles on Windows 11 now. - Restore `psutil` back to upstream - Make some progress updating everything to rand 0.8. There are a few crates stuck on 0.7. Hopefully this puts us on a better footing for future `cargo audit` issues, and improves compile times slightly. ## Additional Info Some crates are held back by issues with `zeroize`. libp2p-noise depends on [`chacha20poly1305`](https://crates.io/crates/chacha20poly1305) which depends on zeroize < v1.5, and we can only have one version of zeroize because it's post 1.0 (see https://github.com/rust-lang/cargo/issues/6584). The latest version of `zeroize` is v1.5.4, which is used by the new versions of many other crates (e.g. `num-bigint-dig`). Once a new version of chacha20poly1305 is released we can update libp2p-noise and upgrade everything to the latest `zeroize` version. I've also opened a PR to `blst` related to zeroize: https://github.com/supranational/blst/pull/111	2022-04-04 00:26:16 +00:00
Michael Sproul	41e7a07c51	Add `lighthouse db` command (#3129 ) ## Proposed Changes Add a `lighthouse db` command with three initial subcommands: - `lighthouse db version`: print the database schema version. - `lighthouse db migrate --to N`: manually upgrade (or downgrade!) the database to a different version. - `lighthouse db inspect --column C`: log the key and size in bytes of every value in a given `DBColumn`. This PR lays the groundwork for other changes, namely: - Mark's fast-deposit sync (https://github.com/sigp/lighthouse/pull/2915), for which I think we should implement a database downgrade (from v9 to v8). - My `tree-states` work, which already implements a downgrade (v10 to v8). - Standalone purge commands like `lighthouse db purge-dht` per https://github.com/sigp/lighthouse/issues/2824. ## Additional Info I updated the `strum` crate to 0.24.0, which necessitated some changes in the network code to remove calls to deprecated methods. Thanks to @winksaville for the motivation, and implementation work that I used as a source of inspiration (https://github.com/sigp/lighthouse/pull/2685).	2022-04-01 00:58:59 +00:00
realbigsean	ea783360d3	Kiln mev boost (#3062 ) ## Issue Addressed MEV boost compatibility ## Proposed Changes See #2987 ## Additional Info This is blocked on the stabilization of a couple specs, [here](https://github.com/ethereum/beacon-APIs/pull/194) and [here](https://github.com/flashbots/mev-boost/pull/20). Additional TODO's and outstanding questions - [ ] MEV boost JWT Auth - [ ] Will `builder_proposeBlindedBlock` return the revealed payload for the BN to propogate - [ ] Should we remove `private-tx-proposals` flag and communicate BN <> VC with blinded blocks by default once these endpoints enter the beacon-API's repo? This simplifies merge transition logic. Co-authored-by: realbigsean <seananderson33@gmail.com> Co-authored-by: realbigsean <sean@sigmaprime.io>	2022-03-31 07:52:23 +00:00

1 2 3 4 5 ...

433 Commits