lighthouse

Author	SHA1	Message	Date
Paul Hauner	0655006e87	Clarify error log when registering validators (#3650 ) ## Issue Addressed NA ## Proposed Changes Adds clarification to an error log when there is an error submitting a validator registration. There seems to be a few cases where relays return errors during validator registration, including spurious timeouts and when a validator has been very recently activated/made pending. Changing this log helps indicate that it's "just another registration error" rather than something more serious. I didn't drop this to a `WARN` since I still have hope we can eliminate these errors completely by chatting with relays and adjusting timeouts. ## Additional Info NA	2022-11-07 06:48:31 +00:00
Divma	8600645f65	Fix rust 1.65 lints (#3682 ) ## Issue Addressed New lints for rust 1.65 ## Proposed Changes Notable change is the identification or parameters that are only used in recursion ## Additional Info na	2022-11-04 07:43:43 +00:00
ethDreamer	e8604757a2	Deposit Cache Finalization & Fast WS Sync (#2915 ) ## Summary The deposit cache now has the ability to finalize deposits. This will cause it to drop unneeded deposit logs and hashes in the deposit Merkle tree that are no longer required to construct deposit proofs. The cache is finalized whenever the latest finalized checkpoint has a new `Eth1Data` with all deposits imported. This has three benefits: 1. Improves the speed of constructing Merkle proofs for deposits as we can just replay deposits since the last finalized checkpoint instead of all historical deposits when re-constructing the Merkle tree. 2. Significantly faster weak subjectivity sync as the deposit cache can be transferred to the newly syncing node in compressed form. The Merkle tree that stores `N` finalized deposits requires a maximum of `log2(N)` hashes. The newly syncing node then only needs to download deposits since the last finalized checkpoint to have a full tree. 3. Future proofing in preparation for [EIP-4444](https://eips.ethereum.org/EIPS/eip-4444) as execution nodes will no longer be required to store logs permanently so we won't always have all historical logs available to us. ## More Details Image to illustrate how the deposit contract merkle tree evolves and finalizes along with the resulting `DepositTreeSnapshot` ![image](https://user-images.githubusercontent.com/37123614/151465302-5fc56284-8a69-4998-b20e-45db3934ac70.png) ## Other Considerations I've changed the structure of the `SszDepositCache` so once you load & save your database from this version of lighthouse, you will no longer be able to load it from older versions. Co-authored-by: ethDreamer <37123614+ethDreamer@users.noreply.github.com>	2022-10-30 04:04:24 +00:00
Divma	46fbf5b98b	Update discv5 (#3171 ) ## Issue Addressed Updates discv5 Pending on - [x] #3547 - [x] Alex upgrades his deps ## Proposed Changes updates discv5 and the enr crate. The only relevant change would be some clear indications of ipv4 usage in lighthouse ## Additional Info Functionally, this should be equivalent to the prev version. As draft pending a discv5 release	2022-10-28 05:40:06 +00:00
Kausik Das ✪	5bd1501cb1	Book spelling and grammar corrections (#3659 ) ## Issue Addressed There are few spelling and grammar errors in the book. ## Proposed Changes Corrected those spelling and grammar errors in the below files - book/src/advanced-release-candidates.md - book/src/advanced_networking.md - book/src/builders.md - book/src/key-management.md - book/src/merge-migration.md - book/src/wallet-create.md Co-authored-by: Kausik Das <kausik007007@gmail.com> Co-authored-by: Kausik Das ✪ <kausik007007@gmail.com>	2022-10-28 03:23:50 +00:00
Giulio rebuffo	f2f920dec8	Added lightclient server side containers (#3655 ) ## Issue Addressed This PR partially addresses #3651 ## Proposed Changes This PR adds the following containers types from [the lightclient specs](https://github.com/ethereum/consensus-specs/blob/dev/specs/altair/light-client/sync-protocol.md): `LightClientUpdate`, `LightClientFinalityUpdate`, `LightClientOptimisticUpdate` and `LightClientBootstrap`. It also implements the creation of each updates as delined by this [document](https://github.com/ethereum/consensus-specs/blob/dev/specs/altair/light-client/full-node.md). ## Additional Info Here is a brief description of what each of these container signify: `LightClientUpdate`: This container is only provided by server (full node) to lightclients when catching up new sync committees beetwen periods and we want possibly one lightclient update ready for each post-altair period the lighthouse node go over. it is needed in the resp/req in method `light_client_update_by_range`. `LightClientFinalityUpdate/LightClientFinalityUpdate`: Lighthouse will need only the latest of each of this kind of updates, so no need to store them in the database, we can just store the latest one of each one in memory and then just supply them via gossip or respreq, only the latest ones are served by a full node. finality updates marks the transition to a new finalized header, while optimistic updates signify new non-finalized header which are imported optimistically. `LightClientBootstrap`: This object is retrieved by lightclients during the bootstrap process after a finalized checkpoint is retrieved, ideally we want to store a LightClientBootstrap for each finalized root and then serve each of them by finalized root in respreq protocol id `light_client_bootstrap`. Little digression to how we implement the creation of each updates: the creation of a optimistic/finality update is just a version of the lightclient_update creation mechanism with less fields being set, there is underlying concept of inheritance, if you look at the specs it becomes very obvious that a lightclient update is just an extension of a finality update and a finality update an extension to an optimistic update. ## Extra note `LightClientStore` is not implemented as it is only useful as internal storage design for the lightclient side.	2022-10-28 03:23:49 +00:00
Michael Sproul	6d5a2b509f	Release v3.2.1 (#3660 ) ## Proposed Changes Patch release to include the performance regression fix https://github.com/sigp/lighthouse/pull/3658. ## Additional Info ~~Blocked on the merge of https://github.com/sigp/lighthouse/pull/3658.~~	2022-10-26 09:38:25 +00:00
Michael Sproul	77eabc5401	Revert "Optimise HTTP validator lookups" (#3658 ) ## Issue Addressed This reverts commit `ca9dc8e094` (PR #3559) with some modifications. ## Proposed Changes Unfortunately that PR introduced a performance regression in fork choice. The optimisation _intended_ to build the exit and pubkey caches on the head state _only if_ they were not already built. However, due to the head state always being cloned without these caches, we ended up building them every time the head changed, leading to a ~70ms+ penalty on mainnet. `fcfd02aeec/beacon_node/beacon_chain/src/canonical_head.rs (L633-L636)` I believe this is a severe enough regression to justify immediately releasing v3.2.1 with this change. ## Additional Info I didn't fully revert #3559, because there were some unrelated deletions of dead code in that PR which I figured we may as well keep. An alternative would be to clone the extra caches, but this likely still imposes some cost, so in the interest of applying a conservative fix quickly, I think reversion is the best approach. The optimisation from #3559 was not even optimising a particularly significant path, it was mostly for VCs running larger numbers of inactive keys. We can re-do it in the `tree-states` world where cache clones are cheap.	2022-10-26 06:50:04 +00:00
Paul Hauner	fcfd02aeec	Release v3.2.0 (#3647 ) ## Issue Addressed NA ## Proposed Changes Bump version to `v3.2.0` ## Additional Info - ~~Blocked on #3597~~ - ~~Blocked on #3645~~ - ~~Blocked on #3653~~ - ~~Requires additional testing~~	2022-10-25 06:36:51 +00:00
Divma	3a5888e53d	Ban and unban peers at the swarm level (#3653 ) ## Issue Addressed I missed this from https://github.com/sigp/lighthouse/pull/3491. peers were being banned at the behaviour level only. The identify errors are explained by this as well ## Proposed Changes Add banning and unbanning ## Additional Info Befor,e having tests that catch this was hard because the swarm was outside the behaviour. We could now have tests that prevent something like this in the future	2022-10-24 21:39:30 +00:00
Michael Sproul	dbb93cd0d2	bors: require slasher and syncing sim tests (#3645 ) ## Issue Addressed I noticed that [this build](https://github.com/sigp/lighthouse/actions/runs/3269950873/jobs/5378036501) wasn't marked failed by Bors when the `syncing-simulator-ubuntu` job failed. This is because that job is absent from the `bors.toml` config. ## Proposed Changes Add missing jobs to Bors config so that they are required: - `syncing-simulator-ubuntu` - `slasher-tests` - `disallowed-from-async-lint` The `disallowed-from-async-lint` was previously allowed to fail because it was considered beta, but I think it's stable enough now we may as well require it.	2022-10-19 22:55:50 +00:00
pinkiebell	d0efb6b18a	beacon_node: add --disable-deposit-contract-sync flag (#3597 ) Overrides any previous option that enables the eth1 service. Useful for operating a `light` beacon node. Co-authored-by: Michael Sproul <micsproul@gmail.com>	2022-10-19 22:55:49 +00:00
GeemoCandama	c5cd0d9b3f	add execution-timeout-multiplier flag to optionally increase timeouts (#3631 ) ## Issue Addressed Add flag to lengthen execution layer timeouts Which issue # does this PR address? #3607 ## Proposed Changes Added execution-timeout-multiplier flag and a cli test to ensure the execution layer config has the multiplier set correctly. Please list or describe the changes introduced by this PR. Add execution_timeout_multiplier to the execution layer config as Option<u32> and pass the u32 to HttpJsonRpc. ## Additional Info Not certain that this is the best way to implement it so I'd appreciate any feedback. Please provide any additional information. For example, future considerations or information useful for reviewers.	2022-10-18 04:02:07 +00:00
Michael Sproul	edf23bb40e	Fix attestation shuffling filter (#3629 ) ## Issue Addressed Fix a bug in block production that results in blocks with 0 attestations during the first slot of an epoch. The bug is marked by debug logs of the form: > DEBG Discarding attestation because of missing ancestor, block_root: 0x3cc00d9c9e0883b2d0db8606278f2b8423d4902f9a1ee619258b5b60590e64f8, pivot_slot: 4042591 It occurs when trying to look up the shuffling decision root for an attestation from a slot which is prior to fork choice's finalized block. This happens frequently when proposing in the first slot of the epoch where we have: - `current_epoch == n` - `attestation.data.target.epoch == n - 1` - attestation shuffling epoch `== n - 3` (decision block being the last block of `n - 3`) - `state.finalized_checkpoint.epoch == n - 2` (first block of `n - 2` is finalized) Hence the shuffling decision slot is out of range of the fork choice backwards iterator _by a single slot_. Unfortunately this bug was hidden when we weren't pruning fork choice, and then reintroduced in v2.5.1 when we fixed the pruning (https://github.com/sigp/lighthouse/releases/tag/v2.5.1). There's no way to turn that off or disable the filtering in our current release, so we need a new release to fix this issue. Fortunately, it also does not occur on every epoch boundary because of the gradual pruning of fork choice every 256 blocks (~8 epochs): `01e84b71f5/consensus/proto_array/src/proto_array_fork_choice.rs (L16)` `01e84b71f5/consensus/proto_array/src/proto_array.rs (L713-L716)` So the probability of proposing a 0-attestation block given a proposal assignment is approximately `1/32 * 1/8 = 0.39%`. ## Proposed Changes - Load the block's shuffling ID from fork choice and verify it against the expected shuffling ID of the head state. This code was initially written before we had settled on a representation of shuffling IDs, so I think it's a nice simplification to make use of them here rather than more ad-hoc logic that fundamentally does the same thing. ## Additional Info Thanks to @moshe-blox for noticing this issue and bringing it to our attention.	2022-10-18 04:02:06 +00:00
Michael Sproul	59ec6b71b8	Consensus context with proposer index caching (#3604 ) ## Issue Addressed Closes https://github.com/sigp/lighthouse/issues/2371 ## Proposed Changes Backport some changes from `tree-states` that remove duplicated calculations of the `proposer_index`. With this change the proposer index should be calculated only once for each block, and then plumbed through to every place it is required. ## Additional Info In future I hope to add more data to the consensus context that is cached on a per-epoch basis, like the effective balances of validators and the base rewards. There are some other changes to remove indexing in tests that were also useful for `tree-states` (the `tree-states` types don't implement `Index`).	2022-10-15 22:25:54 +00:00
Michael Sproul	e4cbdc1c77	Optimistic sync spec tests (v1.2.0) (#3564 ) ## Issue Addressed Implements new optimistic sync test format from https://github.com/ethereum/consensus-specs/pull/2982. ## Proposed Changes - Add parsing and runner support for the new test format. - Extend the mock EL with a set of canned responses keyed by block hash. Although this doubles up on some of the existing functionality I think it's really nice to use compared to the `preloaded_responses` or static responses. I think we could write novel new opt sync tests using these primtives much more easily than the previous ones. Forks are natively supported, and different responses to `forkchoiceUpdated` and `newPayload` are also straight-forward. ## Additional Info Blocked on merge of the spec PR and release of new test vectors.	2022-10-15 22:25:52 +00:00
Michael Sproul	ca9dc8e094	Optimise HTTP validator lookups (#3559 ) ## Issue Addressed While digging around in some logs I noticed that queries for validators by pubkey were taking 10ms+, which seemed too long. This was due to a loop through the entire validator registry for each lookup. ## Proposed Changes Rather than using a loop through the register, this PR utilises the pubkey cache which is usually initialised at the head. In case the cache isn't built, we fall back to the previous loop logic. In the vast majority of cases I expect the cache will be built, as the validator client queries at the `head` where all caches should be built. ## Additional Info I had to modify the cache build that runs after fork choice to build the pubkey cache. I think it had been optimised out, perhaps accidentally. I think it's preferable to have the exit cache and the pubkey cache built on the head state, as they are required for verifying deposits and exits respectively, and we may as well build them off the hot path of block processing. Previously they'd get built the first time a deposit or exit needed to be verified. I've deleted the unused `map_state` function which was obsoleted by `map_state_and_execution_optimistic`.	2022-10-15 22:25:51 +00:00
will	9f242137b0	Add a new bls test (#3235 ) ## Issue Addressed Which issue # does this PR address? #2629 ## Proposed Changes Please list or describe the changes introduced by this PR. 1. ci would dowload the bls test cases from https://github.com/ethereum/bls12-381-tests/ 2. all the bls test cases(except eth ones) would use cases in the archive from step one 3. The bls test cases from https://github.com/ethereum/consensus-spec-tests would stay there and no use . For the future , these bls test cases would be remove suggested from https://github.com/ethereum/consensus-spec-tests/issues/25 . So it would do no harm and compatible for future cases. ## Additional Info Please provide any additional information. For example, future considerations or information useful for reviewers. Question: I am not sure if I should implement tests about `deserialization_G1`, `deserialization_G2` and `hash_to_G2` for the issue.	2022-10-12 23:40:42 +00:00
mariuspod	242ae21e5d	Pass EL JWT secret key via cli flag (#3568 ) ## Proposed Changes In this change I've added a new beacon_node cli flag `--execution-jwt-secret-key` for passing the JWT secret directly as string. Without this flag, it was non-trivial to pass a secrets file containing a JWT secret key without compromising its contents into some management repo or fiddling around with manual file mounts for cloud-based deployments. When used in combination with environment variables, the secret can be injected into container-based systems like docker & friends quite easily. It's both possible to either specify the file_path to the JWT secret or pass the JWT secret directly. I've modified the docs and attached a test as well. ## Additional Info The logic has been adapted a bit so that either one of `--execution-jwt` or `--execution-jwt-secret-key` must be set when specifying `--execution-endpoint` so that it's still compatible with the semantics before this change and there's at least one secret provided.	2022-10-04 12:41:03 +00:00
Divma	4926e3967f	[DEV FEATURE] Deterministic long lived subnets (#3453 ) ## Issue Addressed #2847 ## Proposed Changes Add under a feature flag the required changes to subscribe to long lived subnets in a deterministic way ## Additional Info There is an additional required change that is actually searching for peers using the prefix, but I find that it's best to make this change in the future	2022-10-04 10:37:48 +00:00
GeemoCandama	6a92bf70e4	CLI tests for logging flags (#3609 ) ## Issue Addressed Adding CLI tests for logging flags: log-color and disable-log-timestamp Which issue # does this PR address? #3588 ## Proposed Changes Add CLI tests for logging flags as described in #3588 Please list or describe the changes introduced by this PR. Added logger_config to client::Config as suggested. Implemented Default for LoggerConfig based on what was being done elsewhere in the repo. Created 2 tests for each flag addressed. ## Additional Info Please provide any additional information. For example, future considerations or information useful for reviewers.	2022-10-04 08:33:40 +00:00
Pawan Dhananjay	8728c40102	Remove fallback support from eth1 service (#3594 ) ## Issue Addressed N/A ## Proposed Changes With https://github.com/sigp/lighthouse/pull/3214 we made it such that you can either have 1 auth endpoint or multiple non auth endpoints. Now that we are post merge on all networks (testnets and mainnet), we cannot progress a chain without a dedicated auth execution layer connection so there is no point in having a non-auth eth1-endpoint for syncing deposit cache. This code removes all fallback related code in the eth1 service. We still keep the single non-auth endpoint since it's useful for testing. ## Additional Info This removes all eth1 fallback related metrics that were relevant for the monitoring service, so we might need to change the api upstream.	2022-10-04 08:33:39 +00:00
Michael Sproul	58bd2f76d0	Ensure protoc is installed for release CI (#3621 ) ## Issue Addressed The release CI is currently broken due to the addition of the `protoc` dependency. Here's a failure of the release flow running on my fork: https://github.com/michaelsproul/lighthouse/actions/runs/3155541478/jobs/5134317334 ## Proposed Changes - Install `protoc` on Windows and Mac so that it's available for `cargo install`. - Install an x86_64 binary in the Cross image for the aarch64 platform: we need a binary that runs on the host, _not_ on the target. - Fix `macos` local testnet CI by using the Github API key to dodge rate limiting (this issue: https://github.com/actions/runner-images/issues/602).	2022-10-03 23:09:25 +00:00
Marius Kjærstad	ff145b986f	Changed http:// to https:// on mailing list link (#3610 ) Changed http:// to https:// on mailing list link in README.md Co-authored-by: Michael Sproul <micsproul@gmail.com>	2022-09-29 06:13:35 +00:00
Michael Sproul	f77e3bc0ad	Add maxperf build profile (#3608 ) ## Proposed Changes Add a new Cargo compilation profile called `maxperf` which enables more aggressive compiler optimisations at the expense of compilation time. Some rough initial benchmarks show that this can provide up to a 25% reduction to run time for CPU bound tasks like block processing: https://docs.google.com/spreadsheets/d/15jHuZe7lLHhZq9Nw8kc6EL0Qh_N_YAYqkW2NQ_Afmtk/edit The numbers in that spreadsheet compare the `consensus-context` branch from #3604 to the same branch compiled with the `maxperf` profile using: ``` PROFILE=maxperf make install-lcli ``` ## Additional Info The downsides of the maxperf profile are: - It increases compile times substantially, which will particularly impact low-spec hardware. Compiling `lcli` is about 3x slower. Compiling Lighthouse is about 5x slower on my 5950X: 17m 38s rather than 3m 28s. As a result I think we should not enable this everywhere by default. - Option 1: enable by default for our released binaries. This gives the majority of users the fastest version of `lighthouse` possible, at the expense of slowing down our release CI. Source builds will continue to use the default `release` profile unless users opt-in to `maxperf`. - Option 2: enable by default for source builds. This gives users building from source an edge, but makes them pay for it with compilation time. I think I would prefer Option 1. I'll try doing some benchmarking to see how long a maxperf build of Lighthouse would take on GitHub actions. Credit to Nicholas Nethercote for documenting these options in the Rust Performance Book: https://nnethercote.github.io/perf-book/build-configuration.html.	2022-09-29 06:13:33 +00:00
tim gretler	8d325e700b	Use #!/usr/bin/env everywhere for local testnets (#3606 ) Full local testnet support for people that don't have `/bin/bash`	2022-09-29 06:13:30 +00:00
Age Manning	27bb9ff07d	Handle Lodestar's new agent string (#3620 ) ## Issue Addressed #3561 ## Proposed Changes Recognize Lodestars new agent string and appropriately count these peers as lodestar peers.	2022-09-29 01:50:13 +00:00
Age Manning	01b6bf7a2d	Improve logging a little (#3619 ) Some of the logs in combination with others could be improved. It will save some time debugging by improving the wording slightly.	2022-09-29 01:50:12 +00:00
Divma	b1d2510d1b	Libp2p v0.48.0 upgrade (#3547 ) ## Issue Addressed Upgrades libp2p to v.0.47.0. This is the compilation of - [x] #3495 - [x] #3497 - [x] #3491 - [x] #3546 - [x] #3553 Co-authored-by: Age Manning <Age@AgeManning.com>	2022-09-29 01:50:11 +00:00
Pawan Dhananjay	6779912fe4	Publish subscriptions to all beacon nodes (#3529 ) ## Issue Addressed Resolves #3516 ## Proposed Changes Adds a beacon fallback function for running a beacon node http query on all available fallbacks instead of returning on a first successful result. Uses the new `run_on_all` method for attestation and sync committee subscriptions. ## Additional Info Please provide any additional information. For example, future considerations or information useful for reviewers.	2022-09-28 19:53:35 +00:00
Michael Sproul	abcebf276f	Add guide to MEV logs (#3611 ) ## Proposed Changes Add some docs on checking the builder configuration, which is a frequently asked question on Discord. ## Additional Info My text editor also insisted on stripping some trailing newlines, but can put 'em back if we want	2022-09-28 17:45:09 +00:00
Paul Hauner	01e84b71f5	v3.1.2 (#3603 ) ## Issue Addressed NA ## Proposed Changes Bump versions to v3.1.2 ## Additional Info - ~~Blocked on several PRs.~~ - ~~Requires further testing.~~	2022-09-26 01:17:36 +00:00
Divma	bd873e7162	New rust lints for rustc 1.64.0 (#3602 ) ## Issue Addressed fixes lints from the last rust release ## Proposed Changes Fix the lints, most of the lints by `clippy::question-mark` are false positives in the form of https://github.com/rust-lang/rust-clippy/issues/9518 so it's allowed for now ## Additional Info	2022-09-23 03:52:46 +00:00
Divma	9bd384a573	send attnet unsubscription event on random subnet expiry (#3600 ) ## Issue Addressed 🐞 in which we don't actually unsubscribe from a random long lived subnet when it expires ## Proposed Changes Remove code addressing a specific case in which we are subscribed to all subnets and handle the removal of the long lived subnet. I don't think the special case code is particularly important as, if someone is running with that many validators to be subscribed to all subnets, it should use `--subscribe-all-subnets` instead ## Additional Info Noticed on some test nodes climbing bandwidth usage periodically (around 27hours, the time of subnet expirations) I'm running this code to test this does not happen anymore, but I think it should be good now	2022-09-23 03:52:45 +00:00
Paul Hauner	9246a92d76	Make garbage collection test less failure prone (#3599 ) ## Issue Addressed NA ## Proposed Changes This PR attempts to fix the following spurious CI failure: ``` ---- store_tests::garbage_collect_temp_states_from_failed_block stdout ---- thread 'store_tests::garbage_collect_temp_states_from_failed_block' panicked at 'disk store should initialize: DBError { message: "Error { message: \"IO error: lock /tmp/.tmp6DcBQ9/cold_db/LOCK: already held by process\" }" }', beacon_node/beacon_chain/tests/store_tests.rs:59:10 ``` I believe that some async task is taking a clone of the store and holding it in some other thread for a short time. This creates a race-condition when we try to open a new instance of the store. ## Additional Info NA	2022-09-23 03:52:44 +00:00
Pawan Dhananjay	3a3dddc5fb	Fix ee integration tests (#3592 ) ## Issue Addressed Resolves #3573 ## Proposed Changes Fix the bytecode for the deposit contract deployment transaction and value for deposit transaction in the execution engine integration tests. Also verify that all published transaction make it to the execution payload and have a valid status.	2022-09-23 03:52:43 +00:00
Paul Hauner	fa6ad1a11a	Deduplicate block root computation (#3590 ) ## Issue Addressed NA ## Proposed Changes This PR removes duplicated block root computation. Computing the `SignedBeaconBlock::canonical_root` has become more expensive since the merge as we need to compute the merke root of each transaction inside an `ExecutionPayload`. Computing the root for [a mainnet block](https://beaconcha.in/slot/4704236) is taking ~10ms on my i7-8700K CPU @ 3.70GHz (no sha extensions). Given that our median seen-to-imported time for blocks is presently 300-400ms, removing a few duplicated block roots (~30ms) could represent an easy 10% improvement. When we consider that the seen-to-imported times include operations after the block has been placed in the early attester cache, we could expect the 30ms to be more significant WRT our seen-to-attestable times. ## Additional Info NA	2022-09-23 03:52:42 +00:00
Ramana Kumar	76ba0a1aaf	Add disable-log-timestamp flag (#3101 ) (#3586 ) ## Issues Addressed Closes https://github.com/sigp/lighthouse/issues/3101 ## Proposed Changes Add global flag to suppress timestamps in the terminal logger.	2022-09-23 03:52:41 +00:00
Paul Hauner	3128b5b430	v3.1.1 (#3585 ) ## Issue Addressed NA ## Proposed Changes Bump versions ## Additional Info - ~~Requires additional testing~~ - ~~Blocked on:~~ - ~~#3589~~ - ~~#3540~~ - ~~#3587~~	2022-09-22 06:08:52 +00:00
Paul Hauner	dadbd69eec	Fix concurrency issue with `oneshot_broadcast` (#3596 ) ## Issue Addressed NA ## Proposed Changes Fixes an issue found during testing with #3595. ## Additional Info NA	2022-09-21 10:52:14 +00:00
Paul Hauner	96692b8e43	Impl `oneshot_broadcast` for committee promises (#3595 ) ## Issue Addressed NA ## Proposed Changes Fixes an issue introduced in #3574 where I erroneously assumed that a `crossbeam_channel` multiple receiver queue was a broadcast queue. This is incorrect, each message will be received by only one receiver. The effect of this mistake is these logs: ``` Sep 20 06:56:17.001 INFO Synced slot: 4736079, block: 0xaa8a…180d, epoch: 148002, finalized_epoch: 148000, finalized_root: 0x2775…47f2, exec_hash: 0x2ca5…ffde (verified), peers: 6, service: slot_notifier Sep 20 06:56:23.237 ERRO Unable to validate attestation error: CommitteeCacheWait(RecvError), peer_id: 16Uiu2HAm2Jnnj8868tb7hCta1rmkXUf5YjqUH1YPj35DCwNyeEzs, type: "aggregated", slot: Slot(4736047), beacon_block_root: 0x88d318534b1010e0ebd79aed60b6b6da1d70357d72b271c01adf55c2b46206c1 ``` ## Additional Info NA	2022-09-21 01:01:50 +00:00
Paul Hauner	a95bcba2ab	Avoid holding write-lock whilst waiting on shuffling cache promise (#3589 ) ## Issue Addressed NA ## Proposed Changes Fixes a bug which hogged the write-lock for the `shuffling_cache`. ## Additional Info NA	2022-09-19 07:58:50 +00:00
Michael Sproul	507bb9dad4	Refined payload pruning (#3587 ) ## Proposed Changes Improve the payload pruning feature in several ways: - Payload pruning is now entirely optional. It is enabled by default but can be disabled with `--prune-payloads false`. The previous `--prune-payloads-on-startup` flag from #3565 is removed. - Initial payload pruning on startup now runs in a background thread. This thread will always load the split state, which is a small fraction of its total work (up to ~300ms) and then backtrack from that state. This pruning process ran in 2m5s on one Prater node with good I/O and 16m on a node with slower I/O. - To work with the optional payload pruning the database function `try_load_full_block` will now attempt to load execution payloads for finalized slots _if_ pruning is currently disabled. This gives users an opt-out for the extensive traffic between the CL and EL for reconstructing payloads. ## Additional Info If the `prune-payloads` flag is toggled on and off then the on-startup check may not see any payloads to delete and fail to clean them up. In this case the `lighthouse db prune_payloads` command should be used to force a manual sweep of the database.	2022-09-19 07:58:49 +00:00
Michael Sproul	f2ac0738d8	Implement `skip_randao_verification` and blinded block rewards API (#3540 ) ## Issue Addressed https://github.com/ethereum/beacon-APIs/pull/222 ## Proposed Changes Update Lighthouse's randao verification API to match the `beacon-APIs` spec. We implemented the API before spec stabilisation, and it changed slightly in the course of review. Rather than a flag `verify_randao` taking a boolean value, the new API uses a `skip_randao_verification` flag which takes no argument. The new spec also requires the randao reveal to be present and equal to the point-at-infinity when `skip_randao_verification` is set. I've also updated the `POST /lighthouse/analysis/block_rewards` API to take blinded blocks as input, as the execution payload is irrelevant and we may want to assess blocks produced by builders. ## Additional Info This is technically a breaking change, but seeing as I suspect I'm the only one using these parameters/APIs, I think we're OK to include this in a patch release.	2022-09-19 07:58:48 +00:00
Michael Sproul	ca42ef2e5a	Prune finalized execution payloads (#3565 ) ## Issue Addressed Closes https://github.com/sigp/lighthouse/issues/3556 ## Proposed Changes Delete finalized execution payloads from the database in two places: 1. When running the finalization migration in `migrate_database`. We delete the finalized payloads between the last split point and the new updated split point. _If_ payloads are already pruned prior to this then this is sufficient to prune _all_ payloads as non-canonical payloads are already deleted by the head pruner, and all canonical payloads prior to the previous split will already have been pruned. 2. To address the fact that users will update to this code _after_ the merge on mainnet (and testnets), we need a one-off scan to delete the finalized payloads from the canonical chain. This is implemented in `try_prune_execution_payloads` which runs on startup and scans the chain back to the Bellatrix fork or the anchor slot (if checkpoint synced after Bellatrix). In the case where payloads are already pruned this check only imposes a single state load for the split state, which shouldn't be _too slow_. Even so, a flag `--prepare-payloads-on-startup=false` is provided to turn this off after it has run the first time, which provides faster start-up times. There is also a new `lighthouse db prune_payloads` subcommand for users who prefer to run the pruning manually. ## Additional Info The tests have been updated to not rely on finalized payloads in the database, instead using the `MockExecutionLayer` to reconstruct them. Additionally a check was added to `check_chain_dump` which asserts the non-existence or existence of payloads on disk depending on their slot.	2022-09-17 02:27:01 +00:00
Michael Sproul	5b2843c2cd	Pre-allocate vectors in SSZ decoding (#3417 ) ## Issue Addressed Fixes a potential regression in memory fragmentation identified by @paulhauner here: https://github.com/sigp/lighthouse/pull/3371#discussion_r931770045. ## Proposed Changes Immediately allocate a vector with sufficient size to hold all decoded elements in SSZ decoding. The `size_hint` is derived from the range iterator here: `2983235650/consensus/ssz/src/decode/impls.rs (L489)` ## Additional Info I'd like to test this out on some infra for a substantial duration to see if it affects total fragmentation.	2022-09-16 11:54:17 +00:00
Paul Hauner	b0b606dabe	Use `SmallVec` for `TreeHash` packed encoding (#3581 ) ## Issue Addressed NA ## Proposed Changes I've noticed that our block hashing times increase significantly after the merge. I did some flamegraph-ing and noticed that we're allocating a `Vec` for each byte of each execution payload transaction. This seems like unnecessary work and a bit of a fragmentation risk. This PR switches to `SmallVec<[u8; 32]>` for the packed encoding of `TreeHash`. I believe this is a nice simple optimisation with no downside. ### Benchmarking These numbers were computed using #3580 on my desktop (i7 hex-core). You can see a bit of noise in the numbers, that's probably just my computer doing other things. Generally I found this change takes the time from 10-11ms to 8-9ms. I can also see all the allocations disappear from flamegraph. This is the block being benchmarked: https://beaconcha.in/slot/4704236 #### Before ``` [2022-09-15T21:44:19Z INFO lcli::block_root] Run 980: 10.553003ms [2022-09-15T21:44:19Z INFO lcli::block_root] Run 981: 10.563737ms [2022-09-15T21:44:19Z INFO lcli::block_root] Run 982: 10.646352ms [2022-09-15T21:44:19Z INFO lcli::block_root] Run 983: 10.628532ms [2022-09-15T21:44:19Z INFO lcli::block_root] Run 984: 10.552112ms [2022-09-15T21:44:19Z INFO lcli::block_root] Run 985: 10.587778ms [2022-09-15T21:44:19Z INFO lcli::block_root] Run 986: 10.640526ms [2022-09-15T21:44:19Z INFO lcli::block_root] Run 987: 10.587243ms [2022-09-15T21:44:19Z INFO lcli::block_root] Run 988: 10.554748ms [2022-09-15T21:44:19Z INFO lcli::block_root] Run 989: 10.551111ms [2022-09-15T21:44:19Z INFO lcli::block_root] Run 990: 11.559031ms [2022-09-15T21:44:19Z INFO lcli::block_root] Run 991: 11.944827ms [2022-09-15T21:44:19Z INFO lcli::block_root] Run 992: 10.554308ms [2022-09-15T21:44:19Z INFO lcli::block_root] Run 993: 11.043397ms [2022-09-15T21:44:19Z INFO lcli::block_root] Run 994: 11.043315ms [2022-09-15T21:44:19Z INFO lcli::block_root] Run 995: 11.207711ms [2022-09-15T21:44:19Z INFO lcli::block_root] Run 996: 11.056246ms [2022-09-15T21:44:19Z INFO lcli::block_root] Run 997: 11.049706ms [2022-09-15T21:44:19Z INFO lcli::block_root] Run 998: 11.432449ms [2022-09-15T21:44:19Z INFO lcli::block_root] Run 999: 11.149617ms ``` #### After ``` [2022-09-15T21:41:49Z INFO lcli::block_root] Run 980: 14.011653ms [2022-09-15T21:41:49Z INFO lcli::block_root] Run 981: 8.925314ms [2022-09-15T21:41:49Z INFO lcli::block_root] Run 982: 8.849563ms [2022-09-15T21:41:49Z INFO lcli::block_root] Run 983: 8.893689ms [2022-09-15T21:41:49Z INFO lcli::block_root] Run 984: 8.902964ms [2022-09-15T21:41:49Z INFO lcli::block_root] Run 985: 8.942067ms [2022-09-15T21:41:49Z INFO lcli::block_root] Run 986: 8.907088ms [2022-09-15T21:41:49Z INFO lcli::block_root] Run 987: 9.346101ms [2022-09-15T21:41:49Z INFO lcli::block_root] Run 988: 8.96142ms [2022-09-15T21:41:49Z INFO lcli::block_root] Run 989: 9.366437ms [2022-09-15T21:41:49Z INFO lcli::block_root] Run 990: 9.809334ms [2022-09-15T21:41:49Z INFO lcli::block_root] Run 991: 9.541561ms [2022-09-15T21:41:49Z INFO lcli::block_root] Run 992: 11.143518ms [2022-09-15T21:41:49Z INFO lcli::block_root] Run 993: 10.821181ms [2022-09-15T21:41:49Z INFO lcli::block_root] Run 994: 9.855973ms [2022-09-15T21:41:49Z INFO lcli::block_root] Run 995: 10.941006ms [2022-09-15T21:41:49Z INFO lcli::block_root] Run 996: 9.596155ms [2022-09-15T21:41:49Z INFO lcli::block_root] Run 997: 9.121739ms [2022-09-15T21:41:49Z INFO lcli::block_root] Run 998: 9.090019ms [2022-09-15T21:41:49Z INFO lcli::block_root] Run 999: 9.071885ms ``` ## Additional Info Please provide any additional information. For example, future considerations or information useful for reviewers.	2022-09-16 08:54:06 +00:00
Paul Hauner	bde3c168e2	Add `lcli block-root` tool (#3580 ) ## Issue Addressed NA ## Proposed Changes Adds a simple tool for computing the block root of some block from a beacon-API or a file. This is useful for benchmarking. ## Additional Info NA	2022-09-16 08:54:04 +00:00
Paul Hauner	2cd3e3a768	Avoid duplicate committee cache loads (#3574 ) ## Issue Addressed NA ## Proposed Changes I have observed scenarios on Goerli where Lighthouse was receiving attestations which reference the same, un-cached shuffling on multiple threads at the same time. Lighthouse was then loading the same state from database and determining the shuffling on multiple threads at the same time. This is unnecessary load on the disk and RAM. This PR modifies the shuffling cache so that each entry can be either: - A committee - A promise for a committee (i.e., a `crossbeam_channel::Receiver`) Now, in the scenario where we have thread A and thread B simultaneously requesting the same un-cached shuffling, we will have the following: 1. Thread A will take the write-lock on the shuffling cache, find that there's no cached committee and then create a "promise" (a `crossbeam_channel::Sender`) for a committee before dropping the write-lock. 1. Thread B will then be allowed to take the write-lock for the shuffling cache and find the promise created by thread A. It will block the current thread waiting for thread A to fulfill that promise. 1. Thread A will load the state from disk, obtain the shuffling, send it down the channel, insert the entry into the cache and then continue to verify the attestation. 1. Thread B will then receive the shuffling from the receiver, be un-blocked and then continue to verify the attestation. In the case where thread A fails to generate the shuffling and drops the sender, the next time that specific shuffling is requested we will detect that the channel is disconnected and return a `None` entry for that shuffling. This will cause the shuffling to be re-calculated. ## Additional Info NA	2022-09-16 08:54:03 +00:00
Paul Hauner	7d3948c8fe	Add metric for re-org distance (#3566 ) ## Issue Addressed NA ## Proposed Changes Add a metric to track the re-org distance. ## Additional Info NA	2022-09-13 17:19:27 +00:00

1 2 3 4 5 ...

4734 Commits