lighthouse

Author	SHA1	Message	Date
realbigsean	a62e52f319	Single blob lookups (#4152 ) * some blob reprocessing work * remove ForceBlockLookup * reorder enum match arms in sync manager * a lot more reprocessing work * impl logic for triggerng blob lookups along with block lookups * deal with rpc blobs in groups per block in the da checker. don't cache missing blob ids in the da checker. * make single block lookup generic * more work * add delayed processing logic and combine some requests * start fixing some compile errors * fix compilation in main block lookup mod * much work * get things compiling * parent blob lookups * fix compile * revert red/stevie changes * fix up sync manager delay message logic * add peer usefulness enum * should remove lookup refactor * consolidate retry error handling * improve peer scoring during certain failures in parent lookups * improve retry code * drop parent lookup if either req has a peer disconnect during download * refactor single block processed method * processing peer refactor * smol bugfix * fix some todos * fix lints * fix lints * fix compile in lookup tests * fix lints * fix lints * fix existing block lookup tests * renamings * fix after merge * cargo fmt * compilation fix in beacon chain tests * fix * refactor lookup tests to work with multiple forks and response types * make tests into macros * wrap availability check error * fix compile after merge * add random blobs * start fixing up lookup verify error handling * some bug fixes and the start of deneb only tests * make tests work for all forks * track information about peer source * error refactoring * improve peer scoring * fix test compilation * make sure blobs are sent for processing after stream termination, delete copied tests * add some tests and fix a bug * smol bugfixes and moar tests * add tests and fix some things * compile after merge * lots of refactoring * retry on invalid block/blob * merge unknown parent messages before current slot lookup * get tests compiling * penalize blob peer on invalid blobs * Check disk on in-memory cache miss * Update beacon_node/beacon_chain/src/data_availability_checker/overflow_lru_cache.rs * Update beacon_node/network/src/sync/network_context.rs Co-authored-by: Divma <26765164+divagant-martian@users.noreply.github.com> * fix bug in matching blocks and blobs in range sync * pr feedback * fix conflicts * upgrade logs from warn to crit when we receive incorrect response in range * synced_and_connected_within_tolerance -> should_search_for_block * remove todo * Fix Broken Overflow Tests * fix merge conflicts * checkpoint sync without alignment * add import * query for checkpoint state by slot rather than state root (teku doesn't serve by state root) * get state first and query by most recent block root * simplify delay logic * rename unknown parent sync message variants * rename parameter, block_slot -> slot * add some docs to the lookup module * use interval instead of sleep * drop request if blocks and blobs requests both return `None` for `Id` * clean up `find_single_lookup` logic * add lookup source enum * clean up `find_single_lookup` logic * add docs to find_single_lookup_request * move LookupSource our of param where unnecessary * remove unnecessary todo * query for block by `state.latest_block_header.slot` * fix lint * fix test * fix test * fix observed blob sidecars test * PR updates * use optional params instead of a closure * create lookup and trigger request in separate method calls * remove `LookupSource` * make sure duplicate lookups are not dropped --------- Co-authored-by: Pawan Dhananjay <pawandhananjay@gmail.com> Co-authored-by: Mark Mackey <mark@sigmaprime.io> Co-authored-by: Divma <26765164+divagant-martian@users.noreply.github.com>	2023-06-15 12:59:10 -04:00
Diva M	3c1a22ceaf	Merge commit '1e029ce5384e911390a513e2d1885532f34a8b2b' into eip4844	2023-04-04 11:56:54 -05:00
Jimmy Chen	2de3451011	Rate limiting backfill sync (#3936 ) ## Issue Addressed #3212 ## Proposed Changes - Introduce a new `rate_limiting_backfill_queue` - any new inbound backfill work events gets immediately sent to this FIFO queue without any processing - Spawn a `backfill_scheduler` routine that pops a backfill event from the FIFO queue at specified intervals (currently halfway through a slot, or at 6s after slot start for 12s slots) and sends the event to `BeaconProcessor` via a `scheduled_backfill_work_tx` channel - This channel gets polled last in the `InboundEvents`, and work event received is wrapped in a `InboundEvent::ScheduledBackfillWork` enum variant, which gets processed immediately or queued by the `BeaconProcessor` (existing logic applies from here) Diagram comparing backfill processing with / without rate-limiting: https://github.com/sigp/lighthouse/issues/3212#issuecomment-1386249922 See this comment for @paulhauner's explanation and solution: https://github.com/sigp/lighthouse/issues/3212#issuecomment-1384674956 ## Additional Info I've compared this branch (with backfill processing rate limited to to 1 and 3 batches per slot) against the latest stable version. The CPU usage during backfill sync is reduced by ~5% - 20%, more details on this page: https://hackmd.io/@jimmygchen/SJuVpJL3j The above testing is done on Goerli (as I don't currently have hardware for Mainnet), I'm guessing the differences are likely to be bigger on mainnet due to block size. ### TODOs - [x] Experiment with processing multiple batches per slot. (need to think about how to do this for different slot durations) - [x] Add option to disable rate-limiting, enabed by default. - [x] (No longer required now we're reusing the reprocessing queue) Complete the `backfill_scheduler` task when backfill sync is completed or not required	2023-04-03 03:02:55 +00:00
Diva M	d93753cc88	Merge branch 'unstable' into off-4844	2023-03-02 15:38:00 -05:00
Jimmy Chen	245e922c7b	Improve testing slot clock to allow manipulation of time in tests (#3974 ) ## Issue Addressed I discovered this issue while implementing [this test](https://github.com/jimmygchen/lighthouse/blob/test-example/beacon_node/network/src/beacon_processor/tests.rs#L895), where I tried to manipulate the slot clock with: `rig.chain.slot_clock.set_current_time(duration);` however the change doesn't get reflected in the `slot_clock` in `ReprocessQueue`, and I realised `slot_clock` was cloned a few times in the code, and therefore changing the time in `rig.chain.slot_clock` doesn't have any effect in `ReprocessQueue`. I've incorporated the suggestion from the @paulhauner and @michaelsproul - wrapping the `ManualSlotClock.current_time` (`RwLock<Duration>)` in an `Arc`, and the above test now passes. Let's see if this breaks any existing tests :)	2023-02-16 23:34:32 +00:00
Emilia Hane	13efd47238	fixup! Disable use of system time in tests	2023-02-15 09:20:30 +01:00
Emilia Hane	e9e198a2b6	Fix conflicts rebasing eip4844	2023-02-10 09:41:23 +01:00
Emilia Hane	d292a3a6a8	Fix conflicts rebasing eip4844	2023-02-10 09:41:23 +01:00
Michael Sproul	4d0122444b	Update and consolidate dependencies (#3136 ) ## Proposed Changes I did some gardening 🌳 in our dependency tree: - Remove duplicate versions of `warp` (git vs patch) - Remove duplicate versions of lots of small deps: `cpufeatures`, `ethabi`, `ethereum-types`, `bitvec`, `nix`, `libsecp256k1`. - Update MDBX (should resolve #3028). I tested and Lighthouse compiles on Windows 11 now. - Restore `psutil` back to upstream - Make some progress updating everything to rand 0.8. There are a few crates stuck on 0.7. Hopefully this puts us on a better footing for future `cargo audit` issues, and improves compile times slightly. ## Additional Info Some crates are held back by issues with `zeroize`. libp2p-noise depends on [`chacha20poly1305`](https://crates.io/crates/chacha20poly1305) which depends on zeroize < v1.5, and we can only have one version of zeroize because it's post 1.0 (see https://github.com/rust-lang/cargo/issues/6584). The latest version of `zeroize` is v1.5.4, which is used by the new versions of many other crates (e.g. `num-bigint-dig`). Once a new version of chacha20poly1305 is released we can update libp2p-noise and upgrade everything to the latest `zeroize` version. I've also opened a PR to `blst` related to zeroize: https://github.com/supranational/blst/pull/111	2022-04-04 00:26:16 +00:00
Michael Sproul	5e1f8a8480	Update to Rust 1.59 and 2021 edition (#3038 ) ## Proposed Changes Lots of lint updates related to `flat_map`, `unwrap_or_else` and string patterns. I did a little more creative refactoring in the op pool, but otherwise followed Clippy's suggestions. ## Additional Info We need this PR to unblock CI.	2022-02-25 00:10:17 +00:00
Paul Hauner	61f60bdf03	Avoid penalizing peers for delays during processing (#2894 ) ## Issue Addressed NA ## Proposed Changes We have observed occasions were under-resourced nodes will receive messages that were valid at the time, but later become invalidated due to long waits for a `BeaconProcessor` worker. In this PR, we will check to see if the message was valid at the time of receipt. If it was initially valid but invalid now, we just ignore the message without penalizing the peer. ## Additional Info NA	2022-01-12 02:36:24 +00:00
realbigsean	b22ac95d7f	v1.1.6 Fork Choice changes (#2822 ) ## Issue Addressed Resolves: https://github.com/sigp/lighthouse/issues/2741 Includes: https://github.com/sigp/lighthouse/pull/2853 so that we can get ssz static tests passing here on v1.1.6. If we want to merge that first, we can make this diff slightly smaller ## Proposed Changes - Changes the `justified_epoch` and `finalized_epoch` in the `ProtoArrayNode` each to an `Option<Checkpoint>`. The `Option` is necessary only for the migration, so not ideal. But does allow us to add a default logic to `None` on these fields during the database migration. - Adds a database migration from a legacy fork choice struct to the new one, search for all necessary block roots in fork choice by iterating through blocks in the db. - updates related to https://github.com/ethereum/consensus-specs/pull/2727 - We will have to update the persisted forkchoice to make sure the justified checkpoint stored is correct according to the updated fork choice logic. This boils down to setting the forkchoice store's justified checkpoint to the justified checkpoint of the block that advanced the finalized checkpoint to the current one. - AFAICT there's no migration steps necessary for the update to allow applying attestations from prior blocks, but would appreciate confirmation on that - I updated the consensus spec tests to v1.1.6 here, but they will fail until we also implement the proposer score boost updates. I confirmed that the previously failing scenario `new_finalized_slot_is_justified_checkpoint_ancestor` will now pass after the boost updates, but haven't confirmed _all_ tests will pass because I just quickly stubbed out the proposer boost test scenario formatting. - This PR now also includes proposer boosting https://github.com/ethereum/consensus-specs/pull/2730 ## Additional Info I realized checking justified and finalized roots in fork choice makes it more likely that we trigger this bug: https://github.com/ethereum/consensus-specs/pull/2727 It's possible the combination of justified checkpoint and finalized checkpoint in the forkchoice store is different from in any block in fork choice. So when trying to startup our store's justified checkpoint seems invalid to the rest of fork choice (but it should be valid). When this happens we get an `InvalidBestNode` error and fail to start up. So I'm including that bugfix in this branch. Todo: - [x] Fix fork choice tests - [x] Self review - [x] Add fix for https://github.com/ethereum/consensus-specs/pull/2727 - [x] Rebase onto Kintusgi - [x] Fix `num_active_validators` calculation as @michaelsproul pointed out - [x] Clean up db migrations Co-authored-by: realbigsean <seananderson33@gmail.com>	2021-12-13 20:43:22 +00:00
Paul Hauner	144978f8f8	Remove duplicate slot_clock method (#2842 )	2021-12-02 14:29:59 +11:00
ethDreamer	1563bce905	Finished Gossip Block Validation Conditions (#2640 ) * Gossip Block Validation is Much More Efficient Co-authored-by: realbigsean <seananderson33@gmail.com>	2021-12-02 14:26:51 +11:00
Pawan Dhananjay	5a3bcd2904	Validator monitor support for sync committees (#2476 ) ## Issue Addressed N/A ## Proposed Changes Add functionality in the validator monitor to provide sync committee related metrics for monitored validators. Co-authored-by: Michael Sproul <michael@sigmaprime.io>	2021-08-31 23:31:36 +00:00
Paul Hauner	c1203f5e52	Add specific log and metric for delayed blocks (#2308 ) ## Issue Addressed NA ## Proposed Changes - Adds a specific log and metric for when a block is enshrined as head with a delay that will caused bad attestations - We technically already expose this information, but it's a little tricky to determine during debugging. This makes it nice and explicit. - Fixes a minor reporting bug with the validator monitor where it was expecting agg. attestations too early (at half-slot rather than two-thirds-slot). ## Additional Info NA	2021-04-13 02:16:59 +00:00
Paul Hauner	a764c3b247	Handle early blocks (#2155 ) ## Issue Addressed NA ## Problem this PR addresses There's an issue where Lighthouse is banning a lot of peers due to the following sequence of events: 1. Gossip block 0xabc arrives ~200ms early - It is propagated across the network, with respect to [`MAXIMUM_GOSSIP_CLOCK_DISPARITY`](https://github.com/ethereum/eth2.0-specs/blob/v1.0.0/specs/phase0/p2p-interface.md#why-is-there-maximum_gossip_clock_disparity-when-validating-slot-ranges-of-messages-in-gossip-subnets). - However, it is not imported to our database since the block is early. 2. Attestations for 0xabc arrive, but the block was not imported. - The peer that sent the attestation is down-voted. - Each unknown-block attestation causes a score loss of 1, the peer is banned at -100. - When the peer is on an attestation subnet there can be hundreds of attestations, so the peer is banned quickly (before the missed block can be obtained via rpc). ## Potential solutions I can think of three solutions to this: 1. Wait for attestation-queuing (#635) to arrive and solve this. - Easy - Not immediate fix. - Whilst this would work, I don't think it's a perfect solution for this particular issue, rather (3) is better. 1. Allow importing blocks with a tolerance of `MAXIMUM_GOSSIP_CLOCK_DISPARITY`. - Easy - ~~I have implemented this, for now.~~ 1. If a block is verified for gossip propagation (i.e., signature verified) and it's within `MAXIMUM_GOSSIP_CLOCK_DISPARITY`, then queue it to be processed at the start of the appropriate slot. - More difficult - Feels like the best solution, I will try to implement this. This PR takes approach (3). ## Changes included - Implement the `block_delay_queue`, based upon a [`DelayQueue`](https://docs.rs/tokio-util/0.6.3/tokio_util/time/delay_queue/struct.DelayQueue.html) which can store blocks until it's time to import them. - Add a new `DelayedImportBlock` variant to the `beacon_processor::WorkEvent` enum to handle this new event. - In the `BeaconProcessor`, refactor a `tokio::select!` to a struct with an explicit `Stream` implementation. I experienced some issues with `tokio::select!` in the block delay queue and I also found it hard to debug. I think this explicit implementation is nicer and functionally equivalent (apart from the fact that `tokio::select!` randomly chooses futures to poll, whereas now we're deterministic). - Add a testing framework to the `beacon_processor` module that tests this new block delay logic. I also tested a handful of other operations in the beacon processor (attns, slashings, exits) since it was super easy to copy-pasta the code from the `http_api` tester. - To implement these tests I added the concept of an optional `work_journal_tx` to the `BeaconProcessor` which will spit out a log of events. I used this in the tests to ensure that things were happening as I expect. - The tests are a little racey, but it's hard to avoid that when testing timing-based code. If we see CI failures I can revise. I haven't observed any failures due to races on my machine or on CI yet. - To assist with testing I allowed for directly setting the time on the `ManualSlotClock`. - I gave the `beacon_processor::Worker` a `Toolbox` for two reasons; (a) it avoids changing tons of function sigs when you want to pass a new object to the worker and (b) it seemed cute.	2021-02-24 03:08:52 +00:00
Paul Hauner	2b2a358522	Detailed validator monitoring (#2151 ) ## Issue Addressed - Resolves #2064 ## Proposed Changes Adds a `ValidatorMonitor` struct which provides additional logging and Grafana metrics for specific validators. Use `lighthouse bn --validator-monitor` to automatically enable monitoring for any validator that hits the [subnet subscription](https://ethereum.github.io/eth2.0-APIs/#/Validator/prepareBeaconCommitteeSubnet) HTTP API endpoint. Also, use `lighthouse bn --validator-monitor-pubkeys` to supply a list of validators which will always be monitored. See the new docs included in this PR for more info. ## TODO - [x] Track validator balance, `slashed` status, etc. - [x] ~~Register slashings in current epoch, not offense epoch~~ - [ ] Publish Grafana dashboard, update TODO link in docs - [x] ~~#2130 is merged into this branch, resolve that~~	2021-01-20 19:19:38 +00:00
Paul Hauner	d9f940613f	Represent slots in secs instead of millisecs (#2163 ) ## Issue Addressed NA ## Proposed Changes Copied from #2083, changes the config milliseconds_per_slot to seconds_per_slot to avoid errors when slot duration is not a multiple of a second. To avoid deserializing old serialized data (with milliseconds instead of seconds) the Serialize and Deserialize derive got removed from the Spec struct (isn't currently used anyway). This PR replaces #2083 for the purpose of fixing a merge conflict without requiring the input of @blacktemplar. ## Additional Info NA Co-authored-by: blacktemplar <blacktemplar@a1.net>	2021-01-19 09:39:51 +00:00
Paul Hauner	cdec3cec18	Implement standard eth2.0 API (#1569 ) - Resolves #1550 - Resolves #824 - Resolves #825 - Resolves #1131 - Resolves #1411 - Resolves #1256 - Resolve #1177 - Includes the `ShufflingId` struct initially defined in #1492. That PR is now closed and the changes are included here, with significant bug fixes. - Implement the https://github.com/ethereum/eth2.0-APIs in a new `http_api` crate using `warp`. This replaces the `rest_api` crate. - Add a new `common/eth2` crate which provides a wrapper around `reqwest`, providing the HTTP client that is used by the validator client and for testing. This replaces the `common/remote_beacon_node` crate. - Create a `http_metrics` crate which is a dedicated server for Prometheus metrics (they are no longer served on the same port as the REST API). We now have flags for `--metrics`, `--metrics-address`, etc. - Allow the `subnet_id` to be an optional parameter for `VerifiedUnaggregatedAttestation::verify`. This means it does not need to be provided unnecessarily by the validator client. - Move `fn map_attestation_committee` in `mod beacon_chain::attestation_verification` to a new `fn with_committee_cache` on the `BeaconChain` so the same cache can be used for obtaining validator duties. - Add some other helpers to `BeaconChain` to assist with common API duties (e.g., `block_root_at_slot`, `head_beacon_block_root`). - Change the `NaiveAggregationPool` so it can index attestations by `hash_tree_root(attestation.data)`. This is a requirement of the API. - Add functions to `BeaconChainHarness` to allow it to create slashings and exits. - Allow for `eth1::Eth1NetworkId` to go to/from a `String`. - Add functions to the `OperationPool` to allow getting all objects in the pool. - Add function to `BeaconState` to check if a committee cache is initialized. - Fix bug where `seconds_per_eth1_block` was not transferring over from `YamlConfig` to `ChainSpec`. - Add the `deposit_contract_address` to `YamlConfig` and `ChainSpec`. We needed to be able to return it in an API response. - Change some uses of serde `serialize_with` and `deserialize_with` to a single use of `with` (code quality). - Impl `Display` and `FromStr` for several BLS fields. - Check for clock discrepancy when VC polls BN for sync state (with +/- 1 slot tolerance). This is not intended to be comprehensive, it was just easy to do. - See #1434 for a per-endpoint overview. - Seeking clarity here: https://github.com/ethereum/eth2.0-APIs/issues/75 - [x] Add docs for prom port to close #1256 - [x] Follow up on this #1177 - [x] ~~Follow up with #1424~~ Will fix in future PR. - [x] Follow up with #1411 - [x] ~~Follow up with #1260~~ Will fix in future PR. - [x] Add quotes to all integers. - [x] Remove `rest_types` - [x] Address missing beacon block error. (#1629) - [x] ~~Add tests for lighthouse/peers endpoints~~ Wontfix - [x] ~~Follow up with validator status proposal~~ Tracked in #1434 - [x] Unify graffiti structs - [x] ~~Start server when waiting for genesis?~~ Will fix in future PR. - [x] TODO in http_api tests - [x] Move lighthouse endpoints off /eth/v1 - [x] Update docs to link to standard - ~~Blocked on #1586~~ Co-authored-by: Michael Sproul <michael@sigmaprime.io>	2020-10-01 11:12:36 +10:00
blacktemplar	23a8f31f83	Fix clippy warnings (#1385 ) ## Issue Addressed NA ## Proposed Changes Fixes most clippy warnings and ignores the rest of them, see issue #1388.	2020-07-23 14:18:00 +00:00
Paul Hauner	25cd91ce26	Update deps (#1322 ) * Run cargo update * Upgrade prometheus * Update hex * Upgrade parking-lot * Upgrade num-bigint * Upgrade sha2 * Update dockerfile Rust version * Run cargo update	2020-07-06 11:55:56 +10:00
Paul Hauner	764cb2d32a	v0.12 fork choice update (#1229 ) * Incomplete scraps * Add progress on new fork choice impl * Further progress * First complete compiling version * Remove chain reference * Add new lmd_ghost crate * Start integrating into beacon chain * Update `milagro_bls` to new release (#1183) * Update milagro_bls to new release Signed-off-by: Kirk Baird <baird.k@outlook.com> * Tidy up fake cryptos Signed-off-by: Kirk Baird <baird.k@outlook.com> * move SecretHash to bls and put plaintext back Signed-off-by: Kirk Baird <baird.k@outlook.com> * Update state processing for v0.12 * Fix EF test runners for v0.12 * Fix some tests * Fix broken attestation verification test * More test fixes * Rough beacon chain impl working * Remove fork_choice_2 * Remove checkpoint manager * Half finished ssz impl * Add missed file * Add persistence * Tidy, fix some compile errors * Remove RwLock from ProtoArrayForkChoice * Fix store-based compile errors * Add comments, tidy * Move function out of ForkChoice struct * Start testing * More testing * Fix compile error * Tidy beacon_chain::fork_choice * Queue attestations from the current slot * Allow fork choice to handle prior-to-genesis start * Improve error granularity * Test attestation dequeuing * Process attestations during block * Store target root in fork choice * Move fork choice verification into new crate * Update tests * Consensus updates for v0.12 (#1228) * Update state processing for v0.12 * Fix EF test runners for v0.12 * Fix some tests * Fix broken attestation verification test * More test fixes * Fix typo found in review * Add `Block` struct to ProtoArray * Start fixing get_ancestor * Add rough progress on testing * Get fork choice tests working * Progress with testing * Fix partialeq impl * Move slot clock from fc_store * Improve testing * Add testing for best justified * Add clone back to SystemTimeSlotClock * Add balances test * Start adding balances cache again * Wire-in balances cache * Improve tests * Remove commented-out tests * Remove beacon_chain::ForkChoice * Rename crates * Update wider codebase to new fork_choice layout * Move advance_slot in test harness * Tidy ForkChoice::update_time * Fix verification tests * Fix compile error with iter::once * Fix fork choice tests * Ensure block attestations are processed * Fix failing beacon_chain tests * Add first invalid block check * Add finalized block check * Progress with testing, new store builder * Add fixes to get_ancestor * Fix old genesis justification test * Fix remaining fork choice tests * Change root iteration method * Move on_verified_block * Remove unused method * Start adding attestation verification tests * Add invalid ffg target test * Add target epoch test * Add queued attestation test * Remove old fork choice verification tests * Tidy, add test * Move fork choice lock drop * Rename BeaconForkChoiceStore * Add comments, tidy BeaconForkChoiceStore * Update metrics, rename fork_choice_store.rs * Remove genesis_block_root from ForkChoice * Tidy * Update fork_choice comments * Tidy, add comments * Tidy, simplify ForkChoice, fix compile issue * Tidy, removed dead file * Increase http request timeout * Fix failing rest_api test * Set HTTP timeout back to 5s * Apply fix to get_ancestor * Address Michael's comments * Fix typo * Revert "Fix broken attestation verification test" This reverts commit 722cdc903b12611de27916a57eeecfa3224f2279. Co-authored-by: Kirk Baird <baird.k@outlook.com> Co-authored-by: Michael Sproul <michael@sigmaprime.io>	2020-06-17 11:10:22 +10:00
Paul Hauner	4331834003	Directory Restructure (#1163 ) * Move tests -> testing * Directory restructure * Update Cargo.toml during restructure * Update Makefile during restructure * Fix arbitrary path	2020-05-18 21:24:23 +10:00

24 Commits