Commit Graph

11 Commits

Author SHA1 Message Date
Paul Hauner
61f60bdf03 Avoid penalizing peers for delays during processing (#2894)
## Issue Addressed

NA

## Proposed Changes

We have observed occasions were under-resourced nodes will receive messages that were valid *at the time*, but later become invalidated due to long waits for a `BeaconProcessor` worker.

In this PR, we will check to see if the message was valid *at the time of receipt*. If it was initially valid but invalid now, we just ignore the message without penalizing the peer.

## Additional Info

NA
2022-01-12 02:36:24 +00:00
realbigsean
b22ac95d7f v1.1.6 Fork Choice changes (#2822)
## Issue Addressed

Resolves: https://github.com/sigp/lighthouse/issues/2741
Includes: https://github.com/sigp/lighthouse/pull/2853 so that we can get ssz static tests passing here on v1.1.6. If we want to merge that first, we can make this diff slightly smaller

## Proposed Changes

- Changes the `justified_epoch` and `finalized_epoch` in the `ProtoArrayNode` each to an `Option<Checkpoint>`. The `Option` is necessary only for the migration, so not ideal. But does allow us to add a default logic to `None` on these fields during the database migration.
- Adds a database migration from a legacy fork choice struct to the new one, search for all necessary block roots in fork choice by iterating through blocks in the db.
- updates related to https://github.com/ethereum/consensus-specs/pull/2727
  -  We will have to update the persisted forkchoice to make sure the justified checkpoint stored is correct according to the updated fork choice logic. This boils down to setting the forkchoice store's justified checkpoint to the justified checkpoint of the block that advanced the finalized checkpoint to the current one. 
  - AFAICT there's no migration steps necessary for the update to allow applying attestations from prior blocks, but would appreciate confirmation on that
- I updated the consensus spec tests to v1.1.6 here, but they will fail until we also implement the proposer score boost updates. I confirmed that the previously failing scenario `new_finalized_slot_is_justified_checkpoint_ancestor` will now pass after the boost updates, but haven't confirmed _all_ tests will pass because I just quickly stubbed out the proposer boost test scenario formatting.
- This PR now also includes proposer boosting https://github.com/ethereum/consensus-specs/pull/2730

## Additional Info
I realized checking justified and finalized roots in fork choice makes it more likely that we trigger this bug: https://github.com/ethereum/consensus-specs/pull/2727

It's possible the combination of justified checkpoint and finalized checkpoint in the forkchoice store is different from in any block in fork choice. So when trying to startup our store's justified checkpoint seems invalid to the rest of fork choice (but it should be valid). When this happens we get an `InvalidBestNode` error and fail to start up. So I'm including that bugfix in this branch.

Todo:

- [x] Fix fork choice tests
- [x] Self review
- [x] Add fix for https://github.com/ethereum/consensus-specs/pull/2727
- [x] Rebase onto Kintusgi 
- [x] Fix `num_active_validators` calculation as @michaelsproul pointed out
- [x] Clean up db migrations

Co-authored-by: realbigsean <seananderson33@gmail.com>
2021-12-13 20:43:22 +00:00
Paul Hauner
144978f8f8
Remove duplicate slot_clock method (#2842) 2021-12-02 14:29:59 +11:00
ethDreamer
1563bce905
Finished Gossip Block Validation Conditions (#2640)
* Gossip Block Validation is Much More Efficient

Co-authored-by: realbigsean <seananderson33@gmail.com>
2021-12-02 14:26:51 +11:00
Pawan Dhananjay
5a3bcd2904 Validator monitor support for sync committees (#2476)
## Issue Addressed

N/A

## Proposed Changes

Add functionality in the validator monitor to provide sync committee related metrics for monitored validators.


Co-authored-by: Michael Sproul <michael@sigmaprime.io>
2021-08-31 23:31:36 +00:00
Paul Hauner
c1203f5e52 Add specific log and metric for delayed blocks (#2308)
## Issue Addressed

NA

## Proposed Changes

- Adds a specific log and metric for when a block is enshrined as head with a delay that will caused bad attestations
    - We *technically* already expose this information, but it's a little tricky to determine during debugging. This makes it nice and explicit.
- Fixes a minor reporting bug with the validator monitor where it was expecting agg. attestations too early (at half-slot rather than two-thirds-slot).

## Additional Info

NA
2021-04-13 02:16:59 +00:00
Paul Hauner
a764c3b247 Handle early blocks (#2155)
## Issue Addressed

NA

## Problem this PR addresses

There's an issue where Lighthouse is banning a lot of peers due to the following sequence of events:

1. Gossip block 0xabc arrives ~200ms early
    - It is propagated across the network, with respect to [`MAXIMUM_GOSSIP_CLOCK_DISPARITY`](https://github.com/ethereum/eth2.0-specs/blob/v1.0.0/specs/phase0/p2p-interface.md#why-is-there-maximum_gossip_clock_disparity-when-validating-slot-ranges-of-messages-in-gossip-subnets).
    - However, it is not imported to our database since the block is early.
2. Attestations for 0xabc arrive, but the block was not imported.
    - The peer that sent the attestation is down-voted.
        - Each unknown-block attestation causes a score loss of 1, the peer is banned at -100.
        - When the peer is on an attestation subnet there can be hundreds of attestations, so the peer is banned quickly (before the missed block can be obtained via rpc).

## Potential solutions

I can think of three solutions to this:

1. Wait for attestation-queuing (#635) to arrive and solve this.
    - Easy
    - Not immediate fix.
    - Whilst this would work, I don't think it's a perfect solution for this particular issue, rather (3) is better.
1. Allow importing blocks with a tolerance of `MAXIMUM_GOSSIP_CLOCK_DISPARITY`.
    - Easy
    - ~~I have implemented this, for now.~~
1. If a block is verified for gossip propagation (i.e., signature verified) and it's within `MAXIMUM_GOSSIP_CLOCK_DISPARITY`, then queue it to be processed at the start of the appropriate slot.
    - More difficult
    - Feels like the best solution, I will try to implement this.
    
    
**This PR takes approach (3).**

## Changes included

- Implement the `block_delay_queue`, based upon a [`DelayQueue`](https://docs.rs/tokio-util/0.6.3/tokio_util/time/delay_queue/struct.DelayQueue.html) which can store blocks until it's time to import them.
- Add a new `DelayedImportBlock` variant to the `beacon_processor::WorkEvent` enum to handle this new event.
- In the `BeaconProcessor`, refactor a `tokio::select!` to a struct with an explicit `Stream` implementation. I experienced some issues with `tokio::select!` in the block delay queue and I also found it hard to debug. I think this explicit implementation is nicer and functionally equivalent (apart from the fact that `tokio::select!` randomly chooses futures to poll, whereas now we're deterministic).
- Add a testing framework to the `beacon_processor` module that tests this new block delay logic. I also tested a handful of other operations in the beacon processor (attns, slashings, exits) since it was super easy to copy-pasta the code from the `http_api` tester.
    - To implement these tests I added the concept of an optional `work_journal_tx` to the `BeaconProcessor` which will spit out a log of events. I used this in the tests to ensure that things were happening as I expect.
    - The tests are a little racey, but it's hard to avoid that when testing timing-based code. If we see CI failures I can revise. I haven't observed *any* failures due to races on my machine or on CI yet.
    - To assist with testing I allowed for directly setting the time on the `ManualSlotClock`.
- I gave the `beacon_processor::Worker` a `Toolbox` for two reasons; (a) it avoids changing tons of function sigs when you want to pass a new object to the worker and (b) it seemed cute.
2021-02-24 03:08:52 +00:00
Paul Hauner
2b2a358522 Detailed validator monitoring (#2151)
## Issue Addressed

- Resolves #2064

## Proposed Changes

Adds a `ValidatorMonitor` struct which provides additional logging and Grafana metrics for specific validators.

Use `lighthouse bn --validator-monitor` to automatically enable monitoring for any validator that hits the [subnet subscription](https://ethereum.github.io/eth2.0-APIs/#/Validator/prepareBeaconCommitteeSubnet) HTTP API endpoint.

Also, use `lighthouse bn --validator-monitor-pubkeys` to supply a list of validators which will always be monitored.

See the new docs included in this PR for more info.

## TODO

- [x] Track validator balance, `slashed` status, etc.
- [x] ~~Register slashings in current epoch, not offense epoch~~
- [ ] Publish Grafana dashboard, update TODO link in docs
- [x] ~~#2130 is merged into this branch, resolve that~~
2021-01-20 19:19:38 +00:00
Paul Hauner
cdec3cec18
Implement standard eth2.0 API (#1569)
- Resolves #1550
- Resolves #824
- Resolves #825
- Resolves #1131
- Resolves #1411
- Resolves #1256
- Resolve #1177

- Includes the `ShufflingId` struct initially defined in #1492. That PR is now closed and the changes are included here, with significant bug fixes.
- Implement the https://github.com/ethereum/eth2.0-APIs in a new `http_api` crate using `warp`. This replaces the `rest_api` crate.
- Add a new `common/eth2` crate which provides a wrapper around `reqwest`, providing the HTTP client that is used by the validator client and for testing. This replaces the `common/remote_beacon_node` crate.
- Create a `http_metrics` crate which is a dedicated server for Prometheus metrics (they are no longer served on the same port as the REST API). We now have flags for `--metrics`, `--metrics-address`, etc.
- Allow the `subnet_id` to be an optional parameter for `VerifiedUnaggregatedAttestation::verify`. This means it does not need to be provided unnecessarily by the validator client.
- Move `fn map_attestation_committee` in `mod beacon_chain::attestation_verification` to a new `fn with_committee_cache` on the `BeaconChain` so the same cache can be used for obtaining validator duties.
- Add some other helpers to `BeaconChain` to assist with common API duties (e.g., `block_root_at_slot`, `head_beacon_block_root`).
- Change the `NaiveAggregationPool` so it can index attestations by `hash_tree_root(attestation.data)`. This is a requirement of the API.
- Add functions to `BeaconChainHarness` to allow it to create slashings and exits.
- Allow for `eth1::Eth1NetworkId` to go to/from a `String`.
- Add functions to the `OperationPool` to allow getting all objects in the pool.
- Add function to `BeaconState` to check if a committee cache is initialized.
- Fix bug where `seconds_per_eth1_block` was not transferring over from `YamlConfig` to `ChainSpec`.
- Add the `deposit_contract_address` to `YamlConfig` and `ChainSpec`. We needed to be able to return it in an API response.
- Change some uses of serde `serialize_with` and `deserialize_with` to a single use of `with` (code quality).
- Impl `Display` and `FromStr` for several BLS fields.
- Check for clock discrepancy when VC polls BN for sync state (with +/- 1 slot tolerance). This is not intended to be comprehensive, it was just easy to do.

- See #1434 for a per-endpoint overview.
- Seeking clarity here: https://github.com/ethereum/eth2.0-APIs/issues/75

- [x] Add docs for prom port to close #1256
- [x] Follow up on this #1177
- [x] ~~Follow up with #1424~~ Will fix in future PR.
- [x] Follow up with #1411
- [x] ~~Follow up with  #1260~~ Will fix in future PR.
- [x] Add quotes to all integers.
- [x] Remove `rest_types`
- [x] Address missing beacon block error. (#1629)
- [x] ~~Add tests for lighthouse/peers endpoints~~ Wontfix
- [x] ~~Follow up with validator status proposal~~ Tracked in #1434
- [x] Unify graffiti structs
- [x] ~~Start server when waiting for genesis?~~ Will fix in future PR.
- [x] TODO in http_api tests
- [x] Move lighthouse endpoints off /eth/v1
- [x] Update docs to link to standard

- ~~Blocked on #1586~~

Co-authored-by: Michael Sproul <michael@sigmaprime.io>
2020-10-01 11:12:36 +10:00
Paul Hauner
764cb2d32a
v0.12 fork choice update (#1229)
* Incomplete scraps

* Add progress on new fork choice impl

* Further progress

* First complete compiling version

* Remove chain reference

* Add new lmd_ghost crate

* Start integrating into beacon chain

* Update `milagro_bls` to new release (#1183)

* Update milagro_bls to new release

Signed-off-by: Kirk Baird <baird.k@outlook.com>

* Tidy up fake cryptos

Signed-off-by: Kirk Baird <baird.k@outlook.com>

* move SecretHash to bls and put plaintext back

Signed-off-by: Kirk Baird <baird.k@outlook.com>

* Update state processing for v0.12

* Fix EF test runners for v0.12

* Fix some tests

* Fix broken attestation verification test

* More test fixes

* Rough beacon chain impl working

* Remove fork_choice_2

* Remove checkpoint manager

* Half finished ssz impl

* Add missed file

* Add persistence

* Tidy, fix some compile errors

* Remove RwLock from ProtoArrayForkChoice

* Fix store-based compile errors

* Add comments, tidy

* Move function out of ForkChoice struct

* Start testing

* More testing

* Fix compile error

* Tidy beacon_chain::fork_choice

* Queue attestations from the current slot

* Allow fork choice to handle prior-to-genesis start

* Improve error granularity

* Test attestation dequeuing

* Process attestations during block

* Store target root in fork choice

* Move fork choice verification into new crate

* Update tests

* Consensus updates for v0.12 (#1228)

* Update state processing for v0.12

* Fix EF test runners for v0.12

* Fix some tests

* Fix broken attestation verification test

* More test fixes

* Fix typo found in review

* Add `Block` struct to ProtoArray

* Start fixing get_ancestor

* Add rough progress on testing

* Get fork choice tests working

* Progress with testing

* Fix partialeq impl

* Move slot clock from fc_store

* Improve testing

* Add testing for best justified

* Add clone back to SystemTimeSlotClock

* Add balances test

* Start adding balances cache again

* Wire-in balances cache

* Improve tests

* Remove commented-out tests

* Remove beacon_chain::ForkChoice

* Rename crates

* Update wider codebase to new fork_choice layout

* Move advance_slot in test harness

* Tidy ForkChoice::update_time

* Fix verification tests

* Fix compile error with iter::once

* Fix fork choice tests

* Ensure block attestations are processed

* Fix failing beacon_chain tests

* Add first invalid block check

* Add finalized block check

* Progress with testing, new store builder

* Add fixes to get_ancestor

* Fix old genesis justification test

* Fix remaining fork choice tests

* Change root iteration method

* Move on_verified_block

* Remove unused method

* Start adding attestation verification tests

* Add invalid ffg target test

* Add target epoch test

* Add queued attestation test

* Remove old fork choice verification tests

* Tidy, add test

* Move fork choice lock drop

* Rename BeaconForkChoiceStore

* Add comments, tidy BeaconForkChoiceStore

* Update metrics, rename fork_choice_store.rs

* Remove genesis_block_root from ForkChoice

* Tidy

* Update fork_choice comments

* Tidy, add comments

* Tidy, simplify ForkChoice, fix compile issue

* Tidy, removed dead file

* Increase http request timeout

* Fix failing rest_api test

* Set HTTP timeout back to 5s

* Apply fix to get_ancestor

* Address Michael's comments

* Fix typo

* Revert "Fix broken attestation verification test"

This reverts commit 722cdc903b12611de27916a57eeecfa3224f2279.

Co-authored-by: Kirk Baird <baird.k@outlook.com>
Co-authored-by: Michael Sproul <michael@sigmaprime.io>
2020-06-17 11:10:22 +10:00
Paul Hauner
4331834003
Directory Restructure (#1163)
* Move tests -> testing

* Directory restructure

* Update Cargo.toml during restructure

* Update Makefile during restructure

* Fix arbitrary path
2020-05-18 21:24:23 +10:00