Commit Graph

27 Commits

Author SHA1 Message Date
Paul Hauner
be79f74c6d
Fix IncorrectAttestationSource error in attn. simulator (#5048)
* Don't produce attestations when syncing

* Handle `IncorrectAttestationSource`
2024-01-10 10:47:00 +11:00
Michael Sproul
f1113540d8
Fix off-by-one bug in missed block detection (#5012)
* Regression test

* Fix the bug
2023-12-15 14:23:54 +11:00
Joel Rousseau
189430a45c
Add attestation simulator (#4880)
* basic scaffold

* remove unnecessary ?

* check if committee cache is init

* typed ValidatorMonitor with ethspecs + store attestations within

* nits

* process unaggregated attestation

* typo

* extract in func

* add tests

* better naming

* better naming 2

* less verbose

* use same naming as validator monitor

* use attestation_simulator

* add metrics

* remove cache

* refacto flag_indices process

* add lag

* remove copying state

* clean and lint

* extract metrics

* nits

* compare prom metrics in tests

* implement lag

* nits

* nits

* add attestation simulator service

* fmt

* return beacon_chain as arc

* nit: debug

* sed s/unaggregated/unagg.//

* fmt

* fmt

* nit: remove unused comments

* increase max unaggregated attestation hashmap to 64

* nit: sed s/clone/copied//

* improve perf: remove unecessary hashmap copy

* fix flag indices comp

* start service in client builder

* remove //

* cargo fmt

* lint

* cloned keys

* fmt

* use Slot value instead of pointer

* Update beacon_node/beacon_chain/src/attestation_simulator.rs

Co-authored-by: Paul Hauner <paul@paulhauner.com>

---------

Co-authored-by: Paul Hauner <paul@paulhauner.com>
2023-12-14 11:44:56 +11:00
Michael Sproul
d04e361129
Fix missed block logs (#4922) 2023-11-16 22:07:58 +11:00
Joel Rousseau
ac8811afac
Add missed blocks to monitored validators (#4731)
* add missed_block metric

* init missed_block in constructor

* declare beaconproposercache in ValidatorMonitor

* refacto proposer_shuffling_decision_root to use epoch instead of current.epoch

* imple new proposer_shuffling_decision_root in callers

* push missed_blocks

* prune missed_blocks

* only add to hashmap if it's a monitored validator

* remove current_epoch dup + typos

* extract in func

* add prom metrics

* checkpoint is not only epoch but slot as well

* add safeguard if we start a new chain at slot 0

* clean

* remove unnecessary negative value for a slot

* typo in comment

* remove unused current_epoch

* share beacon_proposer_cache between validator_monitor and beacon_chain

* pass Hash256::zero()

* debug objects

* fix loop: lag is at the head

* sed s/get_slot/get_epoch

* fewer calls to cache.get_epoch

* fix typos

* remove cache first call

* export TYPICAL_SLOTS_PER_EPOCH and use it in validator_monitor

* switch to gauge & loop over missed_blocks hashset

* fix subnet_service tests

* remove unused var

* clean + fix nits

* add beacon_proposer_cache + validator_monitor in builder

* fix store_tests

* fix builder tests

* add tests

* add validator monitor set of tests

* clean tests

* nits

* optimise imports

* lint

* typo

* added self.aggregatable

* duplicate proposer_shuffling_decision_root

* remove duplication in passing beacon_proposer_cache

* remove duplication in passing beacon_proposer_cache

* using indices

* fmt

* implement missed blocks total

* nits

* avoid heap allocation

* remove recursion limit

* fix lint

* Fix valdiator monitor builder pattern

Unify validator monitor config struct

* renaming metrics

* renaming metrics in validator monitor

* add log if there's a missing validator index

* consistent log

* fix loop

* better loop

* move gauge to counter

* fmt

* add error message

* lint

* fix prom metrics

* set gauge to 0 when non-finalized epochs

* better wording

* remove hash256::zero in favour of block_root

* fix gauge total label

* fix last missed block validator

* Add `MissedBlock` struct

* Fix comment

* Refactor non-finalized block loop

* Fix off-by-one

* Avoid string allocation

* Fix compile error

* Remove non-finalized blocks metric

* fix func clojure

* remove unused variable

* remove unused DEFAULT_INDIVIDUAL_TRACKING_THRESHOLD

* remove unused DEFAULT_INDIVIDUAL_TRACKING_THRESHOLD in builder

* add validator index depending on the fork name

* typos

---------

Co-authored-by: Paul Hauner <paul@paulhauner.com>
2023-11-09 15:05:14 +11:00
int88
90d562b3d4 add attestation inclusion distance in http api (#4148)
## Issue Addressed

#4097

## Proposed Changes

Add attestation inclusion distance in http api, extend `/lighthouse/ui/validator_metrics` to include it.

## Usage
```
curl -X POST "http://localhost:8001/lighthouse/ui/validator_metrics" -d '{"indices": [1]}' -H "Content-Type: application/json" | jq
```

```
{
  "data": {
    "validators": {
      "1": {
        "attestation_hits": 3,
        "attestation_misses": 1,
        "attestation_hit_percentage": 75,
        "attestation_head_hits": 3,
        "attestation_head_misses": 0,
        "attestation_head_hit_percentage": 100,
        "attestation_target_hits": 3,
        "attestation_target_misses": 0,
        "attestation_target_hit_percentage": 100,
        "attestation_inclusion_distance": 1
      }
    }
  }
}
```

## Additional Info

NA
2023-04-26 01:12:35 +00:00
Paul Hauner
c3e5053612 Reduce false positive logging for late builder blocks (#4073)
## Issue Addressed

NA

## Proposed Changes

When producing a block from a builder, there are two points where we could consider the block "broadcast":

1. When the blinded block is published to the builder.
2. When the un-blinded block is published to the P2P network (this is always *after* the previous step).

Our logging for late block broadcasts was using (2) for builder-blocks, which was creating a lot of false-positive logs. This is because the builder publishes the block on the P2P network themselves before returning it to us and we perform (2). For clarity, the logs were false-positives because we claim that the block was published late by us when it was actually published earlier by the builder.

This PR changes our logging behavior so we do our logging at (1) instead. It also updates our metrics for block broadcast to distinguish between local and builder blocks. I believe the metrics change will be natively compatible with existing Grafana dashboards.

## Additional Info

One could argue that the builder *should* return the block to us faster, however that's not the case. I think it's more important that we don't desensitize users with false-positives.
2023-03-17 00:44:03 +00:00
Mac L
3642efe76a Cache validator balances and allow them to be served over the HTTP API (#3863)
## Issue Addressed

#3804

## Proposed Changes

- Add `total_balance` to the validator monitor and adjust the number of historical epochs which are cached. 
- Allow certain values in the cache to be served out via the HTTP API without requiring a state read.

## Usage
```
curl -X POST "http://localhost:5052/lighthouse/ui/validator_info" -d '{"indices": [0]}' -H "Content-Type: application/json" | jq
```

```
{
  "data": {
    "validators": {
      "0": {
        "info": [
          {
            "epoch": 172981,
            "total_balance": 36566388519
          },
         ...
          {
            "epoch": 172990,
            "total_balance": 36566496513
          }
        ]
      },
      "1": {
        "info": [
          {
            "epoch": 172981,
            "total_balance": 36355797968
          },
          ...
          {
            "epoch": 172990,
            "total_balance": 36355905962
          }
        ]
      }
    }
  }
}
```

## Additional Info

This requires no historical states to operate which mean it will still function on the freshly checkpoint synced node, however because of this, the values will populate each epoch (up to a maximum of 10 entries).

Another benefit of this method, is that we can easily cache any other values which would normally require a state read and serve them via the same endpoint. However, we would need be cautious about not overly increasing block processing time by caching values from complex computations.

This also caches some of the validator metrics directly, rather than pulling them from the Prometheus metrics when the API is called. This means when the validator count exceeds the individual monitor threshold, the cached values will still be available.

Co-authored-by: Paul Hauner <paul@paulhauner.com>
2023-02-21 20:54:55 +00:00
Paul Hauner
830efdb5c2 Improve validator monitor experience for high validator counts (#3728)
## Issue Addressed

NA

## Proposed Changes

Myself and others (#3678) have observed  that when running with lots of validators (e.g., 1000s) the cardinality is too much for Prometheus. I've seen Prometheus instances just grind to a halt when we turn the validator monitor on for our testnet validators (we have 10,000s of Goerli validators). Additionally, the debug log volume can get very high with one log per validator, per attestation.

To address this, the `bn --validator-monitor-individual-tracking-threshold <INTEGER>` flag has been added to *disable* per-validator (i.e., non-aggregated) metrics/logging once the validator monitor exceeds the threshold of validators. The default value is `64`, which is a finger-to-the-wind value. I don't actually know the value at which Prometheus starts to become overwhelmed, but I've seen it work with ~64 validators and I've seen it *not* work with 1000s of validators. A default of `64` seems like it will result in a breaking change to users who are running millions of dollars worth of validators whilst resulting in a no-op for low-validator-count users. I'm open to changing this number, though.

Additionally, this PR starts collecting aggregated Prometheus metrics (e.g., total count of head hits across all validators), so that high-validator-count validators still have some interesting metrics. We already had logging for aggregated values, so nothing has been added there.

I've opted to make this a breaking change since it can be rather damaging to your Prometheus instance to accidentally enable the validator monitor with large numbers of validators. I've crashed a Prometheus instance myself and had a report from another user who's done the same thing.

## Additional Info

NA

## Breaking Changes Note

A new label has been added to the validator monitor Prometheus metrics: `total`. This label tracks the aggregated metrics of all validators in the validator monitor (as opposed to each validator being tracking individually using its pubkey as the label).

Additionally, a new flag has been added to the Beacon Node: `--validator-monitor-individual-tracking-threshold`. The default value is `64`, which means that when the validator monitor is tracking more than 64 validators then it will stop tracking per-validator metrics and only track the `all_validators` metric. It will also stop logging per-validator logs and only emit aggregated logs (the exception being that exit and slashing logs are always emitted).

These changes were introduced in #3728 to address issues with untenable Prometheus cardinality and log volume when using the validator monitor with high validator counts (e.g., 1000s of validators). Users with less than 65 validators will see no change in behavior (apart from the added `all_validators` metric). Users with more than 65 validators who wish to maintain the previous behavior can set something like `--validator-monitor-individual-tracking-threshold 999999`.
2023-01-09 08:18:55 +00:00
Divma
ffbf70e2d9 Clippy lints for rust 1.66 (#3810)
## Issue Addressed
Fixes the new clippy lints for rust 1.66

## Proposed Changes

Most of the changes come from:
- [unnecessary_cast](https://rust-lang.github.io/rust-clippy/master/index.html#unnecessary_cast)
- [iter_kv_map](https://rust-lang.github.io/rust-clippy/master/index.html#iter_kv_map)
- [needless_borrow](https://rust-lang.github.io/rust-clippy/master/index.html#needless_borrow)

## Additional Info

na
2022-12-16 04:04:00 +00:00
Paul Hauner
6f79263a21 Make all validator monitor logs INFO (#3727)
## Issue Addressed

NA

## Proposed Changes

This is a *potentially* contentious change, but I find it annoying that the validator monitor logs `WARN` and `ERRO` for imperfect attestations. Perfect attestation performance is unachievable (don't believe those photo-shopped beauty magazines!) since missed and poorly-packed blocks by other validators will reduce your performance.

When the validator monitor is on with 10s or more validators, I find the logs are washed out with ERROs that are not worth investigating. I suspect that users who really want to know if validators are missing attestations can do so by matching the content of the log, rather than the log level.

I'm open to feedback about this, especially from anyone who is relying on the current log levels.

## Additional Info

NA

## Breaking Changes Notes

The validator monitor will no longer emit `WARN` and `ERRO` logs for sub-optimal attestation performance. The logs will now be emitted at `INFO` level. This change was introduced to avoid cluttering the `WARN` and `ERRO` logs with alerts that are frequently triggered by the actions of other network participants (e.g., a missed block) and require no action from the user.
2022-12-13 06:24:52 +00:00
Mac L
8cb9b5e126 Expose certain validator_monitor metrics to the HTTP API (#3760)
## Issue Addressed

#3724 

## Proposed Changes

Exposes certain `validator_monitor` as an endpoint on the HTTP API. Will only return metrics for validators which are actively being monitored.

### Usage

```bash
curl -X GET "http://localhost:5052/lighthouse/ui/validator_metrics" -H "accept: application/json" | jq
```

```json
{
  "data": {
    "validators": {
      "12345": {
        "attestation_hits": 10,
        "attestation_misses": 0,
        "attestation_hit_percentage": 100,
        "attestation_head_hits": 10,
        "attestation_head_misses": 0,
        "attestation_head_hit_percentage": 100,
        "attestation_target_hits": 5,
        "attestation_target_misses": 5,
        "attestation_target_hit_percentage": 50 
      }
    }
  }
}
```

## Additional Info

Based on #3756 which should be merged first.
2022-12-09 06:39:19 +00:00
tim gretler
266d765285 Register blocks in validator monitor (#3635)
## Issue Addressed

Closes #3460

## Proposed Changes

`blocks` and `block_min_delay` are never updated in the epoch summary



Co-authored-by: Michael Sproul <micsproul@gmail.com>
2022-11-09 05:37:09 +00:00
Divma
8600645f65 Fix rust 1.65 lints (#3682)
## Issue Addressed

New lints for rust 1.65

## Proposed Changes

Notable change is the identification or parameters that are only used in recursion

## Additional Info
na
2022-11-04 07:43:43 +00:00
eklm
a9e158663b Fix validator_monitor_prev_epoch_ metrics (#2911)
## Issue Addressed

#2820

## Proposed Changes

The problem is that validator_monitor_prev_epoch metrics are updated only if there is EpochSummary present in summaries map for the previous epoch and it is not the case for the offline validator. Ensure that EpochSummary is inserted into summaries map also for the offline validators.
2022-06-20 04:06:30 +00:00
Michael Sproul
5e1f8a8480 Update to Rust 1.59 and 2021 edition (#3038)
## Proposed Changes

Lots of lint updates related to `flat_map`, `unwrap_or_else` and string patterns. I did a little more creative refactoring in the op pool, but otherwise followed Clippy's suggestions.

## Additional Info

We need this PR to unblock CI.
2022-02-25 00:10:17 +00:00
Pawan Dhananjay
34d22b5920 Reduce validator monitor logging verbosity (#2606)
## Issue Addressed

Resolves #2541

## Proposed Changes

Reduces verbosity of validator monitor per epoch logging by batching info logs for multiple validators.

Instead of a log for every validator managed by the validator monitor, we now batch logs for attestation records for previous epoch.

Before:
```log
Sep 20 06:53:08.239 INFO Previous epoch attestation success      validator: 1, epoch: 65875, matched_head: true, matched_target: true, inclusion_lag: 0 slot(s), service: val_mon
Sep 20 06:53:08.239 INFO Previous epoch attestation success      validator: 2, epoch: 65875, matched_head: true, matched_target: true, inclusion_lag: 0 slot(s), service: val_mon
Sep 20 06:53:08.239 INFO Previous epoch attestation success      validator: 3, epoch: 65875, matched_head: true, matched_target: true, inclusion_lag: 0 slot(s), service: val_mon
Sep 20 06:53:08.239 INFO Previous epoch attestation success      validator: 4, epoch: 65875, matched_head: true, matched_target: true, inclusion_lag: 0 slot(s), service: val_mon
Sep 20 06:53:08.239 INFO Previous epoch attestation success      validator: 5, epoch: 65875, matched_head: false, matched_target: true, inclusion_lag: 0 slot(s), service: val_mon
Sep 20 06:53:08.239 WARN Attestation failed to match head        validator: 5, epoch: 65875, service: val_mon
Sep 20 06:53:08.239 INFO Previous epoch attestation success      validator: 6, epoch: 65875, matched_head: false, matched_target: true, inclusion_lag: 0 slot(s), service: val_mon
Sep 20 06:53:08.239 WARN Attestation failed to match head        validator: 6, epoch: 65875, service: val_mon
Sep 20 06:53:08.239 INFO Previous epoch attestation success      validator: 7, epoch: 65875, matched_head: true, matched_target: false, inclusion_lag: 1 slot(s), service: val_mon
Sep 20 06:53:08.239 WARN Attestation failed to match target      validator: 7, epoch: 65875, service: val_mon
Sep 20 06:53:08.239 WARN Sub-optimal inclusion delay             validator: 7, epoch: 65875, optimal: 1, delay: 2, service: val_mon
Sep 20 06:53:08.239 INFO Previous epoch attestation success      validator: 8, epoch: 65875, matched_head: true, matched_target: false, inclusion_lag: 1 slot(s), service: val_mon
Sep 20 06:53:08.239 WARN Attestation failed to match target      validator: 8, epoch: 65875, service: val_mon
Sep 20 06:53:08.239 WARN Sub-optimal inclusion delay             validator: 8, epoch: 65875, optimal: 1, delay: 2, service: val_mon
Sep 20 06:53:08.239 ERRO Previous epoch attestation missing      validator: 9, epoch: 65875, service: val_mon
Sep 20 06:53:08.239 ERRO Previous epoch attestation missing      validator: 10, epoch: 65875, service: val_mon
```

after
```
Sep 20 06:53:08.239 INFO Previous epoch attestation success      validators: [1,2,3,4,5,6,7,8,9] , epoch: 65875, service: val_mon
Sep 20 06:53:08.239 WARN Previous epoch attestation failed to match head, validators: [5,6], epoch: 65875, service: val_mon
Sep 20 06:53:08.239 WARN Previous epoch attestation failed to match target, validators: [7,8], epoch: 65875, service: val_mon
Sep 20 06:53:08.239 WARN Previous epoch attestations had sub-optimal inclusion delay, validators: [7,8], epoch: 65875, service: val_mon
Sep 20 06:53:08.239 ERRO Previous epoch attestation missing      validators: [9,10], epoch: 65875, service: val_mon
```

The detailed individual logs are downgraded to debug logs.
2021-10-12 05:06:48 +00:00
Pawan Dhananjay
70441aa554 Improve valmon inclusion delay calculation (#2618)
## Issue Addressed

Resolves #2552 

## Proposed Changes

Offers some improvement in inclusion distance calculation in the validator monitor. 

When registering an attestation from a block, instead of doing `block.slot() - attesstation.data.slot()` to get the inclusion distance, we now pass the parent block slot from the beacon chain and do `parent_slot.saturating_sub(attestation.data.slot())`. This allows us to give best effort inclusion distance in scenarios where the attestation was included right after a skip slot. Note that this does not give accurate results in scenarios where the attestation was included few blocks after the skip slot.

In this case, if the attestation slot was `b1` and was included in block `b2` with a skip slot in between, we would get the inclusion delay as 0  (by ignoring the skip slot) which is the best effort inclusion delay.
```
b1 <- missed <- b2
``` 

Here, if the attestation slot was `b1` and was included in block `b3` with a skip slot and valid block `b2` in between, then we would get the inclusion delay as 2 instead of 1 (by ignoring the skip slot).
```
b1 <- missed <- b2 <- b3 
```
A solution for the scenario 2 would be to count number of slots between included slot and attestation slot ignoring the skip slots in the beacon chain and pass the value to the validator monitor. But I'm concerned that it could potentially lead to db accesses for older blocks in extreme cases.


This PR also uses the validator monitor data for logging per epoch inclusion distance. This is useful as we won't get inclusion data in post-altair summaries.


Co-authored-by: Michael Sproul <micsproul@gmail.com>
2021-09-30 01:22:43 +00:00
Pawan Dhananjay
5a3bcd2904 Validator monitor support for sync committees (#2476)
## Issue Addressed

N/A

## Proposed Changes

Add functionality in the validator monitor to provide sync committee related metrics for monitored validators.


Co-authored-by: Michael Sproul <michael@sigmaprime.io>
2021-08-31 23:31:36 +00:00
realbigsean
303deb9969 Rust 1.54.0 lints (#2483)
## Issue Addressed

N/A

## Proposed Changes

- Removing a bunch of unnecessary references
- Updated `Error::VariantError` to `Error::Variant`
- There were additional enum variant lints that I ignored, because I thought our variant names were fine
- removed `MonitoredValidator`'s `pubkey` field, because I couldn't find it used anywhere. It looks like we just use the string version of the pubkey (the `id` field) if there is no index

## Additional Info



Co-authored-by: realbigsean <seananderson33@gmail.com>
2021-07-30 01:11:47 +00:00
Paul Hauner
6e3ca48cb9 Cache participating indices for Altair epoch processing (#2416)
## Issue Addressed

NA

## Proposed Changes

This PR addresses two things:

1. Allows the `ValidatorMonitor` to work with Altair states.
1. Optimizes `altair::process_epoch` (see [code](https://github.com/paulhauner/lighthouse/blob/participation-cache/consensus/state_processing/src/per_epoch_processing/altair/participation_cache.rs) for description)

## Breaking Changes

The breaking changes in this PR revolve around one premise:

*After the Altair fork, it's not longer possible (given only a `BeaconState`) to identify if a validator had *any* attestation included during some epoch. The best we can do is see if that validator made the "timely" source/target/head flags.*

Whilst this seems annoying, it's not actually too bad. Finalization is based upon "timely target" attestations, so that's really the most important thing. Although there's *some* value in knowing if a validator had *any* attestation included, it's far more important to know about "timely target" participation, since this is what affects finality and justification.

For simplicity and consistency, I've also removed the ability to determine if *any* attestation was included from metrics and API endpoints. Now, all Altair and non-Altair states will simply report on the head/target attestations.

The following section details where we've removed fields and provides replacement values.

### Breaking Changes: Prometheus Metrics

Some participation metrics have been removed and replaced. Some were removed since they are no longer relevant to Altair (e.g., total attesting balance) and others replaced with gwei values instead of pre-computed values. This provides more flexibility at display-time (e.g., Grafana).

The following metrics were added as replacements:

- `beacon_participation_prev_epoch_head_attesting_gwei_total`
- `beacon_participation_prev_epoch_target_attesting_gwei_total`
- `beacon_participation_prev_epoch_source_attesting_gwei_total`
- `beacon_participation_prev_epoch_active_gwei_total`

The following metrics were removed:

- `beacon_participation_prev_epoch_attester`
   - instead use `beacon_participation_prev_epoch_source_attesting_gwei_total / beacon_participation_prev_epoch_active_gwei_total`.
- `beacon_participation_prev_epoch_target_attester`
   - instead use `beacon_participation_prev_epoch_target_attesting_gwei_total / beacon_participation_prev_epoch_active_gwei_total`.
- `beacon_participation_prev_epoch_head_attester`
   - instead use `beacon_participation_prev_epoch_head_attesting_gwei_total / beacon_participation_prev_epoch_active_gwei_total`.

The `beacon_participation_prev_epoch_attester` endpoint has been removed. Users should instead use the pre-existing `beacon_participation_prev_epoch_target_attester`. 

### Breaking Changes: HTTP API

The `/lighthouse/validator_inclusion/{epoch}/{validator_id}` endpoint loses the following fields:

- `current_epoch_attesting_gwei` (use `current_epoch_target_attesting_gwei` instead)
- `previous_epoch_attesting_gwei` (use `previous_epoch_target_attesting_gwei` instead)

The `/lighthouse/validator_inclusion/{epoch}/{validator_id}` endpoint lose the following fields:

- `is_current_epoch_attester` (use `is_current_epoch_target_attester` instead)
- `is_previous_epoch_attester` (use `is_previous_epoch_target_attester` instead)
- `is_active_in_current_epoch` becomes `is_active_unslashed_in_current_epoch`.
- `is_active_in_previous_epoch` becomes `is_active_unslashed_in_previous_epoch`.

## Additional Info

NA

## TODO

- [x] Deal with total balances
- [x] Update validator_inclusion API
- [ ] Ensure `beacon_participation_prev_epoch_target_attester` and `beacon_participation_prev_epoch_head_attester` work before Altair

Co-authored-by: realbigsean <seananderson33@gmail.com>
2021-07-27 07:01:01 +00:00
Michael Sproul
b4689e20c6 Altair consensus changes and refactors (#2279)
## Proposed Changes

Implement the consensus changes necessary for the upcoming Altair hard fork.

## Additional Info

This is quite a heavy refactor, with pivotal types like the `BeaconState` and `BeaconBlock` changing from structs to enums. This ripples through the whole codebase with field accesses changing to methods, e.g. `state.slot` => `state.slot()`.


Co-authored-by: realbigsean <seananderson33@gmail.com>
2021-07-09 06:15:32 +00:00
Michael Sproul
3dc1eb5eb6 Ignore inactive validators in validator monitor (#2396)
## Proposed Changes

A user on Discord (`@ChewsMacRibs`) reported that the validator monitor was logging `WARN Attested to an incorrect head` for their validator while it was awaiting activation.

This PR modifies the monitor so that it ignores inactive validators, by the logic that they are either awaiting activation, or have already exited. Either way, there's no way for an inactive validator to have their attestations included on chain, so no need for the monitor to report on them.

## Additional Info

To reproduce the bug requires registering validator keys manually with `--validator-monitor-pubkeys`. I don't think the bug will present itself with `--validator-monitor-auto`.
2021-06-17 02:10:48 +00:00
Paul Hauner
c1203f5e52 Add specific log and metric for delayed blocks (#2308)
## Issue Addressed

NA

## Proposed Changes

- Adds a specific log and metric for when a block is enshrined as head with a delay that will caused bad attestations
    - We *technically* already expose this information, but it's a little tricky to determine during debugging. This makes it nice and explicit.
- Fixes a minor reporting bug with the validator monitor where it was expecting agg. attestations too early (at half-slot rather than two-thirds-slot).

## Additional Info

NA
2021-04-13 02:16:59 +00:00
Paul Hauner
88cc222204 Advance state to next slot after importing block (#2174)
## Issue Addressed

NA

## Proposed Changes

Add an optimization to perform `per_slot_processing` from the *leading-edge* of block processing to the *trailing-edge*. Ultimately, this allows us to import the block at slot `n` faster because we used the tail-end of slot `n - 1` to perform `per_slot_processing`.

Additionally, add a "block proposer cache" which allows us to cache the block proposer for some epoch. Since we're now doing trailing-edge `per_slot_processing`, we can prime this cache with the values for the next epoch before those blocks arrive (assuming those blocks don't have some weird forking).

There were several ancillary changes required to achieve this: 

- Remove the `state_root` field  of `BeaconSnapshot`, since there's no need to know it on a `pre_state` and in all other cases we can just read it from `block.state_root()`.
    - This caused some "dust" changes of `snapshot.beacon_state_root` to `snapshot.beacon_state_root()`, where the `BeaconSnapshot::beacon_state_root()` func just reads the state root from the block.
- Rename `types::ShuffingId` to `AttestationShufflingId`. I originally did this because I added a `ProposerShufflingId` struct which turned out to be not so useful. I thought this new name was more descriptive so I kept it.
- Address https://github.com/ethereum/eth2.0-specs/pull/2196
- Add a debug log when we get a block with an unknown parent. There was previously no logging around this case.
- Add a function to `BeaconState` to compute all proposers for an epoch without re-computing the active indices for each slot.

## Additional Info

- ~~Blocked on #2173~~
- ~~Blocked on #2179~~ That PR was wrapped into this PR.
- There's potentially some places where we could avoid computing the proposer indices in `per_block_processing` but I haven't done this here. These would be an optimization beyond the issue at hand (improving block propagation times) and I think this PR is already doing enough. We can come back for that later.

## TODO

- [x] Tidy, improve comments.
- [x] ~~Try avoid computing proposer index in `per_block_processing`?~~
2021-02-15 07:17:52 +00:00
Paul Hauner
ff35fbb121 Add metrics for beacon block propagation (#2173)
## Issue Addressed

NA

## Proposed Changes

Adds some metrics to track delays regarding:

- LH processing of blocks
- delays receiving blocks from other nodes.

## Additional Info

NA
2021-02-04 05:33:56 +00:00
Paul Hauner
2b2a358522 Detailed validator monitoring (#2151)
## Issue Addressed

- Resolves #2064

## Proposed Changes

Adds a `ValidatorMonitor` struct which provides additional logging and Grafana metrics for specific validators.

Use `lighthouse bn --validator-monitor` to automatically enable monitoring for any validator that hits the [subnet subscription](https://ethereum.github.io/eth2.0-APIs/#/Validator/prepareBeaconCommitteeSubnet) HTTP API endpoint.

Also, use `lighthouse bn --validator-monitor-pubkeys` to supply a list of validators which will always be monitored.

See the new docs included in this PR for more info.

## TODO

- [x] Track validator balance, `slashed` status, etc.
- [x] ~~Register slashings in current epoch, not offense epoch~~
- [ ] Publish Grafana dashboard, update TODO link in docs
- [x] ~~#2130 is merged into this branch, resolve that~~
2021-01-20 19:19:38 +00:00