Commit Graph

476 Commits

Author SHA1 Message Date
Emilia Hane
ce2db355de
Fix rebase conflict 2023-02-08 11:44:32 +01:00
Michael Sproul
c2f64f8216 Switch allocator to jemalloc (#3697)
## Proposed Changes

Another `tree-states` motivated PR, this adds `jemalloc` as the default allocator, with an option to use the system allocator by compiling with `FEATURES="" make`.

- [x] Metrics
- [x] Test on Windows
- [x] Test on macOS
- [x] Test with `musl`
- [x] Metrics dashboard on `lighthouse-metrics` (https://github.com/sigp/lighthouse-metrics/pull/37)


Co-authored-by: Michael Sproul <micsproul@gmail.com>
2023-01-25 14:21:54 +01:00
Mac L
47ade13ef7 Add CLI flag to specify the format of logs written to the logfile (#3839)
## Proposed Changes

Decouple the stdout and logfile formats by adding the `--logfile-format` CLI flag.
This behaves identically to the existing `--log-format` flag, but instead will only affect the logs written to the logfile.
The `--log-format` flag will no longer have any effect on the contents of the logfile.

## Additional Info
This avoids being a breaking change by causing `logfile-format` to default to the value of `--log-format` if it is not provided. 
This means that users who were previously relying on being able to use a JSON formatted logfile will be able to continue to use `--log-format JSON`. 

Users who want to use JSON on stdout and default logs in the logfile, will need to pass the following flags: `--log-format JSON --logfile-format DEFAULT`
2023-01-25 14:21:16 +01:00
realbigsean
06f71e8cce
merge capella 2023-01-12 12:51:09 -05:00
Michael Sproul
2af8110529
Merge remote-tracking branch 'origin/unstable' into capella
Fixing the conflicts involved patching up some of the `block_hash` verification,
the rest will be done as part of https://github.com/sigp/lighthouse/issues/3870
2023-01-12 16:22:00 +11:00
ethDreamer
52c1055fdc
Remove withdrawals-processing feature (#3864)
* Use spec to Determine Supported Engine APIs

* Remove `withdrawals-processing` feature

* Fixed Tests

* Missed Some Spots

* Fixed Another Test

* Stupid Clippy
2023-01-12 15:15:08 +11:00
realbigsean
438126f19a
merge upstream, fix compile errors 2023-01-11 13:52:58 -05:00
Paul Hauner
38514c07f2 Release v3.4.0 (#3862)
## Issue Addressed

NA

## Proposed Changes

Bump versions

## Additional Info

- [x] ~~Blocked on #3728, #3801~~
- [x] ~~Blocked on #3866~~
- [x] Requires additional testing
2023-01-11 03:27:08 +00:00
Paul Hauner
830efdb5c2 Improve validator monitor experience for high validator counts (#3728)
## Issue Addressed

NA

## Proposed Changes

Myself and others (#3678) have observed  that when running with lots of validators (e.g., 1000s) the cardinality is too much for Prometheus. I've seen Prometheus instances just grind to a halt when we turn the validator monitor on for our testnet validators (we have 10,000s of Goerli validators). Additionally, the debug log volume can get very high with one log per validator, per attestation.

To address this, the `bn --validator-monitor-individual-tracking-threshold <INTEGER>` flag has been added to *disable* per-validator (i.e., non-aggregated) metrics/logging once the validator monitor exceeds the threshold of validators. The default value is `64`, which is a finger-to-the-wind value. I don't actually know the value at which Prometheus starts to become overwhelmed, but I've seen it work with ~64 validators and I've seen it *not* work with 1000s of validators. A default of `64` seems like it will result in a breaking change to users who are running millions of dollars worth of validators whilst resulting in a no-op for low-validator-count users. I'm open to changing this number, though.

Additionally, this PR starts collecting aggregated Prometheus metrics (e.g., total count of head hits across all validators), so that high-validator-count validators still have some interesting metrics. We already had logging for aggregated values, so nothing has been added there.

I've opted to make this a breaking change since it can be rather damaging to your Prometheus instance to accidentally enable the validator monitor with large numbers of validators. I've crashed a Prometheus instance myself and had a report from another user who's done the same thing.

## Additional Info

NA

## Breaking Changes Note

A new label has been added to the validator monitor Prometheus metrics: `total`. This label tracks the aggregated metrics of all validators in the validator monitor (as opposed to each validator being tracking individually using its pubkey as the label).

Additionally, a new flag has been added to the Beacon Node: `--validator-monitor-individual-tracking-threshold`. The default value is `64`, which means that when the validator monitor is tracking more than 64 validators then it will stop tracking per-validator metrics and only track the `all_validators` metric. It will also stop logging per-validator logs and only emit aggregated logs (the exception being that exit and slashing logs are always emitted).

These changes were introduced in #3728 to address issues with untenable Prometheus cardinality and log volume when using the validator monitor with high validator counts (e.g., 1000s of validators). Users with less than 65 validators will see no change in behavior (apart from the added `all_validators` metric). Users with more than 65 validators who wish to maintain the previous behavior can set something like `--validator-monitor-individual-tracking-threshold 999999`.
2023-01-09 08:18:55 +00:00
Michael Sproul
87c44697d0
Bump MSRV to 1.65 (#3860) 2023-01-09 14:40:09 +11:00
Michael Sproul
4bd2b777ec Verify execution block hashes during finalized sync (#3794)
## Issue Addressed

Recent discussions with other client devs about optimistic sync have revealed a conceptual issue with the optimisation implemented in #3738. In designing that feature I failed to consider that the execution node checks the `blockHash` of the execution payload before responding with `SYNCING`, and that omitting this check entirely results in a degradation of the full node's validation. A node omitting the `blockHash` checks could be tricked by a supermajority of validators into following an invalid chain, something which is ordinarily impossible.

## Proposed Changes

I've added verification of the `payload.block_hash` in Lighthouse. In case of failure we log a warning and fall back to verifying the payload with the execution client.

I've used our existing dependency on `ethers_core` for RLP support, and a new dependency on Parity's `triehash` crate for the Merkle patricia trie. Although the `triehash` crate is currently unmaintained it seems like our best option at the moment (it is also used by Reth, and requires vastly less boilerplate than Parity's generic `trie-root` library).

Block hash verification is pretty quick, about 500us per block on my machine (mainnet).

The optimistic finalized sync feature can be disabled using `--disable-optimistic-finalized-sync` which forces full verification with the EL.

## Additional Info

This PR also introduces a new dependency on our [`metastruct`](https://github.com/sigp/metastruct) library, which was perfectly suited to the RLP serialization method. There will likely be changes as `metastruct` grows, but I think this is a good way to start dogfooding it.

I took inspiration from some Parity and Reth code while writing this, and have preserved the relevant license headers on the files containing code that was copied and modified.
2023-01-09 03:11:59 +00:00
realbigsean
f45d117e73
merge with capella 2022-12-23 10:21:18 -05:00
Mark Mackey
b75ca74222 Removed withdrawals feature flag 2022-12-19 15:38:46 -06:00
realbigsean
d893706e0e
merge with capella 2022-12-15 09:33:18 -05:00
Pawan Dhananjay
52a06231c8
Add support for compile time FIELD_ELEMENTS_PER_BLOB 2022-12-14 21:25:47 +05:30
Michael Sproul
991e4094f8
Merge remote-tracking branch 'origin/unstable' into capella-update 2022-12-14 13:00:41 +11:00
Michael Sproul
775d222299 Enable proposer boost re-orging (#2860)
## Proposed Changes

With proposer boosting implemented (#2822) we have an opportunity to re-org out late blocks.

This PR adds three flags to the BN to control this behaviour:

* `--disable-proposer-reorgs`: turn aggressive re-orging off (it's on by default).
* `--proposer-reorg-threshold N`: attempt to orphan blocks with less than N% of the committee vote. If this parameter isn't set then N defaults to 20% when the feature is enabled.
* `--proposer-reorg-epochs-since-finalization N`: only attempt to re-org late blocks when the number of epochs since finalization is less than or equal to N. The default is 2 epochs, meaning re-orgs will only be attempted when the chain is finalizing optimally.

For safety Lighthouse will only attempt a re-org under very specific conditions:

1. The block being proposed is 1 slot after the canonical head, and the canonical head is 1 slot after its parent. i.e. at slot `n + 1` rather than building on the block from slot `n` we build on the block from slot `n - 1`.
2. The current canonical head received less than N% of the committee vote. N should be set depending on the proposer boost fraction itself, the fraction of the network that is believed to be applying it, and the size of the largest entity that could be hoarding votes.
3. The current canonical head arrived after the attestation deadline from our perspective. This condition was only added to support suppression of forkchoiceUpdated messages, but makes intuitive sense.
4. The block is being proposed in the first 2 seconds of the slot. This gives it time to propagate and receive the proposer boost.


## Additional Info

For the initial idea and background, see: https://github.com/ethereum/consensus-specs/pull/2353#issuecomment-950238004

There is also a specification for this feature here: https://github.com/ethereum/consensus-specs/pull/3034

Co-authored-by: Michael Sproul <micsproul@gmail.com>
Co-authored-by: pawan <pawandhananjay@gmail.com>
2022-12-13 09:57:26 +00:00
Mark Mackey
8a04c3428e Merged with unstable 2022-11-30 17:29:10 -06:00
Mac L
c881b80367 Add CLI flag for gui requirements (#3731)
## Issue Addressed

#3723

## Proposed Changes

Adds a new CLI flag `--gui` which enables all the various flags required for the gui to function properly.
Currently enables the `--http` and `--validator-monitor-auto` flags.
2022-11-28 00:22:53 +00:00
Mac L
969ff240cd Add CLI flag to opt in to world-readable log files (#3747)
## Issue Addressed

#3732

## Proposed Changes

Add a CLI flag to allow users to opt out of the restrictive permissions of the log files.

## Additional Info

This is not recommended for most users. The log files can contain sensitive information such as validator indices, public keys and API tokens (see #2438). However some users using a multi-user setup may find this helpful if they understand the risks involved.
2022-11-25 07:57:11 +00:00
Giulio rebuffo
d5a2de759b Added LightClientBootstrap V1 (#3711)
## Issue Addressed

Partially addresses #3651

## Proposed Changes

Adds server-side support for light_client_bootstrap_v1 topic

## Additional Info

This PR, creates each time a bootstrap without using cache, I do not know how necessary a cache is in this case as this topic is not supposed to be called frequently and IMHO we can just prevent abuse by using the limiter, but let me know what you think or if there is any caveat to this, or if it is necessary only for the sake of good practice.


Co-authored-by: Pawan Dhananjay <pawandhananjay@gmail.com>
2022-11-25 05:19:00 +00:00
Paul Hauner
bf533c8e42 v3.3.0 (#3741)
## Issue Addressed

NA

## Proposed Changes

- Bump versions
- Pin the `nethermind` version since our method of getting the latest tags on `master` is giving us an old version (`1.14.1`).
- Increase timeout for execution engine startup.

## Additional Info

- [x] ~Awaiting further testing~
2022-11-23 23:38:32 +00:00
Akihito Nakano
8a36acdb1a Super small improvement: Remove unnecessary mut (#3736)
## Issue Addressed

<!--Which issue # does this PR address?-->

Removed some unnecessary `mut`. 🙂 

<!--
## Proposed Changes

Please list or describe the changes introduced by this PR.
-->

<!--
## Additional Info

Please provide any additional information. For example, future considerations
or information useful for reviewers.
-->
2022-11-21 03:15:54 +00:00
Age Manning
230168deff Health Endpoints for UI (#3668)
This PR adds some health endpoints for the beacon node and the validator client.

Specifically it adds the endpoint:
`/lighthouse/ui/health`

These are not entirely stable yet. But provide a base for modification for our UI. 

These also may have issues with various platforms and may need modification.
2022-11-15 05:21:26 +00:00
Michael Sproul
3be41006a6 Add --light-client-server flag and state cache utils (#3714)
## Issue Addressed

Part of https://github.com/sigp/lighthouse/issues/3651.

## Proposed Changes

Add a flag for enabling the light client server, which should be checked before gossip/RPC traffic is processed (e.g. https://github.com/sigp/lighthouse/pull/3693, https://github.com/sigp/lighthouse/pull/3711). The flag is available at runtime from `beacon_chain.config.enable_light_client_server`.

Additionally, a new method `BeaconChain::with_mutable_state_for_block` is added which I envisage being used for computing light client updates. Unfortunately its performance will be quite poor on average because it will only run quickly with access to the tree hash cache. Each slot the tree hash cache is only available for a brief window of time between the head block being processed and the state advance at 9s in the slot. When the state advance happens the cache is moved and mutated to get ready for the next slot, which makes it no longer useful for merkle proofs related to the head block. Rather than spend more time trying to optimise this I think we should continue prototyping with this code, and I'll make sure `tree-states` is ready to ship before we enable the light client server in prod (cf. https://github.com/sigp/lighthouse/pull/3206).

## Additional Info

I also fixed a bug in the implementation of `BeaconState::compute_merkle_proof` whereby the tree hash cache was moved with `.take()` but never put back with `.restore()`.
2022-11-11 11:03:18 +00:00
GeemoCandama
c591fcd201 add checkpoint-sync-url-timeout flag (#3710)
## Issue Addressed
#3702 
Which issue # does this PR address?
#3702
## Proposed Changes
Added checkpoint-sync-url-timeout flag to cli. Added timeout field to ClientGenesis::CheckpointSyncUrl to utilize timeout set

## Additional Info

Please provide any additional information. For example, future considerations
or information useful for reviewers.


Co-authored-by: GeemoCandama <104614073+GeemoCandama@users.noreply.github.com>
Co-authored-by: Michael Sproul <micsproul@gmail.com>
2022-11-11 00:38:28 +00:00
realbigsean
fc0b06a039
Feature gate withdrawals (#3684)
* start feature gating

* feature gate withdrawals
2022-11-04 16:50:26 -04:00
Divma
46fbf5b98b Update discv5 (#3171)
## Issue Addressed

Updates discv5

Pending on
- [x] #3547 
- [x] Alex upgrades his deps

## Proposed Changes

updates discv5 and the enr crate. The only relevant change would be some clear indications of ipv4 usage in lighthouse

## Additional Info

Functionally, this should be equivalent to the prev version.
As draft pending a discv5 release
2022-10-28 05:40:06 +00:00
Michael Sproul
6d5a2b509f Release v3.2.1 (#3660)
## Proposed Changes

Patch release to include the performance regression fix https://github.com/sigp/lighthouse/pull/3658.

## Additional Info

~~Blocked on the merge of https://github.com/sigp/lighthouse/pull/3658.~~
2022-10-26 09:38:25 +00:00
Paul Hauner
fcfd02aeec Release v3.2.0 (#3647)
## Issue Addressed

NA

## Proposed Changes

Bump version to `v3.2.0`

## Additional Info

- ~~Blocked on #3597~~
- ~~Blocked on #3645~~
- ~~Blocked on #3653~~
- ~~Requires additional testing~~
2022-10-25 06:36:51 +00:00
pinkiebell
d0efb6b18a beacon_node: add --disable-deposit-contract-sync flag (#3597)
Overrides any previous option that enables the eth1 service.
Useful for operating a `light` beacon node.

Co-authored-by: Michael Sproul <micsproul@gmail.com>
2022-10-19 22:55:49 +00:00
GeemoCandama
c5cd0d9b3f add execution-timeout-multiplier flag to optionally increase timeouts (#3631)
## Issue Addressed
Add flag to lengthen execution layer timeouts

Which issue # does this PR address?

#3607 

## Proposed Changes

Added execution-timeout-multiplier flag and a cli test to ensure the execution layer config has the multiplier set correctly.

Please list or describe the changes introduced by this PR.
Add execution_timeout_multiplier to the execution layer config as Option<u32> and pass the u32 to HttpJsonRpc.

## Additional Info
Not certain that this is the best way to implement it so I'd appreciate any feedback.

Please provide any additional information. For example, future considerations
or information useful for reviewers.
2022-10-18 04:02:07 +00:00
mariuspod
242ae21e5d Pass EL JWT secret key via cli flag (#3568)
## Proposed Changes

In this change I've added a new beacon_node cli flag `--execution-jwt-secret-key` for passing the JWT secret directly as string.

Without this flag, it was non-trivial to pass a secrets file containing a JWT secret key without compromising its contents into some management repo or fiddling around with manual file mounts for cloud-based deployments.

When used in combination with environment variables, the secret can be injected into container-based systems like docker & friends quite easily.

It's both possible to either specify the file_path to the JWT secret or pass the JWT secret directly.

I've modified the docs and attached a test as well.

## Additional Info

The logic has been adapted a bit so that either one of `--execution-jwt` or `--execution-jwt-secret-key` must be set when specifying `--execution-endpoint` so that it's still compatible with the semantics before this change and there's at least one secret provided.
2022-10-04 12:41:03 +00:00
GeemoCandama
6a92bf70e4 CLI tests for logging flags (#3609)
## Issue Addressed
Adding CLI tests for logging flags: log-color and disable-log-timestamp
Which issue # does this PR address?
#3588 
## Proposed Changes
Add CLI tests for logging flags as described in #3588 
Please list or describe the changes introduced by this PR.
Added logger_config to client::Config as suggested. Implemented Default for LoggerConfig based on what was being done elsewhere in the repo. Created 2 tests for each flag addressed.
## Additional Info

Please provide any additional information. For example, future considerations
or information useful for reviewers.
2022-10-04 08:33:40 +00:00
Pawan Dhananjay
8728c40102 Remove fallback support from eth1 service (#3594)
## Issue Addressed

N/A

## Proposed Changes

With https://github.com/sigp/lighthouse/pull/3214 we made it such that you can either have 1 auth endpoint or multiple non auth endpoints. Now that we are post merge on all networks (testnets and mainnet), we cannot progress a chain without a dedicated auth execution layer connection so there is no point in having a non-auth eth1-endpoint for syncing deposit cache. 

This code removes all fallback related code in the eth1 service. We still keep the single non-auth endpoint since it's useful for testing.

## Additional Info

This removes all eth1 fallback related metrics that were relevant for the monitoring service, so we might need to change the api upstream.
2022-10-04 08:33:39 +00:00
Pawan Dhananjay
6779912fe4 Publish subscriptions to all beacon nodes (#3529)
## Issue Addressed

Resolves #3516 

## Proposed Changes

Adds a beacon fallback function for running a beacon node http query on all available fallbacks instead of returning on a first successful result. Uses the new `run_on_all` method for attestation and sync committee subscriptions. 

## Additional Info

Please provide any additional information. For example, future considerations
or information useful for reviewers.
2022-09-28 19:53:35 +00:00
Paul Hauner
01e84b71f5 v3.1.2 (#3603)
## Issue Addressed

NA

## Proposed Changes

Bump versions to v3.1.2

## Additional Info

- ~~Blocked on several PRs.~~
- ~~Requires further testing.~~
2022-09-26 01:17:36 +00:00
Ramana Kumar
76ba0a1aaf Add disable-log-timestamp flag (#3101) (#3586)
## Issues Addressed

Closes https://github.com/sigp/lighthouse/issues/3101

## Proposed Changes

Add global flag to suppress timestamps in the terminal logger.
2022-09-23 03:52:41 +00:00
Paul Hauner
3128b5b430 v3.1.1 (#3585)
## Issue Addressed

NA

## Proposed Changes

Bump versions

## Additional Info

- ~~Requires additional testing~~
- ~~Blocked on:~~
    - ~~#3589~~
    - ~~#3540~~
    - ~~#3587~~
2022-09-22 06:08:52 +00:00
Michael Sproul
507bb9dad4 Refined payload pruning (#3587)
## Proposed Changes

Improve the payload pruning feature in several ways:

- Payload pruning is now entirely optional. It is enabled by default but can be disabled with `--prune-payloads false`. The previous `--prune-payloads-on-startup` flag from #3565 is removed.
- Initial payload pruning on startup now runs in a background thread. This thread will always load the split state, which is a small fraction of its total work (up to ~300ms) and then backtrack from that state. This pruning process ran in 2m5s on one Prater node with good I/O and 16m on a node with slower I/O.
- To work with the optional payload pruning the database function `try_load_full_block` will now attempt to load execution payloads for finalized slots _if_ pruning is currently disabled. This gives users an opt-out for the extensive traffic between the CL and EL for reconstructing payloads.

## Additional Info

If the `prune-payloads` flag is toggled on and off then the on-startup check may not see any payloads to delete and fail to clean them up. In this case the `lighthouse db prune_payloads` command should be used to force a manual sweep of the database.
2022-09-19 07:58:49 +00:00
Michael Sproul
ca42ef2e5a Prune finalized execution payloads (#3565)
## Issue Addressed

Closes https://github.com/sigp/lighthouse/issues/3556

## Proposed Changes

Delete finalized execution payloads from the database in two places:

1. When running the finalization migration in `migrate_database`. We delete the finalized payloads between the last split point and the new updated split point. _If_ payloads are already pruned prior to this then this is sufficient to prune _all_ payloads as non-canonical payloads are already deleted by the head pruner, and all canonical payloads prior to the previous split will already have been pruned.
2. To address the fact that users will update to this code _after_ the merge on mainnet (and testnets), we need a one-off scan to delete the finalized payloads from the canonical chain. This is implemented in `try_prune_execution_payloads` which runs on startup and scans the chain back to the Bellatrix fork or the anchor slot (if checkpoint synced after Bellatrix). In the case where payloads are already pruned this check only imposes a single state load for the split state, which shouldn't be _too slow_. Even so, a flag `--prepare-payloads-on-startup=false` is provided to turn this off after it has run the first time, which provides faster start-up times.

There is also a new `lighthouse db prune_payloads` subcommand for users who prefer to run the pruning manually.

## Additional Info

The tests have been updated to not rely on finalized payloads in the database, instead using the `MockExecutionLayer` to reconstruct them. Additionally a check was added to `check_chain_dump` which asserts the non-existence or existence of payloads on disk depending on their slot.
2022-09-17 02:27:01 +00:00
realbigsean
a9f075c3c0 Remove strict fee recipient (#3552)
## Issue Addressed

Resolves: #3550

Remove the `--strict-fee-recipient` flag. It will cause missed proposals prior to the bellatrix transition.

Co-authored-by: realbigsean <sean@sigmaprime.io>
2022-09-08 23:46:02 +00:00
Alexander Cyon
419c53bf24 Add flag 'log-color' preserving color of log redirected to file. (#3538)
Add flag 'log-color' which preserves colors of log when stdout is redirected to a file.

This is my first lighthouse PR, please let me know if I'm not following contribution guidelines, I welcome meta-feeback (feedback on git commit messages, git branch naming, and the contents of the description of this PR.)

## Issue Addressed

Solves https://github.com/sigp/lighthouse/issues/3527

## Proposed Changes

Adding a flag which enables log color preserving when stdout is redirected to a file.

### Usage
Below I demonstrate current behaviour (without using the new flag) and the new behaviur (when using new flag).

In the screenshot below, I have to panes, one on top running `lighthouse` which redirects to file `~/Desktop/test.log` and one pane in the bottom which runs `tail -f ~/Desktop/test.log`.

#### Current behaviour
```sh
lighthouse --network prater vc |& tee -a ~/Desktop/test.log
```

**Result is no colors**

<img width="1624" alt="current" src="https://user-images.githubusercontent.com/864410/188258226-bfcf8271-4c9e-474c-848e-ac92a60df25c.png">


#### New behaviour
```sh
lighthouse --network prater vc --log-color |& tee -a ~/Desktop/test.log
```

**Result is colors** 🔴🟢🔵🟡

<img width="1624" alt="new" src="https://user-images.githubusercontent.com/864410/188258223-7d9ecf09-92c8-4cba-8f24-bd4d88fc0353.png">

## Additional Info

I chose American spelling of "color" instead of Brittish "colour' since that was aligned with `slog`'s API - method`force_color()`, let me know if you prefer spelling "colour" instead. I also chose to let it be an arg not taking any argument, just like `logfile-compress` flag, rather than having to write `--log-color true`.
2022-09-06 05:58:27 +00:00
Michael Sproul
9a7f7f1c1e Configurable monitoring endpoint frequency (#3530)
## Issue Addressed

Closes #3514

## Proposed Changes

- Change default monitoring endpoint frequency to 120 seconds to fit with 30k requests/month limit.
- Allow configuration of the monitoring endpoint frequency using `--monitoring-endpoint-frequency N` where `N` is a value in seconds.
2022-09-05 08:29:00 +00:00
realbigsean
177aef8f1e Builder profit threshold flag (#3534)
## Issue Addressed

Resolves https://github.com/sigp/lighthouse/issues/3517

## Proposed Changes

Adds a `--builder-profit-threshold <wei value>` flag to the BN. If an external payload's value field is less than this value, the local payload will be used. The value of the local payload will not be checked (it can't really be checked until the engine API is updated to support this).


Co-authored-by: realbigsean <sean@sigmaprime.io>
2022-09-05 04:50:49 +00:00
realbigsean
cae40731a2 Strict count unrealized (#3522)
## Issue Addressed

Add a flag that can increase count unrealized strictness, defaults to false

## Proposed Changes

Please list or describe the changes introduced by this PR.

## Additional Info

Please provide any additional information. For example, future considerations
or information useful for reviewers.


Co-authored-by: realbigsean <seananderson33@gmail.com>
Co-authored-by: sean <seananderson33@gmail.com>
2022-09-05 04:50:47 +00:00
Paul Hauner
aa022f4685 v3.1.0 (#3525)
## Issue Addressed

NA

## Proposed Changes

- Bump versions

## Additional Info

- ~~Blocked on #3508~~
- ~~Blocked on #3526~~
- ~~Requires additional testing.~~
- Expected release date is 2022-09-01
2022-08-31 22:21:55 +00:00
Paul Hauner
8609cced0e Reset payload statuses when resuming fork choice (#3498)
## Issue Addressed

NA

## Proposed Changes

This PR is motivated by a recent consensus failure in Geth where it returned `INVALID` for an `VALID` block. Without this PR, the only way to recover is by re-syncing Lighthouse. Whilst ELs "shouldn't have consensus failures", in reality it's something that we can expect from time to time due to the complex nature of Ethereum. Being able to recover easily will help the network recover and EL devs to troubleshoot.

The risk introduced with this PR is that genuinely INVALID payloads get a "second chance" at being imported. I believe the DoS risk here is negligible since LH needs to be restarted in order to re-process the payload. Furthermore, there's no reason to think that a well-performing EL will accept a truly invalid payload the second-time-around.

## Additional Info

This implementation has the following intricacies:

1. Instead of just resetting *invalid* payloads to optimistic, we'll also reset *valid* payloads. This is an artifact of our existing implementation.
1. We will only reset payload statuses when we detect an invalid payload present in `proto_array`
    - This helps save us from forgetting that all our blocks are valid in the "best case scenario" where there are no invalid blocks.
1. If we fail to revert the payload statuses we'll log a `CRIT` and just continue with a `proto_array` that *does not* have reverted payload statuses.
    - The code to revert statuses needs to deal with balances and proposer-boost, so it's a failure point. This is a defensive measure to avoid introducing new show-stopping bugs to LH.
2022-08-29 14:34:41 +00:00
Michael Sproul
66eca1a882 Refactor op pool for speed and correctness (#3312)
## Proposed Changes

This PR has two aims: to speed up attestation packing in the op pool, and to fix bugs in the verification of attester slashings, proposer slashings and voluntary exits. The changes are bundled into a single database schema upgrade (v12).

Attestation packing is sped up by removing several inefficiencies: 

- No more recalculation of `attesting_indices` during packing.
- No (unnecessary) examination of the `ParticipationFlags`: a bitfield suffices. See `RewardCache`.
- No re-checking of attestation validity during packing: the `AttestationMap` provides attestations which are "correct by construction" (I have checked this using Hydra).
- No SSZ re-serialization for the clunky `AttestationId` type (it can be removed in a future release).

So far the speed-up seems to be roughly 2-10x, from 500ms down to 50-100ms.

Verification of attester slashings, proposer slashings and voluntary exits is fixed by:

- Tracking the `ForkVersion`s that were used to verify each message inside the `SigVerifiedOp`. This allows us to quickly re-verify that they match the head state's opinion of what the `ForkVersion` should be at the epoch(s) relevant to the message.
- Storing the `SigVerifiedOp` on disk rather than the raw operation. This allows us to continue track the fork versions after a reboot.

This is mostly contained in this commit 52bb1840ae5c4356a8fc3a51e5df23ed65ed2c7f.

## Additional Info

The schema upgrade uses the justified state to re-verify attestations and compute `attesting_indices` for them. It will drop any attestations that fail to verify, by the logic that attestations are most valuable in the few slots after they're observed, and are probably stale and useless by the time a node restarts. Exits and proposer slashings and similarly re-verified to obtain `SigVerifiedOp`s.

This PR contains a runtime killswitch `--paranoid-block-proposal` which opts out of all the optimisations in favour of closely verifying every included message. Although I'm quite sure that the optimisations are correct this flag could be useful in the event of an unforeseen emergency.

Finally, you might notice that the `RewardCache` appears quite useless in its current form because it is only updated on the hot-path immediately before proposal. My hope is that in future we can shift calls to `RewardCache::update` into the background, e.g. while performing the state advance. It is also forward-looking to `tree-states` compatibility, where iterating and indexing `state.{previous,current}_epoch_participation` is expensive and needs to be minimised.
2022-08-29 09:10:26 +00:00
Michael Sproul
aab4a8d2f2 Update docs for mainnet merge release (#3494)
## Proposed Changes

Update the merge migration docs to encourage updating mainnet configs _now_!

The docs are also updated to recommend _against_ `--suggested-fee-recipient` on the beacon node (https://github.com/sigp/lighthouse/issues/3432).

Additionally the `--help` for the CLI is updated to match with a few small semantic changes:

- `--execution-jwt` is no longer allowed without `--execution-endpoint`. We've ended up without a default for `--execution-endpoint`, so I think that's fine.
- The flags related to the JWT are only allowed if `--execution-jwt` is provided.
2022-08-23 03:50:58 +00:00