## Issue Addressed
#3154
## Proposed Changes
Add three new metrics for the VC:
1. `vc_beacon_nodes_synced_count`
2. `vc_beacon_nodes_available_count`
3. `vc_beacon_nodes_total_count`
Their values mirror the values present in the following log line:
```
Apr 08 17:25:17.000 INFO Connected to beacon node(s) synced: 4, available: 4, total: 4, service: notifier
```
## Issue Addressed
This fixes the low-hanging Clippy lints introduced in Rust 1.61 (due any hour now). It _ignores_ one lint, because fixing it requires a structural refactor of the validator client that needs to be done delicately. I've started on that refactor and will create another PR that can be reviewed in more depth in the coming days. I think we should merge this PR in the meantime to unblock CI.
## Issue Addressed
Web3Signer validators do not support client authentication. This means the `--tls-known-clients-file` option on Web3Signer can't be used with Lighthouse.
## Proposed Changes
Add two new fields to Web3Signer validators, `client_identity_path` and `client_identity_password`, which specify the path and password for a PKCS12 file containing a certificate and private key. If `client_identity_path` is present, use the certificate for SSL client authentication.
## Additional Info
I am successfully validating on Prater using client authentication with Web3Signer and client authentication.
## Issue Addressed
#3141
## Proposed Changes
Changes the algorithm for proposing blocks from
```
For each BN (first success):
- Produce a block
- Sign the block and store its root in the slashing protection DB
- Publish the block
```
to
```
For each BN (first success):
- Produce a block
Sign the block and store its root in the slashing protection DB
For each BN (first success):
- Publish the block
```
Separating the producing from the publishing makes sure that we only add a signed block once to the slashing DB.
# Description
Since the `TaskExecutor` currently requires a `Weak<Runtime>`, it's impossible to use it in an async test where the `Runtime` is created outside our scope. Whilst we *could* create a new `Runtime` instance inside the async test, dropping that `Runtime` would cause a panic (you can't drop a `Runtime` in an async context).
To address this issue, this PR creates the `enum Handle`, which supports either:
- A `Weak<Runtime>` (for use in our production code)
- A `Handle` to a runtime (for use in testing)
In theory, there should be no change to the behaviour of our production code (beyond some slightly different descriptions in HTTP 500 errors), or even our tests. If there is no change, you might ask *"why bother?"*. There are two PRs (#3070 and #3175) that are waiting on these fixes to introduce some new tests. Since we've added the EL to the `BeaconChain` (for the merge), we are now doing more async stuff in tests.
I've also added a `RuntimeExecutor` to the `BeaconChainTestHarness`. Whilst that's not immediately useful, it will become useful in the near future with all the new async testing.
Code simplifications using `Option`/`Result` combinators to make pattern-matches a tad simpler.
Opinions on these loosely held, happy to adjust in review.
Tool-aided by [comby-rust](https://github.com/huitseeker/comby-rust).
## Issue Addressed
#3068
## Proposed Changes
Adds support for remote key API.
## Additional Info
Needed to add `is_local_keystore` argument to `delete_definition_and_keystore` to know if we want to delete local or remote key. Previously this wasn't necessary because remotekeys(web3signers) could be deleted.
## Issue Addressed
MEV boost compatibility
## Proposed Changes
See #2987
## Additional Info
This is blocked on the stabilization of a couple specs, [here](https://github.com/ethereum/beacon-APIs/pull/194) and [here](https://github.com/flashbots/mev-boost/pull/20).
Additional TODO's and outstanding questions
- [ ] MEV boost JWT Auth
- [ ] Will `builder_proposeBlindedBlock` return the revealed payload for the BN to propogate
- [ ] Should we remove `private-tx-proposals` flag and communicate BN <> VC with blinded blocks by default once these endpoints enter the beacon-API's repo? This simplifies merge transition logic.
Co-authored-by: realbigsean <seananderson33@gmail.com>
Co-authored-by: realbigsean <sean@sigmaprime.io>
## Issue Addressed
#3103
## Proposed Changes
Parse `http-address` and `metrics-address` as `IpAddr` for both the beacon node and validator client to support IPv6 addresses.
Also adjusts parsing of CORS origins to allow for IPv6 addresses.
## Usage
You can now set `http-address` and/or `metrics-address` flags to IPv6 addresses.
For example, the following:
`lighthouse bn --http --http-address :: --metrics --metrics-address ::1`
will expose the beacon node HTTP server on `[::]` (equivalent of `0.0.0.0` in IPv4) and the metrics HTTP server on `localhost` (the equivalent of `127.0.0.1` in IPv4)
The beacon node API can then be accessed by:
`curl "http://[server-ipv6-address]:5052/eth/v1/some_endpoint"`
And the metrics server api can be accessed by:
`curl "http://localhost:5054/metrics"` or by `curl "http://[::1]:5054/metrics"`
## Additional Info
On most Linux distributions the `v6only` flag is set to `false` by default (see the section for the `IPV6_V6ONLY` flag in https://www.man7.org/linux/man-pages/man7/ipv6.7.html) which means IPv4 connections will continue to function on a IPv6 address (providing it is appropriately mapped). This means that even if the Lighthouse API is running on `::` it is also possible to accept IPv4 connections.
However on Windows, this is not the case. The `v6only` flag is set to `true` so binding to `::` will only allow IPv6 connections.
## Issue Addressed
Presently if the VC is configured with a fee recipient it will error out when sending fee-recipient preparations to a beacon node that doesn't yet support the API:
```
Mar 08 22:23:36.236 ERRO Unable to publish proposer preparation error: All endpoints failed https://eth2-beacon-prater.infura.io/ => RequestFailed(StatusCode(404)), service: preparation
```
This doesn't affect other VC duties, but could be a source of anxiety for users trying to do the right thing and configure their fee recipients in advance.
## Proposed Changes
Change the preparation service to only send preparations if the current slot is later than 2 epochs before the Bellatrix hard fork epoch.
## Additional Info
I've tagged this v2.1.4 as I think it's a small change that's worth having for the next release
## Proposed Changes
Lots of lint updates related to `flat_map`, `unwrap_or_else` and string patterns. I did a little more creative refactoring in the op pool, but otherwise followed Clippy's suggestions.
## Additional Info
We need this PR to unblock CI.
## Issue Addressed
#3020
## Proposed Changes
- Alias the `validators-dir` arg to `validator-dir` in the `validator_client` subcommand.
- Alias the `validator-dir` arg to `validators-dir` in the `account_manager validator` subcommand.
- Add test for the validator_client alias.
## Issue Addressed
Addresses https://github.com/sigp/lighthouse/issues/2926
## Proposed Changes
Appropriated from https://github.com/sigp/lighthouse/issues/2926#issuecomment-1039676768:
When a node returns *any* error we call [`CandidateBeaconNode::set_offline`](c3a793fd73/validator_client/src/beacon_node_fallback.rs (L424)) which sets it's `status` to `CandidateError::Offline`. That node will then be ignored until the routine [`fallback_updater_service`](c3a793fd73/validator_client/src/beacon_node_fallback.rs (L44)) manages to reconnect to it.
However, I believe there was an issue in the [`CanidateBeaconNode::refesh_status`](c3a793fd73/validator_client/src/beacon_node_fallback.rs (L157-L178)) method, which is used by the updater service to see if the node has come good again. It was holding a [write lock on the `status` field](c3a793fd73/validator_client/src/beacon_node_fallback.rs (L165)) whilst it polled the node status. This means a long timeout would hog the write lock and starve other processes.
When a VC is trying to access a beacon node for whatever purpose (getting duties, posting blocks, etc), it performs [three passes](c3a793fd73/validator_client/src/beacon_node_fallback.rs (L432-L482)) through the lists of nodes, trying to run some generic `function` (closure, lambda, etc) on each node:
- 1st pass: only try running `function` on all nodes which are both synced and online.
- 2nd pass: try running `function` on all nodes that are online, but not necessarily synced.
- 3rd pass: for each offline node, try refreshing its status and then running `function` on it.
So, it turns out that if the `CanidateBeaconNode::refesh_status` function from the routine update service is hogging the write-lock, the 1st pass gets blocked whilst trying to read the status of the first node. So, nodes that should be left until the 3rd pass are blocking the process of the 1st and 2nd passes, hence the behaviour described in #2926.
## Additional Info
NA
## Issue Addressed
#2953
## Proposed Changes
Adds empty local validator check.
## Additional Info
Two other options:
- add check inside `local_index` collection. Instead of after collection.
- Move `local_index` collection to the beginning of the `poll_sync_committee_duties` function and combine sync committee with altair fork check.
## Issue Addressed
#2883
## Proposed Changes
* Added `suggested-fee-recipient` & `suggested-fee-recipient-file` flags to validator client (similar to graffiti / graffiti-file implementation).
* Added proposer preparation service to VC, which sends the fee-recipient of all known validators to the BN via [/eth/v1/validator/prepare_beacon_proposer](https://github.com/ethereum/beacon-APIs/pull/178) api once per slot
* Added [/eth/v1/validator/prepare_beacon_proposer](https://github.com/ethereum/beacon-APIs/pull/178) api endpoint and preparation data caching
* Added cleanup routine to remove cached proposer preparations when not updated for 2 epochs
## Additional Info
Changed the Implementation following the discussion in #2883.
Co-authored-by: pk910 <philipp@pk910.de>
Co-authored-by: Paul Hauner <paul@paulhauner.com>
Co-authored-by: Philipp K <philipp@pk910.de>
## Issue Addressed
Implements the standard key manager API from https://ethereum.github.io/keymanager-APIs/, formerly https://github.com/ethereum/beacon-APIs/pull/151
Related to https://github.com/sigp/lighthouse/issues/2557
## Proposed Changes
- [x] Add all of the new endpoints from the standard API: GET, POST and DELETE.
- [x] Add a `validators.enabled` column to the slashing protection database to support atomic disable + export.
- [x] Add tests for all the common sequential accesses of the API
- [x] Add tests for interactions with remote signer validators
- [x] Add end-to-end tests for migration of validators from one VC to another
- [x] Implement the authentication scheme from the standard (token bearer auth)
## Additional Info
The `enabled` column in the validators SQL database is necessary to prevent a race condition when exporting slashing protection data. Without the slashing protection database having a way of knowing that a key has been disabled, a concurrent request to sign a message could insert a new record into the database. The `delete_concurrent_with_signing` test exercises this code path, and was indeed failing before the `enabled` column was added.
The validator client authentication has been modified from basic auth to bearer auth, with basic auth preserved for backwards compatibility.
## Proposed Changes
Remove the check for exact equality on the beacon node spec when polling `/config/spec` from the VC. This check was always overzealous, and mostly served to check that the BN was configured for upcoming forks. I've replaced it by explicit checks of the `altair_fork_epoch` and `bellatrix_fork_epoch` instead.
## Additional Info
We should come back to this and clean it up so that we can retain compatibility while removing the field `default`s we installed.
## Issue Addressed
New rust lints
## Proposed Changes
- Boxing some enum variants
- removing some unused fields (is the validator lockfile unused? seemed so to me)
## Additional Info
- some error fields were marked as dead code but are logged out in areas
- left some dead fields in our ef test code because I assume they are useful for debugging?
Co-authored-by: realbigsean <seananderson33@gmail.com>
## Issue Addressed
Resolves#2612
## Proposed Changes
Implements both the checks mentioned in the original issue.
1. Verifies the `randao_reveal` in the beacon node
2. Cross checks the proposer index after getting back the block from the beacon node.
## Additional info
The block production time increases by ~10x because of the signature verification on the beacon node (based on the `beacon_block_production_process_seconds` metric) when running on a local testnet.
## Issue Addressed
Resolves#2262
## Proposed Changes
Add a new CLI flag `--beacon-nodes-tls-certs` which allows the user to specify a path to a certificate file (or a list of files, separated by commas). The VC will then use these certificates (in addition to the existing certificates in the OS trust store) when connecting to a beacon node over HTTPS.
## Additional Info
This only supports certificates in PEM format.
## Proposed Changes
Instead of checking for strict equality between a BN's spec and the VC's local spec, just check the genesis fork version. This prevents us from failing eagerly for minor differences, while still protecting the VC from connecting to a completely incompatible BN.
A warning is retained for the previous case where the specs are not exactly equal, which is to be expected if e.g. running against Infura before Infura configures the mainnet Altair fork epoch.
[EIP-3030]: https://eips.ethereum.org/EIPS/eip-3030
[Web3Signer]: https://consensys.github.io/web3signer/web3signer-eth2.html
## Issue Addressed
Resolves#2498
## Proposed Changes
Allows the VC to call out to a [Web3Signer] remote signer to obtain signatures.
## Additional Info
### Making Signing Functions `async`
To allow remote signing, I needed to make all the signing functions `async`. This caused a bit of noise where I had to convert iterators into `for` loops.
In `duties_service.rs` there was a particularly tricky case where we couldn't hold a write-lock across an `await`, so I had to first take a read-lock, then grab a write-lock.
### Move Signing from Core Executor
Whilst implementing this feature, I noticed that we signing was happening on the core tokio executor. I suspect this was causing the executor to temporarily lock and occasionally trigger some HTTP timeouts (and potentially SQL pool timeouts, but I can't verify this). Since moving all signing into blocking tokio tasks, I noticed a distinct drop in the "atttestations_http_get" metric on a Prater node:
![http_get_times](https://user-images.githubusercontent.com/6660660/132143737-82fd3836-2e7e-445b-a143-cb347783baad.png)
I think this graph indicates that freeing the core executor allows the VC to operate more smoothly.
### Refactor TaskExecutor
I noticed that the `TaskExecutor::spawn_blocking_handle` function would fail to spawn tasks if it were unable to obtain handles to some metrics (this can happen if the same metric is defined twice). It seemed that a more sensible approach would be to keep spawning tasks, but without metrics. To that end, I refactored the function so that it would still function without metrics. There are no other changes made.
## TODO
- [x] Restructure to support multiple signing methods.
- [x] Add calls to remote signer from VC.
- [x] Documentation
- [x] Test all endpoints
- [x] Test HTTPS certificate
- [x] Allow adding remote signer validators via the API
- [x] Add Altair support via [21.8.1-rc1](https://github.com/ConsenSys/web3signer/releases/tag/21.8.1-rc1)
- [x] Create issue to start using latest version of web3signer. (See #2570)
## Notes
- ~~Web3Signer doesn't yet support the Altair fork for Prater. See https://github.com/ConsenSys/web3signer/issues/423.~~
- ~~There is not yet a release of Web3Signer which supports Altair blocks. See https://github.com/ConsenSys/web3signer/issues/391.~~
## Issue Addressed
Resolves#2438Resolves#2437
## Proposed Changes
Changes the permissions for validator client http server api token file and secret key to 600 from 644. Also changes the permission for logfiles generated using the `--logfile` cli option to 600.
Logs the path to the api token instead of the actual api token. Updates docs to reflect the change.
## Issue Addressed
Related to: #2259
Made an attempt at all the necessary updates here to publish the crates to crates.io. I incremented the minor versions on all the crates that have been previously published. We still might run into some issues as we try to publish because I'm not able to test this out but I think it's a good starting point.
## Proposed Changes
- Add description and license to `ssz_types` and `serde_util`
- rename `serde_util` to `eth2_serde_util`
- increment minor versions
- remove path dependencies
- remove patch dependencies
## Additional Info
Crates published:
- [x] `tree_hash` -- need to publish `tree_hash_derive` and `eth2_hashing` first
- [x] `eth2_ssz_types` -- need to publish `eth2_serde_util` first
- [x] `tree_hash_derive`
- [x] `eth2_ssz`
- [x] `eth2_ssz_derive`
- [x] `eth2_serde_util`
- [x] `eth2_hashing`
Co-authored-by: realbigsean <seananderson33@gmail.com>
## Issue Addressed
NA
## Proposed Changes
- Adds docs for Doppelganger Protection
- Shortens a log message since it was a bit longer than our usual formatting.
## Additional Info
Please provide any additional information. For example, future considerations
or information useful for reviewers.
## Issue Addressed
NA
## Proposed Changes
Registers validators with the doppelganger service at the earliest possible point.
This avoids the following (non-harmful, but scary) log when pruning the slashing DB on startup:
```
CRIT Validator unknown to doppelganger service, pubkey: 0xabc..., msg: preventing validator from performing duties, service: doppelganger
```
## Additional Info
NA
## Proposed Changes
* Implement the validator client and HTTP API changes necessary to support Altair
Co-authored-by: realbigsean <seananderson33@gmail.com>
Co-authored-by: Michael Sproul <michael@sigmaprime.io>
## Issue Addressed
N/A
## Proposed Changes
This is just a cosmetic change to print only the unique list of violaters. We could repeat violaters in the list if an attestation and aggregation both were detected from the same validator.
## Issue Addressed
NA
## Proposed Changes
Fixes a bug in Doppelganger Protection which would cause false-positives when a VC is restarted in the same epoch where it has already produced a signed message.
It could also cause a false-negative in the scenario where time skips forward (perhaps due to host suspend/wake). The new `time_skips_forward_with_doppelgangers` test covers this case.
This was a simple (and embarrassing, on my behalf) `>=` instead of `<=` bug that was missed by my tests but detected during manual testing by @michaelsproul (🙏). Regression tests have been added.
## Additional Info
NA
## TODO
- [x] Add test for doppelganger in epoch > next_check_epoch
## Issue Addressed
Resolves#2069
## Proposed Changes
- Adds a `--doppelganger-detection` flag
- Adds a `lighthouse/seen_validators` endpoint, which will make it so the lighthouse VC is not interopable with other client beacon nodes if the `--doppelganger-detection` flag is used, but hopefully this will become standardized. Relevant Eth2 API repo issue: https://github.com/ethereum/eth2.0-APIs/issues/64
- If the `--doppelganger-detection` flag is used, the VC will wait until the beacon node is synced, and then wait an additional 2 epochs. The reason for this is to make sure the beacon node is able to subscribe to the subnets our validators should be attesting on. I think an alternative would be to have the beacon node subscribe to all subnets for 2+ epochs on startup by default.
## Additional Info
I'd like to add tests and would appreciate feedback.
TODO: handle validators started via the API, potentially make this default behavior
Co-authored-by: realbigsean <seananderson33@gmail.com>
Co-authored-by: Michael Sproul <michael@sigmaprime.io>
Co-authored-by: Paul Hauner <paul@paulhauner.com>
## Issue Addressed
N/A
## Proposed Changes
- Removing a bunch of unnecessary references
- Updated `Error::VariantError` to `Error::Variant`
- There were additional enum variant lints that I ignored, because I thought our variant names were fine
- removed `MonitoredValidator`'s `pubkey` field, because I couldn't find it used anywhere. It looks like we just use the string version of the pubkey (the `id` field) if there is no index
## Additional Info
Co-authored-by: realbigsean <seananderson33@gmail.com>
## Issue Addressed
#2454
## Proposed Changes
Adds the `--http-address` flag to allow the user to use custom HTTP addresses. This can be helpful for certain Docker setups.
Since using custom HTTP addresses is unsafe due to the server being unencrypted, `--unencrypted-http-transport` was also added as a safety flag and must be used in tandem with `--http-address`. This is to ensure the user is aware of the risks associated with using non-local HTTP addresses.
## Issue Addressed
Resolves#2313
## Proposed Changes
Provide `BeaconNodeHttpClient` with a dedicated `Timeouts` struct.
This will allow granular adjustment of the timeout duration for different calls made from the VC to the BN. These can either be a constant value, or as a ratio of the slot duration.
Improve timeout performance by using these adjusted timeout duration's only whenever a fallback endpoint is available.
Add a CLI flag called `use-long-timeouts` to revert to the old behavior.
## Additional Info
Additionally set the default `BeaconNodeHttpClient` timeouts to the be the slot duration of the network, rather than a constant 12 seconds. This will allow it to adjust to different network specifications.
Co-authored-by: Paul Hauner <paul@paulhauner.com>
## Proposed Changes
Implement the consensus changes necessary for the upcoming Altair hard fork.
## Additional Info
This is quite a heavy refactor, with pivotal types like the `BeaconState` and `BeaconBlock` changing from structs to enums. This ripples through the whole codebase with field accesses changing to methods, e.g. `state.slot` => `state.slot()`.
Co-authored-by: realbigsean <seananderson33@gmail.com>
## Issue Addressed
`make lint` failing on rust 1.53.0.
## Proposed Changes
1.53.0 updates
## Additional Info
I haven't figure out why yet, we were now hitting the recursion limit in a few crates. So I had to add `#![recursion_limit = "256"]` in a few places
Co-authored-by: realbigsean <seananderson33@gmail.com>
Co-authored-by: Michael Sproul <michael@sigmaprime.io>
## Issue Addressed
Windows incompatibility.
## Proposed Changes
On windows, lighthouse needs to default to STDIN as tty doesn't exist. Also Windows uses ACLs for file permissions. So to mirror chmod 600, we will remove every entry in a file's ACL and add only a single SID that is an alias for the file owner.
Beyond that, there were several changes made to different unit tests because windows has slightly different error messages as well as frustrating nuances around killing a process :/
## Additional Info
Tested on my Windows VM and it appears to work, also compiled & tested on Linux with these changes. Permissions look correct on both platforms now. Just waiting for my validator to activate on Prater so I can test running full validator client on windows.
Co-authored-by: ethDreamer <37123614+ethDreamer@users.noreply.github.com>
Co-authored-by: Michael Sproul <micsproul@gmail.com>
## Issue Addressed
The latest version of Rust has new clippy rules & the codebase isn't up to date with them.
## Proposed Changes
Small formatting changes that clippy tells me are functionally equivalent
## Issue Addressed
#2276
## Proposed Changes
Add the `SensitiveUrl` struct which wraps `Url` and implements custom `Display` and `Debug` traits to redact user secrets from being logged in eth1 endpoints, beacon node endpoints and metrics.
## Additional Info
This also includes a small rewrite of the eth1 crate to make requests using `Url` instead of `&str`.
Some error messages have also been changed to remove `Url` data.
## Issue Addressed
NA
## Proposed Changes
Fixes a compile error when using the `milagro` feature. I can't see any need to use the specific BLST object here. @pawanjay176 can you please confirm?
## Additional Info
NA