lighthouse

Author	SHA1	Message	Date
Mac L	7c23e2142a	Allow custom certificates when connecting to BN (#2703 ) ## Issue Addressed Resolves #2262 ## Proposed Changes Add a new CLI flag `--beacon-nodes-tls-certs` which allows the user to specify a path to a certificate file (or a list of files, separated by commas). The VC will then use these certificates (in addition to the existing certificates in the OS trust store) when connecting to a beacon node over HTTPS. ## Additional Info This only supports certificates in PEM format.	2021-10-15 00:07:11 +00:00
Michael Sproul	0a77d783a4	Make slashing protection import more resilient (#2598 ) ## Issue Addressed Closes #2419 ## Proposed Changes Address a long-standing issue with the import of slashing protection data where the import would fail due to the data appearing slashable w.r.t the existing database. Importing is now idempotent, and will have no issues importing data that has been handed back and forth between different validator clients, or different implementations. The implementation works by updating the high and low watermarks if they need updating, and not attempting to check if the input is slashable w.r.t itself or the database. This is a strengthening of the minification that we started to do by default since #2380, and what Teku has been doing since the beginning. ## Additional Info The only feature we lose by doing this is the ability to do non-minified imports of clock drifted messages (cf. Prysm on Medalla). In theory, with the previous implementation we could import all the messages in case of clock drift and be aware of the "gap" between the real present time and the messages signed in the far future. _However_ for attestations this is close to useless, as the source epoch will advance as soon as justification occurs, which will require us to make slashable attestations with respect to our bogus attestation(s). E.g. if I sign an attestation 100=>200 when the current epoch is 101, then I won't be able to vote in any epochs prior to 101 becoming justified because 101=>102, 101=>103, etc are all surrounded by 100=>200. Seeing as signing attestations gets blocked almost immediately in this case regardless of our import behaviour, there's no point trying to handle it. For blocks the situation is more hopeful due to the lack of surrounds, but losing block proposals from validators who by definition can't attest doesn't seem like an issue (the other block proposers can pick up the slack).	2021-10-13 01:49:51 +00:00
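The surround example in the commit above (100=>200 blocking 101=>102, 101=>103, and so on) can be checked with a few lines of Rust. This is only an illustrative sketch: the `Vote` type and the `surrounds` function are hypothetical names chosen for this example and are not taken from the slashing protection crate.

```rust
/// A (source, target) attestation vote, with both fields given as epoch numbers.
/// Hypothetical type used only for this illustration.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Vote {
    pub source: u64,
    pub target: u64,
}

/// Returns true if `outer` surrounds `inner`, i.e.
/// `outer.source < inner.source` and `inner.target < outer.target`.
pub fn surrounds(outer: Vote, inner: Vote) -> bool {
    outer.source < inner.source && inner.target < outer.target
}
```

With a bogus vote `Vote { source: 100, target: 200 }` already signed, `surrounds` returns true for any later vote whose source is 101 and whose target is below 200, which is why attesting is blocked almost immediately in the clock drift scenario.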
Mac L	a73d698e30	Add TLS capability to the beacon node HTTP API (#2668 ) Currently, the beacon node has no ability to serve the HTTP API over TLS. Adding this functionality would be helpful for certain use cases, such as when you need a validator client to connect to a backup beacon node which is outside your local network, and the use of an SSH tunnel or reverse proxy would be inappropriate. ## Proposed Changes - Add three new CLI flags to the beacon node - `--http-enable-tls`: enables TLS - `--http-tls-cert`: to specify the path to the certificate file - `--http-tls-key`: to specify the path to the key file - Update the HTTP API to optionally use `warp`'s [`TlsServer`](https://docs.rs/warp/0.3.1/warp/struct.TlsServer.html) depending on the presence of the `--http-enable-tls` flag - Update tests and docs - Use a custom branch for `warp` to ensure proper error handling ## Additional Info Serving the API over TLS should currently be considered experimental. The reason for this is that it uses code from an [unmerged PR](https://github.com/seanmonstar/warp/pull/717). This commit provides the `try_bind_with_graceful_shutdown` method to `warp`, which is helpful for controlling error flow when the TLS configuration is invalid (cert/key files don't exist, incorrect permissions, etc). I've implemented the same code in my [branch here](https://github.com/macladson/warp/tree/tls). Once the code has been reviewed and merged upstream into `warp`, we can remove the dependency on my branch and the feature can be considered more stable. Currently, the private key file must not be password-protected in order to be read into Lighthouse.	2021-10-12 03:35:49 +00:00
Squirrel	db4d72c4f1	Remove unused deps (#2592 ) Found some deps you're possibly not using. Please shout if you think they are indeed still needed.	2021-09-30 04:31:42 +00:00
Michael Sproul	c0122e1a52	Refine VC->BN config check (#2636 ) ## Proposed Changes Instead of checking for strict equality between a BN's spec and the VC's local spec, just check the genesis fork version. This prevents us from failing eagerly for minor differences, while still protecting the VC from connecting to a completely incompatible BN. A warning is retained for the previous case where the specs are not exactly equal, which is to be expected if e.g. running against Infura before Infura configures the mainnet Altair fork epoch.	2021-09-27 04:22:07 +00:00
Paul Hauner	fe52322088	Implement SSZ union type (#2579 ) ## Issue Addressed NA ## Proposed Changes Implements the "union" type from the SSZ spec for `ssz`, `ssz_derive`, `tree_hash` and `tree_hash_derive` so it may be derived for `enums`: https://github.com/ethereum/consensus-specs/blob/v1.1.0-beta.3/ssz/simple-serialize.md#union The union type is required for the merge, since the `Transaction` type is defined as a single-variant union `Union[OpaqueTransaction]`. ### Crate Updates This PR will (hopefully) cause CI to publish new versions for the following crates: - `eth2_ssz_derive`: `0.2.1` -> `0.3.0` - `eth2_ssz`: `0.3.0` -> `0.4.0` - `eth2_ssz_types`: `0.2.0` -> `0.2.1` - `tree_hash`: `0.3.0` -> `0.4.0` - `tree_hash_derive`: `0.3.0` -> `0.4.0` Since these crates depend on each other, I've had to add a workspace-level `[patch]` for these crates. A follow-up PR will need to remove this patch, once the new versions are published. ### Union Behaviors We already had SSZ `Encode` and `TreeHash` derive for enums, however it just did a "transparent" pass-through of the inner value. Since the "union" decoding from the spec is in conflict with the transparent method, I've required that all `enum` have exactly one of the following enum-level attributes: #### SSZ - `#[ssz(enum_behaviour = "union")]` - matches the spec used for the merge - `#[ssz(enum_behaviour = "transparent")]` - maintains existing functionality - not supported for `Decode` (never was) #### TreeHash - `#[tree_hash(enum_behaviour = "union")]` - matches the spec used for the merge - `#[tree_hash(enum_behaviour = "transparent")]` - maintains existing functionality This means that we can maintain the existing transparent behaviour, but all existing users will get a compile-time error until they explicitly opt-in to being transparent. ### Legacy Option Encoding Before this PR, we already had a union-esque encoding for `Option<T>`. However, this was with the old SSZ spec where the union selector was 4 bytes. 
During merge specification, the spec was changed to use 1 byte for the selector. Whilst the 4-byte `Option` encoding was never used in the spec, we used it in our database. Writing a migrate script for all occurrences of `Option` in the database would be painful, especially since it's used in the `CommitteeCache`. To avoid the migrate script, I added a serde-esque `#[ssz(with = "module")]` field-level attribute to `ssz_derive` so that we can opt into the 4-byte encoding on a field-by-field basis. The `ssz::legacy::four_byte_impl!` macro allows a one-liner to define the module required for the `#[ssz(with = "module")]` for some `Option<T> where T: Encode + Decode`. Notably, I have removed `Encode` and `Decode` impls for `Option`. I've done this to force a break on downstream users. Like I mentioned, `Option` isn't used in the spec so I don't think it'll be that annoying. I think it's nicer than quietly having two different union implementations or quietly breaking the existing `Option` impl. ### Crate Publish Ordering I've modified the order in which CI publishes crates to ensure that we don't publish a crate without ensuring we already published a crate that it depends upon. ## TODO - [ ] Queue a follow-up `[patch]`-removing PR.	2021-09-25 05:58:36 +00:00
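To make the difference between the two selector widths concrete, here is a rough sketch of a spec-style union encoding (1-byte selector) next to a legacy-style encoding (4-byte selector). The function names are hypothetical and are not part of `eth2_ssz`, and the little-endian layout of the 4-byte selector is an assumption based on SSZ's general use of little-endian integers.

```rust
/// Spec-style union encoding: a 1-byte selector followed by the
/// already-serialized bytes of the selected variant.
/// Hypothetical helper, not the `eth2_ssz` API.
pub fn encode_union(selector: u8, value_bytes: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(1 + value_bytes.len());
    out.push(selector);
    out.extend_from_slice(value_bytes);
    out
}

/// Legacy-style encoding: a 4-byte selector followed by the value bytes.
/// Assumption: the selector is written little-endian.
pub fn encode_union_legacy(selector: u32, value_bytes: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(4 + value_bytes.len());
    out.extend_from_slice(&selector.to_le_bytes());
    out.extend_from_slice(value_bytes);
    out
}
```

The two layouts differ by three bytes per value, which is why already-stored `Option` fields cannot be read back with the new 1-byte encoding without either a migration or a per-field opt-in such as `#[ssz(with = "module")]`.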
Paul Hauner	c5c7476518	Web3Signer support for VC (#2522 ) [EIP-3030]: https://eips.ethereum.org/EIPS/eip-3030 [Web3Signer]: https://consensys.github.io/web3signer/web3signer-eth2.html ## Issue Addressed Resolves #2498 ## Proposed Changes Allows the VC to call out to a [Web3Signer] remote signer to obtain signatures. ## Additional Info ### Making Signing Functions `async` To allow remote signing, I needed to make all the signing functions `async`. This caused a bit of noise where I had to convert iterators into `for` loops. In `duties_service.rs` there was a particularly tricky case where we couldn't hold a write-lock across an `await`, so I had to first take a read-lock, then grab a write-lock. ### Move Signing from Core Executor Whilst implementing this feature, I noticed that signing was happening on the core tokio executor. I suspect this was causing the executor to temporarily lock and occasionally trigger some HTTP timeouts (and potentially SQL pool timeouts, but I can't verify this). Since moving all signing into blocking tokio tasks, I noticed a distinct drop in the "atttestations_http_get" metric on a Prater node: ![http_get_times](https://user-images.githubusercontent.com/6660660/132143737-82fd3836-2e7e-445b-a143-cb347783baad.png) I think this graph indicates that freeing the core executor allows the VC to operate more smoothly. ### Refactor TaskExecutor I noticed that the `TaskExecutor::spawn_blocking_handle` function would fail to spawn tasks if it were unable to obtain handles to some metrics (this can happen if the same metric is defined twice). It seemed that a more sensible approach would be to keep spawning tasks, but without metrics. To that end, I refactored the function so that it would still function without metrics. There are no other changes made. ## TODO - [x] Restructure to support multiple signing methods. - [x] Add calls to remote signer from VC. 
- [x] Documentation - [x] Test all endpoints - [x] Test HTTPS certificate - [x] Allow adding remote signer validators via the API - [x] Add Altair support via [21.8.1-rc1](https://github.com/ConsenSys/web3signer/releases/tag/21.8.1-rc1) - [x] Create issue to start using latest version of web3signer. (See #2570) ## Notes - ~~Web3Signer doesn't yet support the Altair fork for Prater. See https://github.com/ConsenSys/web3signer/issues/423.~~ - ~~There is not yet a release of Web3Signer which supports Altair blocks. See https://github.com/ConsenSys/web3signer/issues/391.~~	2021-09-16 03:26:33 +00:00
Pawan Dhananjay	6f18f95893	Update file permissions (#2499 ) ## Issue Addressed Resolves #2438 Resolves #2437 ## Proposed Changes Changes the permissions for validator client http server api token file and secret key to 600 from 644. Also changes the permission for logfiles generated using the `--logfile` cli option to 600. Logs the path to the api token instead of the actual api token. Updates docs to reflect the change.	2021-09-03 02:41:10 +00:00
realbigsean	50321c6671	Updates to make crates publishable (#2472 ) ## Issue Addressed Related to: #2259 Made an attempt at all the necessary updates here to publish the crates to crates.io. I incremented the minor versions on all the crates that have been previously published. We still might run into some issues as we try to publish because I'm not able to test this out but I think it's a good starting point. ## Proposed Changes - Add description and license to `ssz_types` and `serde_util` - rename `serde_util` to `eth2_serde_util` - increment minor versions - remove path dependencies - remove patch dependencies ## Additional Info Crates published: - [x] `tree_hash` -- need to publish `tree_hash_derive` and `eth2_hashing` first - [x] `eth2_ssz_types` -- need to publish `eth2_serde_util` first - [x] `tree_hash_derive` - [x] `eth2_ssz` - [x] `eth2_ssz_derive` - [x] `eth2_serde_util` - [x] `eth2_hashing` Co-authored-by: realbigsean <seananderson33@gmail.com>	2021-09-03 01:10:25 +00:00
ladidan	beab306e07	Fix log output for INFO Found no doppelganger (#2551 ) ## Issue Addressed The log output "INFO Found no doppelganger validator_index: 11111, epoch: 11111, further_checks_remaining: 0, service: doppelganger" shows the validator index in the `epoch` field (validator_index = epoch). ## Proposed Changes Log the current epoch in the `epoch` field (epoch = current epoch).	2021-08-29 23:29:47 +00:00
Paul Hauner	12fe72bd37	Always require auth header in VC (#2517 ) ## Issue Addressed - Resolves #2512 ## Proposed Changes Enforces that all routes require an auth token for the VC. ## TODO - [x] Tests	2021-08-18 01:31:28 +00:00
Michael Sproul	c0a2f501d9	Upgrade dependencies (#2513 ) ## Proposed Changes * Consolidate Tokio versions: everything now uses the latest v1.10.0, no more `tokio-compat`. * Many semver-compatible changes via `cargo update`. Notably this upgrades from the yanked v0.8.0 version of crossbeam-deque which is present in v1.5.0-rc.0 * Many semver incompatible upgrades via `cargo upgrades` and `cargo upgrade --workspace pkg_name`. Notable omissions: - Prometheus, to be handled separately: https://github.com/sigp/lighthouse/issues/2485 - `rand`, `rand_xorshift`: the libsecp256k1 package requires 0.7.x, so we'll stick with that for now - `ethereum-types` is pinned at 0.11.0 because that's what `web3` is using and it seems nice to have just a single version ## Additional Info We still have two versions of `libp2p-core` due to `discv5` depending on the v0.29.0 release rather than `master`. AFAIK it should be OK to release in this state (cc @AgeManning )	2021-08-17 01:00:24 +00:00
Paul Hauner	ff85b05249	Add docs for doppelganger protection (#2496 ) ## Issue Addressed NA ## Proposed Changes - Adds docs for Doppelganger Protection - Shortens a log message since it was a bit longer than our usual formatting. ## Additional Info Please provide any additional information. For example, future considerations or information useful for reviewers.	2021-08-06 02:13:16 +00:00
Paul Hauner	71ab16e404	Register vals with doppelganger earlier (#2494 ) ## Issue Addressed NA ## Proposed Changes Registers validators with the doppelganger service at the earliest possible point. This avoids the following (non-harmful, but scary) log when pruning the slashing DB on startup: ``` CRIT Validator unknown to doppelganger service, pubkey: 0xabc..., msg: preventing validator from performing duties, service: doppelganger ``` ## Additional Info NA	2021-08-06 02:13:15 +00:00
Michael Sproul	17a2c778e3	Altair validator client and HTTP API (#2404 ) ## Proposed Changes * Implement the validator client and HTTP API changes necessary to support Altair Co-authored-by: realbigsean <seananderson33@gmail.com> Co-authored-by: Michael Sproul <michael@sigmaprime.io>	2021-08-06 00:47:31 +00:00
Pawan Dhananjay	350b6f19de	Print only unique doppelgangers (#2500 ) ## Issue Addressed N/A ## Proposed Changes This is just a cosmetic change to print only the unique list of violators. We could repeat violators in the list if both an attestation and an aggregation were detected from the same validator.	2021-08-05 22:27:40 +00:00
Paul Hauner	6a620a31da	Fix starting-epoch check in doppelganger (#2491 ) ## Issue Addressed NA ## Proposed Changes Fixes a bug in Doppelganger Protection which would cause false-positives when a VC is restarted in the same epoch where it has already produced a signed message. It could also cause a false-negative in the scenario where time skips forward (perhaps due to host suspend/wake). The new `time_skips_forward_with_doppelgangers` test covers this case. This was a simple (and embarrassing, on my behalf) `>=` instead of `<=` bug that was missed by my tests but detected during manual testing by @michaelsproul (🙏). Regression tests have been added. ## Additional Info NA ## TODO - [x] Add test for doppelganger in epoch > next_check_epoch	2021-08-04 00:03:47 +00:00
realbigsean	c5786a8821	Doppelganger detection (#2230 ) ## Issue Addressed Resolves #2069 ## Proposed Changes - Adds a `--doppelganger-detection` flag - Adds a `lighthouse/seen_validators` endpoint, which will make it so the lighthouse VC is not interoperable with other client beacon nodes if the `--doppelganger-detection` flag is used, but hopefully this will become standardized. Relevant Eth2 API repo issue: https://github.com/ethereum/eth2.0-APIs/issues/64 - If the `--doppelganger-detection` flag is used, the VC will wait until the beacon node is synced, and then wait an additional 2 epochs. The reason for this is to make sure the beacon node is able to subscribe to the subnets our validators should be attesting on. I think an alternative would be to have the beacon node subscribe to all subnets for 2+ epochs on startup by default. ## Additional Info I'd like to add tests and would appreciate feedback. TODO: handle validators started via the API, potentially make this default behavior Co-authored-by: realbigsean <seananderson33@gmail.com> Co-authored-by: Michael Sproul <michael@sigmaprime.io> Co-authored-by: Paul Hauner <paul@paulhauner.com>	2021-07-31 03:50:52 +00:00
realbigsean	303deb9969	Rust 1.54.0 lints (#2483 ) ## Issue Addressed N/A ## Proposed Changes - Removing a bunch of unnecessary references - Updated `Error::VariantError` to `Error::Variant` - There were additional enum variant lints that I ignored, because I thought our variant names were fine - removed `MonitoredValidator`'s `pubkey` field, because I couldn't find it used anywhere. It looks like we just use the string version of the pubkey (the `id` field) if there is no index ## Additional Info Co-authored-by: realbigsean <seananderson33@gmail.com>	2021-07-30 01:11:47 +00:00
Mac L	17b6d7ce86	Add `http-address` flag to VC (#2467 ) ## Issue Addressed #2454 ## Proposed Changes Adds the `--http-address` flag to allow the user to use custom HTTP addresses. This can be helpful for certain Docker setups. Since using custom HTTP addresses is unsafe due to the server being unencrypted, `--unencrypted-http-transport` was also added as a safety flag and must be used in tandem with `--http-address`. This is to ensure the user is aware of the risks associated with using non-local HTTP addresses.	2021-07-21 07:10:51 +00:00
Age Manning	c1d2e35c9e	Bleeding edge discovery (#2435 ) * Update discovery banning logic and tokio * Update to latest discovery * Shift to latest discovery * Fmt	2021-07-15 16:43:17 +10:00
Michael Sproul	8fa6e463ca	Update direct libsecp256k1 dependencies (#2456 ) ## Proposed Changes * Remove direct dependencies on vulnerable `libsecp256k1 0.3.5` * Ignore the RUSTSEC issue until it is resolved in #2389	2021-07-14 05:24:10 +00:00
Mac L	b3c7e59a5b	Adjust beacon node timeouts for validator client HTTP requests (#2352 ) ## Issue Addressed Resolves #2313 ## Proposed Changes Provide `BeaconNodeHttpClient` with a dedicated `Timeouts` struct. This will allow granular adjustment of the timeout duration for different calls made from the VC to the BN. These can be either a constant value or a ratio of the slot duration. Improve timeout performance by using these adjusted timeout durations only when a fallback endpoint is available. Add a CLI flag called `use-long-timeouts` to revert to the old behavior. ## Additional Info Additionally set the default `BeaconNodeHttpClient` timeouts to be the slot duration of the network, rather than a constant 12 seconds. This will allow it to adjust to different network specifications. Co-authored-by: Paul Hauner <paul@paulhauner.com>	2021-07-12 01:47:48 +00:00
Michael Sproul	b4689e20c6	Altair consensus changes and refactors (#2279 ) ## Proposed Changes Implement the consensus changes necessary for the upcoming Altair hard fork. ## Additional Info This is quite a heavy refactor, with pivotal types like the `BeaconState` and `BeaconBlock` changing from structs to enums. This ripples through the whole codebase with field accesses changing to methods, e.g. `state.slot` => `state.slot()`. Co-authored-by: realbigsean <seananderson33@gmail.com>	2021-07-09 06:15:32 +00:00
Paul Hauner	78e5c0c157	Capture a missed VC error (#2436 ) ## Issue Addressed Related to #2430, #2394 ## Proposed Changes As per https://github.com/sigp/lighthouse/issues/2430#issuecomment-875323615, ensure that the `ProductionValidatorClient::new` error raises a log and shuts down the VC. Also, I implemented `spawn_ignoring_error`, as per @michaelsproul's suggestion in https://github.com/sigp/lighthouse/pull/2436#issuecomment-876084419. I got unlucky and CI picked up a [new rustsec vuln](https://rustsec.org/advisories/RUSTSEC-2021-0072). To fix this, I had to update the following crates: - `tokio` - `web3` - `tokio-compat-02` ## Additional Info NA	2021-07-09 03:20:24 +00:00
Michael Sproul	6583ce325b	Minify slashing protection interchange data (#2380 ) ## Issue Addressed Closes #2354 ## Proposed Changes Add a `minify` method to `slashing_protection::Interchange` that keeps only the maximum-epoch attestation and maximum-slot block for each validator. Specifically, `minify` constructs "synthetic" attestations (with no `signing_root`) containing the maximum source epoch _and_ the maximum target epoch from the input. This is equivalent to the `minify_synth` algorithm that I've formally verified in this repository: https://github.com/michaelsproul/slashing-proofs ## Additional Info Includes the JSON loading optimisation from #2347	2021-06-21 05:46:36 +00:00
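As a rough illustration of the attestation half of `minify` described above, the sketch below collapses a list of attestations into one synthetic attestation carrying the maximum source epoch and the maximum target epoch, with no signing root. The `SignedAttestation` struct and `minify_attestations` function are simplified, hypothetical stand-ins and not the types used by `slashing_protection::Interchange`.

```rust
/// Simplified stand-in for an interchange attestation record.
/// Hypothetical type used only for this illustration.
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct SignedAttestation {
    pub source_epoch: u64,
    pub target_epoch: u64,
    pub signing_root: Option<[u8; 32]>,
}

/// Collapse `atts` into a single synthetic attestation that has the maximum
/// source epoch and the maximum target epoch seen in the input.
/// Returns `None` when the input is empty.
pub fn minify_attestations(atts: &[SignedAttestation]) -> Option<SignedAttestation> {
    let max_source = atts.iter().map(|a| a.source_epoch).max()?;
    let max_target = atts.iter().map(|a| a.target_epoch).max()?;
    Some(SignedAttestation {
        source_epoch: max_source,
        target_epoch: max_target,
        // The synthetic record does not correspond to any real signed message.
        signing_root: None,
    })
}
```

Note that the two maxima may come from different input attestations, which is what makes the result "synthetic" rather than a copy of an existing record.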
realbigsean	b84ff9f793	rust 1.53.0 updates (#2411 ) ## Issue Addressed `make lint` failing on rust 1.53.0. ## Proposed Changes 1.53.0 updates ## Additional Info I haven't figured out why yet, but we are now hitting the recursion limit in a few crates, so I had to add `#![recursion_limit = "256"]` in a few places. Co-authored-by: realbigsean <seananderson33@gmail.com> Co-authored-by: Michael Sproul <michael@sigmaprime.io>	2021-06-18 05:58:01 +00:00
Pawan Dhananjay	fdaeec631b	Monitoring service api (#2251 ) ## Issue Addressed N/A ## Proposed Changes Adds a client side api for collecting system and process metrics and pushing it to a monitoring service.	2021-05-26 05:58:41 +00:00
ethDreamer	ba55e140ae	Enable Compatibility with Windows (#2333 ) ## Issue Addressed Windows incompatibility. ## Proposed Changes On windows, lighthouse needs to default to STDIN as tty doesn't exist. Also Windows uses ACLs for file permissions. So to mirror chmod 600, we will remove every entry in a file's ACL and add only a single SID that is an alias for the file owner. Beyond that, there were several changes made to different unit tests because windows has slightly different error messages as well as frustrating nuances around killing a process :/ ## Additional Info Tested on my Windows VM and it appears to work, also compiled & tested on Linux with these changes. Permissions look correct on both platforms now. Just waiting for my validator to activate on Prater so I can test running full validator client on windows. Co-authored-by: ethDreamer <37123614+ethDreamer@users.noreply.github.com> Co-authored-by: Michael Sproul <micsproul@gmail.com>	2021-05-19 23:05:16 +00:00
Michael Sproul	58e52f8f40	Write validator definitions atomically (#2338 ) ## Issue Addressed Closes https://github.com/sigp/lighthouse/issues/2159 ## Proposed Changes Rather than trying to write the validator definitions to disk directly, use a temporary file called `.validator_defintions.yml.tmp` and then atomically rename it to `validator_definitions.yml`. This avoids truncating the primary file, which can cause permanent damage when the disk is full. The same treatment is also applied to the validator key cache, although the situation is less dire if it becomes corrupted because it can just be deleted without the user having to reimport keys or resupply passwords. ## Additional Info * `File::create` truncates upon opening: https://doc.rust-lang.org/std/fs/struct.File.html#method.create * `fs::rename` uses `rename` on UNIX and `MoveFileEx` on Windows: https://doc.rust-lang.org/std/fs/fn.rename.html * UNIX `rename` call is atomic: https://unix.stackexchange.com/questions/322038/is-mv-atomic-on-my-fs * Windows `MoveFileEx` is _not_ atomic in general, and Windows lacks any clear API for atomic file renames :( https://stackoverflow.com/questions/167414/is-an-atomic-file-rename-with-overwrite-possible-on-windows ## Further Work * Consider whether we want to try a different Windows syscall as part of #2333. The `rust-atomicwrites` crate seems promising, but actually uses the same syscall under the hood presently: https://github.com/untitaker/rust-atomicwrites/issues/27.	2021-05-12 02:04:44 +00:00
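The write-to-temporary-then-rename pattern described above can be sketched with the standard library alone. This is a minimal illustration and not the code from the commit: the `write_atomically` function and its temporary file naming scheme are hypothetical. As the commit notes, `fs::rename` is atomic on UNIX but not in general on Windows.

```rust
use std::fs::{self, File};
use std::io::{self, Write};
use std::path::Path;

/// Write `contents` to `dir/file_name` by first writing a hidden temporary
/// file in the same directory and then renaming it over the target.
/// Hypothetical helper for illustration only.
pub fn write_atomically(dir: &Path, file_name: &str, contents: &[u8]) -> io::Result<()> {
    let tmp_path = dir.join(format!(".{}.tmp", file_name));
    let final_path = dir.join(file_name);
    {
        // `File::create` truncates, but only the temporary file is affected,
        // so the primary file stays intact if the write fails (e.g. disk full).
        let mut tmp = File::create(&tmp_path)?;
        tmp.write_all(contents)?;
        tmp.sync_all()?;
    }
    // Replace the primary file in a single rename step.
    fs::rename(&tmp_path, &final_path)
}
```

Keeping the temporary file in the same directory as the target matters, because a rename across filesystems is not a simple rename and can fail or lose atomicity.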
ethDreamer	cb47388ad7	Updated to comply with new clippy formatting rules (#2336 ) ## Issue Addressed The latest version of Rust has new clippy rules & the codebase isn't up to date with them. ## Proposed Changes Small formatting changes that clippy tells me are functionally equivalent	2021-05-10 00:53:09 +00:00
Mac L	4cc613d644	Add `SensitiveUrl` to redact user secrets from endpoints (#2326 ) ## Issue Addressed #2276 ## Proposed Changes Add the `SensitiveUrl` struct which wraps `Url` and implements custom `Display` and `Debug` traits to redact user secrets from being logged in eth1 endpoints, beacon node endpoints and metrics. ## Additional Info This also includes a small rewrite of the eth1 crate to make requests using `Url` instead of `&str`. Some error messages have also been changed to remove `Url` data.	2021-05-04 01:59:51 +00:00
Paul Hauner	52995ab5f5	Use generic BLS object instead of BLST (#2290 ) ## Issue Addressed NA ## Proposed Changes Fixes a compile error when using the `milagro` feature. I can't see any need to use the specific BLST object here. @pawanjay176 can you please confirm? ## Additional Info NA	2021-04-02 23:34:17 +00:00
Michael Sproul	f9d60f5436	VC: accept unknown fields in chain spec (#2277 ) ## Issue Addressed Closes #2274 ## Proposed Changes * Modify the `YamlConfig` to collect unknown fields into an `extra_fields` map, instead of failing hard. * Log a debug message if there are extra fields returned to the VC from one of its BNs. This restores Lighthouse's compatibility with Teku beacon nodes (and therefore Infura)	2021-03-26 04:53:57 +00:00
Paul Hauner	015ab7d0a7	Optimize validator duties (#2243 ) ## Issue Addressed Closes #2052 ## Proposed Changes - Refactor the attester/proposer duties endpoints in the BN - Performance improvements - Fixes some potential inconsistencies with the dependent root fields. - Removes `http_api::beacon_proposer_cache` and just uses the one on the `BeaconChain` instead. - Move the code for the proposer/attester duties endpoints into separate files, for readability. - Refactor the `DutiesService` in the VC - Required to reduce the delay on broadcasting new blocks. - Gets rid of the `ValidatorDuty` shim struct that came about when we adopted the standard API. - Separate block/attestation duty tasks so that they don't block each other when one is slow. - In the VC, use `PublicKeyBytes` to represent validators instead of `PublicKey`. `PublicKey` is a legit crypto object whilst `PublicKeyBytes` is just a byte-array, it's much faster to clone/hash `PublicKeyBytes` and this change has had a significant impact on runtimes. - Unfortunately this has created lots of dust changes. - In the BN, store `PublicKeyBytes` in the `beacon_proposer_cache` and allow access to them. The HTTP API always sends `PublicKeyBytes` over the wire and the conversion from `PublicKey` -> `PublicKeyBytes` is non-trivial, especially when queries have 100s/1000s of validators (like Pyrmont). - Add the `state_processing::state_advance` mod which dedups a lot of the "apply `n` skip slots to the state" code. - This also fixes a bug with some functions which were failing to include a state root as per [this comment](`072695284f/consensus/state_processing/src/state_advance.rs (L69-L74)`). I couldn't find any instance of this bug that resulted in anything more severe than keying a shuffling cache by the wrong block root. - Swap the VC block service to use `mpsc` from `tokio` instead of `futures`. This is consistent with the rest of the code base. 
~~This PR reduces the size of the codebase 🎉~~ It used to reduce the size of the code base before I added more comments. ## Observations on Pyrmont - Proposer duties times down from peaks of 450ms to consistent <1ms. - Current epoch attester duties times down from >1s peaks to a consistent 20-30ms. - Block production down from +600ms to 100-200ms. ## Additional Info - ~~Blocked on #2241~~ - ~~Blocked on #2234~~ ## TODO - [x] ~~Refactor this into some smaller PRs?~~ Leaving this as-is for now. - [x] Address `per_slot_processing` roots. - [x] Investigate slow next epoch times. Not getting added to cache on block processing? - [x] Consider [this](`072695284f/beacon_node/store/src/hot_cold_store.rs (L811-L812)`) in the scenario of replacing the state roots Co-authored-by: pawan <pawandhananjay@gmail.com> Co-authored-by: Michael Sproul <michael@sigmaprime.io>	2021-03-17 05:09:57 +00:00
Pawan Dhananjay	da8791abd7	Set graffiti per validator (#2044 ) ## Issue Addressed Resolves #1944 ## Proposed Changes Adds a "graffiti" key to the `validator_definitions.yml`. Setting the key will override anything passed through the validator `--graffiti` flag. Returns an error if the value for the graffiti key is > 32 bytes instead of silently truncating.	2021-03-02 22:35:46 +00:00
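The "error instead of silent truncation" behaviour for graffiti longer than 32 bytes might look roughly like the sketch below. The `parse_graffiti` function and `GRAFFITI_BYTES_LEN` constant are hypothetical names for this example, and right-padding shorter values with zero bytes is an assumption made here, not something stated in the commit.

```rust
/// Maximum graffiti length in bytes, as stated in the commit message.
pub const GRAFFITI_BYTES_LEN: usize = 32;

/// Convert a graffiti string into a fixed 32-byte array.
/// Returns an error when the UTF-8 encoding exceeds 32 bytes rather than
/// truncating. Assumption: shorter values are right-padded with zero bytes.
pub fn parse_graffiti(s: &str) -> Result<[u8; GRAFFITI_BYTES_LEN], String> {
    let bytes = s.as_bytes();
    if bytes.len() > GRAFFITI_BYTES_LEN {
        return Err(format!(
            "graffiti is {} bytes, maximum is {}",
            bytes.len(),
            GRAFFITI_BYTES_LEN
        ));
    }
    let mut out = [0u8; GRAFFITI_BYTES_LEN];
    out[..bytes.len()].copy_from_slice(bytes);
    Ok(out)
}
```

The length check is on bytes, not characters, so a string of fewer than 32 characters can still be rejected if it contains multi-byte UTF-8 characters.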
Michael Sproul	afd4786c59	Prune slashing protection DB (#2194 ) ## Proposed Changes Prune the slashing protection database so that it doesn't exhibit unbounded growth. Prune by dropping attestations and blocks from more than 512 epochs ago, relying on the guards that prevent signing messages with slots or epochs less than the minimum recorded in the DB. The pruning process is potentially time consuming, so it's scheduled to run only every 512 epochs, in the last 2/3rds of a slot. This gives it at least 4 seconds to run without impacting other signing, which I think should be sufficient. I've seen it run for several minutes (yikes!) on our Pyrmont nodes, but I suspect that 1) this will only occur on the first run when the database is still huge 2) no other production users will be impacted because they don't have enough validators per node. Pruning also happens at start-up, as I figured this is a fairly infrequent event, and if a user is experiencing problems with the VC related to pruning, it's nice to be able to trigger it with a quick restart. Users are also conditioned to not mind missing a few attestations during a restart. We need to include a note in the release notes that users may see the message `timed out waiting for connection` the first time they prune a huge database, but that this is totally fine and to be expected (the VC will miss those attestations in the meantime). I'm also open to making this opt-in for now, although the sooner we get users doing it, the less painful it will be: prune early, prune often!	2021-02-24 23:51:04 +00:00
Paul Hauner	f8cc82f2b1	Switch back to warp with cors wildcard support (#2211 ) ## Issue Addressed - Resolves #2204 - Resolves #2205 ## Proposed Changes Switches to my fork of `warp` which contains support for cors wildcards: https://github.com/paulhauner/warp/tree/cors-wildcard I have a PR open on the `warp` repo but it hasn't had any interest from the maintainers as of yet: https://github.com/seanmonstar/warp/pull/726. I think running from a fork is the best we can do for now. ## Additional Info NA	2021-02-18 22:33:12 +00:00
Paul Hauner	8e5c20b6d1	Update for clippy 1.50 (#2193 ) ## Issue Addressed NA ## Proposed Changes Rust 1.50 has landed 🎉 The shiny new `clippy` peers down upon us mere mortals with disgust. Brutish peasants wrapping our `usize`s in superfluous `Option`s... tsk tsk. I've performed the goat sacrifice and corrected our evil ways in this PR. Tonight we shall pray that Github Actions bestows the almighty green tick upon us. ## Additional Info NA Co-authored-by: realbigsean <seananderson33@gmail.com> Co-authored-by: Michael Sproul <michael@sigmaprime.io>	2021-02-15 00:09:12 +00:00
realbigsean	e20f64b21a	Update to tokio 1.1 (#2172 ) ## Issue Addressed resolves #2129 resolves #2099 addresses some of #1712 unblocks #2076 unblocks #2153 ## Proposed Changes - Updates all the dependencies mentioned in #2129, except for web3. They haven't merged their tokio 1.0 update because they are waiting on some dependencies of their own. Since we only use web3 in tests, I think updating it in a separate issue is fine. If they are able to merge soon though, I can update in this PR. - Updates `tokio_util` to 0.6.2 and `bytes` to 1.0.1. - We haven't made a discv5 release since merging tokio 1.0 updates so I'm using a commit rather than release atm. Edit: I think we should merge an update of `tokio_util` to 0.6.2 into discv5 before this release because it has panic fixes in `DelayQueue` --> PR in discv5: https://github.com/sigp/discv5/pull/58 ## Additional Info tokio 1.0 changes that required some changes in lighthouse: - `interval.next().await.is_some()` -> `interval.tick().await` - `sleep` future is now `!Unpin` -> https://github.com/tokio-rs/tokio/issues/3028 - `try_recv` has been temporarily removed from `mpsc` -> https://github.com/tokio-rs/tokio/issues/3350 - stream features have moved to `tokio-stream` and `broadcast::Receiver::into_stream()` has been temporarily removed -> https://github.com/tokio-rs/tokio/issues/2870 - I've copied over the `BroadcastStream` wrapper from this PR, but can update to use `tokio-stream` once it's merged https://github.com/tokio-rs/tokio/pull/3384 Co-authored-by: realbigsean <seananderson33@gmail.com>	2021-02-10 23:29:49 +00:00
Akihito Nakano	1a22a096c6	Fix clippy errors on tests (#2160 ) ## Issue Addressed There are some clippy errors in the tests. ## Proposed Changes Enable the clippy check on tests and fix the errors. 💪	2021-01-28 23:31:06 +00:00
Paul Hauner	d9f940613f	Represent slots in secs instead of millisecs (#2163 ) ## Issue Addressed NA ## Proposed Changes Copied from #2083, changes the config milliseconds_per_slot to seconds_per_slot to avoid errors when slot duration is not a multiple of a second. To avoid deserializing old serialized data (with milliseconds instead of seconds) the Serialize and Deserialize derive got removed from the Spec struct (isn't currently used anyway). This PR replaces #2083 for the purpose of fixing a merge conflict without requiring the input of @blacktemplar. ## Additional Info NA Co-authored-by: blacktemplar <blacktemplar@a1.net>	2021-01-19 09:39:51 +00:00
Akihito Nakano	a8d040c821	Fix timing issue in obtaining the Fork (#2158 ) ## Issue Addressed Related PR: https://github.com/sigp/lighthouse/pull/2137#issuecomment-754712492 The Fork is required for the VC to perform signing. Currently, it is not guaranteed that the Fork has been obtained by the time of signing, because the Fork is only obtained after ForkService starts. We will see the [error](`851a4dca3c/validator_client/src/validator_store.rs (L127)`) if the VC cannot perform the signing due to this timing issue. > Unable to get Fork for signing ## Proposed Changes Obtain the Fork on `init_from_beacon_node` to fix the timing issue.	2021-01-19 02:54:18 +00:00
realbigsean	7a71977987	Clippy 1.49.0 updates and dht persistence test fix (#2156 ) ## Issue Addressed `test_dht_persistence` failing ## Proposed Changes Bind `NetworkService::start` to an underscore-prefixed variable rather than `_`. `_` was causing it to be dropped immediately. This was failing 5/100 times before this update, but I haven't been able to get it to fail after updating it. Co-authored-by: realbigsean <seananderson33@gmail.com>	2021-01-19 00:34:28 +00:00
Arthur Woimbée	851a4dca3c	replace tempdir by tempfile (#2143 ) ## Issue Addressed Fixes #2141 Remove [tempdir](https://docs.rs/tempdir/0.3.7/tempdir/) in favor of [tempfile](https://docs.rs/tempfile/3.1.0/tempfile/). ## Proposed Changes `tempfile` has a slightly different api that makes creating temp folders with a name prefix a chore (`tempdir::TempDir::new("toto")` => `tempfile::Builder::new().prefix("toto").tempdir()`). So I removed temp folder name prefix where I deemed it not useful. Otherwise, the functionality is the same.	2021-01-06 06:36:11 +00:00
Paul Hauner	c2eac8e5bd	Remove duplicate log in BN fallback (#2116 ) ## Issue Addressed NA ## Proposed Changes - Removes a duplicated log in the fallback code for the VC. - Updates the text in the remaining de-duped log. ## Additional Info Example ``` Dec 23 05:19:54.003 WARN Beacon node is syncing endpoint: http://xxxx:5052/, head_slot: 88224, sync_distance: 161774 Dec 23 05:19:54.003 WARN Beacon node is not synced endpoint: http://xxxxx:5052/ ```	2021-01-06 03:01:48 +00:00
Pawan Dhananjay	32a60578fe	Remove default beacon node value from clap (#2121 ) ## Issue Addressed Fixes #2118 ## Proposed Changes Removes the default value in clap for `--beacon-nodes`. This was causing issues with cli picking `--beacon-nodes` default even when not specified and overriding `--beacon-node`. Seems like it was more evident with docker setups because it doesn't use the default `http://localhost:5052` option. Edit: we already set the default to `http://localhost:5052` here so this shouldn't break any existing setups. `9ed65a64f8/validator_client/src/config.rs (L58)` ## Additional info Tested this with docker-compose and binaries. Works as expected in both cases.	2020-12-28 08:23:59 +00:00
Paul Hauner	a62dc65ca4	BN Fallback v2 (#2080 ) ## Issue Addressed - Resolves #1883 ## Proposed Changes This follows on from @blacktemplar's work in #2018. - Allows the VC to connect to multiple BN for redundancy. - Update the simulator so some nodes always need to rely on their fallback. - Adds some extra deprecation warnings for `--eth1-endpoint` - Pass `SignatureBytes` as a reference instead of by value. ## Additional Info NA Co-authored-by: blacktemplar <blacktemplar@a1.net>	2020-12-18 09:17:03 +00:00
blacktemplar	701843aaa0	Update dependencies (#2084 ) ## Issue Addressed Partially addresses dependencies mentioned in issue #1712. ## Proposed Changes Updates dependencies (including an update avoiding a vulnerability) + add tokio compatibility to `remote_signer_test`	2020-12-14 02:28:19 +00:00
Michael Sproul	82753f842d	Improve compile time (#1989 ) ## Issue Addressed Closes #1264 ## Proposed Changes * Milagro BLS: tweak the feature flags so that Milagro doesn't get compiled if we're using BLST. Profiling showed that it was consuming about 1 minute of CPU time out of 60 minutes of CPU time (real time ~15 mins). A 1.6% saving. * Reduce monomorphization: compiling for 3 different `EthSpec` types causes a heck of a lot of generic functions to be instantiated (monomorphized). Removing 2 of 3 cuts the LLVM+linking step from around 250 seconds to 180 seconds, a saving of 70 seconds (real time!). This applies only to `make` and not the CI build, because we test with the minimal spec on CI. * Update `web3` crate to v0.13. This is perhaps the most controversial change, because it requires axing some deposit contract tools from `lcli`. I suspect these tools weren't used much anyway, and could be maintained separately, but I'm also happy to revert this change. However, it does save us a lot of compile time. With #1839, we now have 3 versions of Tokio (and all of Tokio's deps). This change brings us down to 2 versions, but 1 should be achievable once web3 (and reqwest) move to Tokio 0.3. * Remove `lcli` from the Docker image. It's a dev tool and can be built from the repo if required.	2020-12-09 01:34:58 +00:00
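
The `test_dht_persistence` fix in 7a71977987 depends on a Rust binding rule: `let _ = expr;` does not bind the value, so it is dropped at the end of that statement, while `let _name = expr;` keeps it alive until the end of the enclosing scope. The sketch below illustrates only that rule. `Service`, `start` and `drop_order` are hypothetical stand-ins and are not taken from the Lighthouse code base.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Hypothetical stand-in for a handle that shuts its service down when dropped.
struct Service {
    name: &'static str,
    log: Rc<RefCell<Vec<String>>>,
}

impl Drop for Service {
    fn drop(&mut self) {
        // Record the moment the handle is destroyed.
        self.log.borrow_mut().push(format!("drop {}", self.name));
    }
}

/// Hypothetical constructor, playing the role of a `start`-style function.
fn start(name: &'static str, log: &Rc<RefCell<Vec<String>>>) -> Service {
    Service { name, log: Rc::clone(log) }
}

/// Returns the events recorded while holding two handles in different ways.
fn drop_order() -> Vec<String> {
    let log = Rc::new(RefCell::new(Vec::new()));
    {
        // `_` is a wildcard pattern, not a binding: the handle is dropped
        // at the end of this statement.
        let _ = start("wildcard", &log);
        log.borrow_mut().push("after wildcard".to_string());

        // `_service` is a real binding: the handle lives until the scope ends.
        let _service = start("named", &log);
        log.borrow_mut().push("after named".to_string());
    }
    let events = log.borrow().clone();
    events
}

fn main() {
    for event in drop_order() {
        println!("{event}");
    }
}
```

The wildcard handle is dropped before "after wildcard" is recorded, whereas the named handle outlives "after named", which is the behaviour the commit relies on to keep the service alive for the duration of the test.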

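
The pruning schedule described in afd4786c59 (prune at start-up, then every 512 epochs, dropping records older than 512 epochs, and only start once the first third of a slot has passed) can be sketched as plain arithmetic. This is a minimal illustration under stated assumptions, not the validator client's implementation: the function and constant names are hypothetical, and the 12-second slot in the usage line is an assumed value that does not appear in the commit message.

```rust
use std::time::Duration;

/// Interval between pruning runs, in epochs (figure taken from the commit message).
const PRUNE_INTERVAL_EPOCHS: u64 = 512;
/// How many epochs of history to retain (figure taken from the commit message).
const HISTORY_EPOCHS: u64 = 512;

/// Hypothetical helper: pruning is due at start-up (no previous run recorded)
/// or once a full interval has elapsed since the last run.
fn is_prune_due(current_epoch: u64, last_pruned_epoch: Option<u64>) -> bool {
    match last_pruned_epoch {
        None => true,
        Some(last) => current_epoch.saturating_sub(last) >= PRUNE_INTERVAL_EPOCHS,
    }
}

/// Hypothetical helper: records from epochs before this cutoff may be dropped.
/// Saturating subtraction keeps the cutoff at 0 early in the chain's life.
fn prune_cutoff_epoch(current_epoch: u64) -> u64 {
    current_epoch.saturating_sub(HISTORY_EPOCHS)
}

/// Hypothetical helper: offset from the slot start at which the last two
/// thirds of the slot begin.
fn prune_start_offset(slot_duration: Duration) -> Duration {
    slot_duration / 3
}

fn main() {
    // Assumed 12-second slot, for illustration only.
    let slot = Duration::from_secs(12);
    println!("due at start-up: {}", is_prune_due(10, None));
    println!("cutoff at epoch 1000: {}", prune_cutoff_epoch(1000));
    println!("start offset: {:?}", prune_start_offset(slot));
}
```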
365 Commits