lighthouse

Author	SHA1	Message	Date
Pawan Dhananjay	6e6e9104f5	Prevent adding duplicate validators to validator_definitions.yml (#2166 ) ## Issue Addressed N/A ## Proposed Changes This is mostly a UX improvement. Currently, when recursively finding keystores, we only ignore keystores with the same path. This leads to potential issues while copying datadirs (e.g. copying a datadir to a new ssd with more storage). After copying the new datadir and starting the vc, we will discover the copied keystores as new keystores and add them to the definitions file, leading to duplicate entries. This PR avoids duplicate keystores being discovered as new keystores by checking for duplicate pubkeys as well.	2021-02-15 06:09:51 +00:00
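The duplicate check described in the commit above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the `Definition` struct and `add_new_definitions` function are hypothetical stand-ins, and pubkeys are modelled as plain strings.

```rust
use std::collections::HashSet;
use std::path::PathBuf;

/// Hypothetical, simplified stand-in for one entry in `validator_definitions.yml`.
#[derive(Debug, Clone, PartialEq)]
struct Definition {
    pubkey: String,
    keystore_path: PathBuf,
}

/// Appends only those discovered keystores whose path and pubkey are both
/// unknown. Returns the number of definitions that were added.
fn add_new_definitions(existing: &mut Vec<Definition>, discovered: Vec<Definition>) -> usize {
    let mut known_paths: HashSet<PathBuf> =
        existing.iter().map(|d| d.keystore_path.clone()).collect();
    let mut known_pubkeys: HashSet<String> =
        existing.iter().map(|d| d.pubkey.clone()).collect();

    let mut added = 0;
    for def in discovered {
        // Checking the path alone misses a keystore copied to a new datadir;
        // checking the pubkey as well catches it.
        if known_paths.contains(&def.keystore_path) || known_pubkeys.contains(&def.pubkey) {
            continue;
        }
        known_paths.insert(def.keystore_path.clone());
        known_pubkeys.insert(def.pubkey.clone());
        existing.push(def);
        added += 1;
    }
    added
}

fn main() {
    let mut defs = vec![Definition {
        pubkey: "0xaa".to_string(),
        keystore_path: PathBuf::from("/old/validators/0xaa/keystore.json"),
    }];
    let discovered = vec![
        // Same key, copied to a new location: skipped.
        Definition {
            pubkey: "0xaa".to_string(),
            keystore_path: PathBuf::from("/new/validators/0xaa/keystore.json"),
        },
        // Genuinely new key: added.
        Definition {
            pubkey: "0xbb".to_string(),
            keystore_path: PathBuf::from("/new/validators/0xbb/keystore.json"),
        },
    ];
    let added = add_new_definitions(&mut defs, discovered);
    println!("added {} definition(s), total {}", added, defs.len());
}
```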
Paul Hauner	8e5c20b6d1	Update for clippy 1.50 (#2193 ) ## Issue Addressed NA ## Proposed Changes Rust 1.50 has landed 🎉 The shiny new `clippy` peers down upon us mere mortals with disgust. Brutish peasants wrapping our `usize`s in superfluous `Option`s... tsk tsk. I've performed the goat sacrifice and corrected our evil ways in this PR. Tonight we shall pray that Github Actions bestows the almighty green tick upon us. ## Additional Info NA Co-authored-by: realbigsean <seananderson33@gmail.com> Co-authored-by: Michael Sproul <michael@sigmaprime.io>	2021-02-15 00:09:12 +00:00
Michael Sproul	e2ff9c66a1	Remove links to old master branch (#2190 ) ## Proposed Changes In preparation for deleting the `master` branch, remove all links to it from the book/README.	2021-02-11 06:06:54 +00:00
realbigsean	e20f64b21a	Update to tokio 1.1 (#2172 ) ## Issue Addressed resolves #2129 resolves #2099 addresses some of #1712 unblocks #2076 unblocks #2153 ## Proposed Changes - Updates all the dependencies mentioned in #2129, except for web3. They haven't merged their tokio 1.0 update because they are waiting on some dependencies of their own. Since we only use web3 in tests, I think updating it in a separate issue is fine. If they are able to merge soon though, I can update in this PR. - Updates `tokio_util` to 0.6.2 and `bytes` to 1.0.1. - We haven't made a discv5 release since merging tokio 1.0 updates so I'm using a commit rather than release atm. Edit: I think we should merge an update of `tokio_util` to 0.6.2 into discv5 before this release because it has panic fixes in `DelayQueue` --> PR in discv5: https://github.com/sigp/discv5/pull/58 ## Additional Info tokio 1.0 changes that required some changes in lighthouse: - `interval.next().await.is_some()` -> `interval.tick().await` - `sleep` future is now `!Unpin` -> https://github.com/tokio-rs/tokio/issues/3028 - `try_recv` has been temporarily removed from `mpsc` -> https://github.com/tokio-rs/tokio/issues/3350 - stream features have moved to `tokio-stream` and `broadcast::Receiver::into_stream()` has been temporarily removed -> https://github.com/tokio-rs/tokio/issues/2870 - I've copied over the `BroadcastStream` wrapper from this PR, but can update to use `tokio-stream` once it's merged https://github.com/tokio-rs/tokio/pull/3384 Co-authored-by: realbigsean <seananderson33@gmail.com>	2021-02-10 23:29:49 +00:00
Michael Sproul	6f4da9a5d2	Check that pull requests target unstable (#2187 ) Attempt to prevent accidental merges to `stable` due to GitHub's default behaviour of opening PRs against it. I've intentionally opened this PR against `stable` to test the functionality ;)	2021-02-09 02:00:53 +00:00
Paul Hauner	7c059117f4	Avoid resizing attn signature sets vec (#2184 ) ## Issue Addressed NA ## Proposed Changes Reduces allocations by initializing the `pubkeys` vec to its final size. I doubt this will make a substantial difference, but it's nice to do it this way. Seeing as `indexed_attestation.attesting_indices` has a [fixed length](`e4b62139d7/consensus/types/src/indexed_attestation.rs (L22)`), there's no real risk of a memory blow-up by pre-allocating the size of the `Vec`. ## Additional Info NA	2021-02-09 02:00:51 +00:00
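The pre-allocation pattern from the commit above might look like the following sketch. The `pubkey_for_index` lookup and `collect_pubkeys` function are hypothetical; only `Vec::with_capacity` is the point being illustrated.

```rust
/// Hypothetical lookup standing in for a validator-index -> pubkey map.
fn pubkey_for_index(index: u64) -> Option<String> {
    if index < 1_000 {
        Some(format!("pubkey_{}", index))
    } else {
        None
    }
}

/// Collects one pubkey per attesting index into a `Vec` that is sized once,
/// up front, so pushing never has to grow the allocation.
fn collect_pubkeys(attesting_indices: &[u64]) -> Option<Vec<String>> {
    let mut pubkeys = Vec::with_capacity(attesting_indices.len());
    for &index in attesting_indices {
        pubkeys.push(pubkey_for_index(index)?);
    }
    Some(pubkeys)
}

fn main() {
    let indices = [3, 7, 42];
    let pubkeys = collect_pubkeys(&indices).expect("all indices are known");
    println!("{} pubkeys, capacity {}", pubkeys.len(), pubkeys.capacity());
}
```

Pre-allocating is only safe when the input length is bounded, which is why the commit message points out that `attesting_indices` has a fixed maximum length.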
Paul Hauner	194609d210	Ignore vulnerability in hyper (#2188 ) ## Issue Addressed NA ## Proposed Changes Ignores a [hyper vuln](https://rustsec.org/advisories/RUSTSEC-2021-0020) that will be fixed in #2172. I am comfortable with ignoring this because we have a fix in the works and the impact of the vuln is low to negligible. ## Additional Info NA	2021-02-08 23:41:22 +00:00
Paul Hauner	e383ef3e91	Avoid temp allocations with slog (#2183 ) ## Issue Addressed Which issue # does this PR address? ## Proposed Changes Replaces use of `format!` in `slog` logging with its special no-allocation `?` and `%` shortcuts. According to a `heaptrack` analysis today over a period of about an hour, this will reduce temporary allocations by at least 4%. ## Additional Info NA	2021-02-04 07:31:47 +00:00
Paul Hauner	ff35fbb121	Add metrics for beacon block propagation (#2173 ) ## Issue Addressed NA ## Proposed Changes Adds some metrics to track delays regarding: - LH processing of blocks - delays receiving blocks from other nodes. ## Additional Info NA	2021-02-04 05:33:56 +00:00
Guillaume Ballet	de193c95d3	fix a couple typos in comments in merkle_hasher (#2171 ) Found what I believe to be a couple typos in the comments as I was going through the merkleization code.	2021-02-03 04:52:22 +00:00
Pawan Dhananjay	420c2d28f8	Fix simulator failed runs (#2181 ) ## Issue Addressed N/A ## Proposed Changes Another attempt at fixing simulator issues for `eth1-sim`. The `LocalValidatorClient` here blocks till genesis has occurred. `e4b62139d7/testing/simulator/src/local_network.rs (L145-L150)` Due to this, only the first validator (validator_0) starts before genesis. The remaining 3 vc's in the simulation start only after genesis. This was probably causing issues with missing the duties and eventually the proposal for slot 1. This PR spawns each `LocalValidatorClient` in its own tokio task to allow the remaining validators to start before genesis. ## Additional Info Please provide any additional information. For example, future considerations or information useful for reviewers.	2021-02-01 03:31:12 +00:00
Akihito Nakano	1a22a096c6	Fix clippy errors on tests (#2160 ) ## Issue Addressed There are some clippy errors on tests. ## Proposed Changes Enable clippy check on tests and fix the errors. 💪	2021-01-28 23:31:06 +00:00
Paul Hauner	e4b62139d7	v1.1.0 (#2168 ) ## Issue Addressed NA ## Proposed Changes - Bump version - ~~Run `cargo update`~~ ## Additional Info NA	2021-01-21 02:37:08 +00:00
Paul Hauner	2b2a358522	Detailed validator monitoring (#2151 ) ## Issue Addressed - Resolves #2064 ## Proposed Changes Adds a `ValidatorMonitor` struct which provides additional logging and Grafana metrics for specific validators. Use `lighthouse bn --validator-monitor` to automatically enable monitoring for any validator that hits the [subnet subscription](https://ethereum.github.io/eth2.0-APIs/#/Validator/prepareBeaconCommitteeSubnet) HTTP API endpoint. Also, use `lighthouse bn --validator-monitor-pubkeys` to supply a list of validators which will always be monitored. See the new docs included in this PR for more info. ## TODO - [x] Track validator balance, `slashed` status, etc. - [x] ~~Register slashings in current epoch, not offense epoch~~ - [ ] Publish Grafana dashboard, update TODO link in docs - [x] ~~#2130 is merged into this branch, resolve that~~	2021-01-20 19:19:38 +00:00
Paul Hauner	1eb0915301	Fix bug from #2163 (#2165 ) ## Issue Addressed NA ## Proposed Changes Fixes a bug that I missed during a review in #2163. I found this bug by observing that nodes were receiving far fewer attestations (~1/2 of previous). I'm not certain exactly how this mistake manifested in a reduction in attestations, but the mistake touches so much code that I think it's reasonable to declare that this is the cause of the observed issue (drop in attestations). ## Additional Info NA	2021-01-20 10:28:12 +00:00
Paul Hauner	b06559ae97	Disallow attestation production earlier than head (#2130 ) ## Issue Addressed The non-finality period on Pyrmont between epochs [`9114`](https://pyrmont.beaconcha.in/epoch/9114) and [`9182`](https://pyrmont.beaconcha.in/epoch/9182) was contributed to by all the `lighthouse_team` validators going down. The nodes saw excessive CPU and RAM usage, causing the system to kill the `lighthouse bn` process. The `Restart=on-failure` directive for `systemd` caused the process to bounce in ~10-30m intervals. Diagnosis with `heaptrack` showed that the `BeaconChain::produce_unaggregated_attestation` function was calling `store::beacon_state::get_full_state` and sometimes resulting in a tree hash cache allocation. These allocations were approximately the size of the host's physical memory and still allocated when `lighthouse bn` was killed by the OS. There was no CPU analysis (e.g., `perf`), but `BeaconChain::produce_unaggregated_attestation` is very CPU-heavy so it is reasonable to assume it is the cause of the excessive CPU usage, too. ## Proposed Changes `BeaconChain::produce_unaggregated_attestation` has two paths: 1. Fast path: attesting to the head slot or later. 2. Slow path: attesting to a slot earlier than the head block. Path (2) is the only path that calls `store::beacon_state::get_full_state`, therefore it is the path causing this excessive CPU/RAM usage. This PR removes the current functionality of path (2) and replaces it with a static error (`BeaconChainError::AttestingPriorToHead`). This change reduces the generality of `BeaconChain::produce_unaggregated_attestation` (and therefore [`/eth/v1/validator/attestation_data`](https://ethereum.github.io/eth2.0-APIs/#/Validator/produceAttestationData)), but I argue that this functionality is an edge-case and arguably a violation of the [Honest Validator spec](https://github.com/ethereum/eth2.0-specs/blob/dev/specs/phase0/validator.md).
It's possible that a validator goes back to a prior slot to "catch up" and submit some missed attestations. This change would prevent such behaviour, returning an error. My concerns with this catch-up behaviour are that it: - Is not specified as "honest validator" attesting behaviour. - Is risky for slashing (although, all validator clients should have slashing protection and will eventually fail if they do not). - Disguises clock-sync issues between a BN and VC. ## Additional Info It's likely feasible to implement path (2) if we implement some sort of caching mechanism. This would be a multi-week task and this PR gets the issue patched in the short term. I haven't created an issue to add path (2), instead I think we should implement it if we get user-demand.	2021-01-20 06:52:37 +00:00
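The two-path split described in the commit above can be illustrated with a small sketch. Only the error name `AttestingPriorToHead` comes from the commit message; the enum's fields, the `check_attestation_slot` function and the use of bare `u64` slots are assumptions made for illustration.

```rust
/// Hypothetical error type; the variant name follows the commit message,
/// the fields are an assumption.
#[derive(Debug, PartialEq)]
enum AttestationError {
    AttestingPriorToHead { head_slot: u64, request_slot: u64 },
}

/// Accepts requests for the head slot or later (the fast path) and rejects
/// requests for earlier slots with a static error instead of loading a full
/// historical state (the former slow path).
fn check_attestation_slot(head_slot: u64, request_slot: u64) -> Result<(), AttestationError> {
    if request_slot >= head_slot {
        Ok(())
    } else {
        Err(AttestationError::AttestingPriorToHead {
            head_slot,
            request_slot,
        })
    }
}

fn main() {
    println!("{:?}", check_attestation_slot(100, 101));
    println!("{:?}", check_attestation_slot(100, 99));
}
```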
Paul Hauner	d9f940613f	Represent slots in secs instead of millisecs (#2163 ) ## Issue Addressed NA ## Proposed Changes Copied from #2083, changes the config milliseconds_per_slot to seconds_per_slot to avoid errors when slot duration is not a multiple of a second. To avoid deserializing old serialized data (with milliseconds instead of seconds) the Serialize and Deserialize derive got removed from the Spec struct (isn't currently used anyway). This PR replaces #2083 for the purpose of fixing a merge conflict without requiring the input of @blacktemplar. ## Additional Info NA Co-authored-by: blacktemplar <blacktemplar@a1.net>	2021-01-19 09:39:51 +00:00
Paul Hauner	46cb6e204c	Add lcli command to replace state pubkeys (#1999 ) ## Issue Addressed NA ## Proposed Changes Adds a command to replace all the pubkeys in a state with one generated from a mnemonic. ## Additional Info This is not production code, it's only for testing.	2021-01-19 08:42:30 +00:00
Paul Hauner	805e152f66	Simplify enum -> str with strum (#2164 ) ## Issue Addressed NA ## Proposed Changes As per #2100, uses derives from the strum library to implement AsRef<str> and AsStaticRef to easily get str values from enums without creating new Strings. Furthermore, unifies all attestation error counters into one IntCounterVec vector. This work is originally by @blacktemplar, I've just created this PR so I can resolve some merge conflicts. ## Additional Info NA Co-authored-by: blacktemplar <blacktemplar@a1.net>	2021-01-19 06:33:58 +00:00
Paul Hauner	8892114f52	Modify proto array loop (#2154 ) ## Issue Addressed NA ## Proposed Changes As discussed with @protolambda, add an additional loop inside proto_array to ensure weights are coherent. ## Additional Info NA	2021-01-19 03:50:12 +00:00
realbigsean	51f7724c76	Automate docker version tag (#2150 ) ## Issue Addressed N/A ## Proposed Changes On any tag formatted `v*`, a full multi-arch docker build will be kicked off and automatically pushed to docker hub with the version tag. This is a bit repetitive, because the image built will usually be the same as the image built on pushes to `stable`, but it seems like the simplest way to go about it and this will also work if we incorporate a workflow with `vX.X.X-rc` tags. ## Additional Info This may also need to wait for env variable updates: https://github.com/sigp/lighthouse/pull/2135#issuecomment-754977433 Co-authored-by: realbigsean <seananderson33@gmail.com>	2021-01-19 03:50:10 +00:00
Taneli Hukkinen	9cdfa94ba4	Update docs: Change `--beacon-node` to `--beacon-nodes` (#2145 ) ## Issue Addressed The docs use the deprecated `--beacon-node` flag ## Proposed Changes Reference the new `--beacon-nodes` flag in docs	2021-01-19 03:50:08 +00:00
Akihito Nakano	3d07934ca0	Fix: `end_slot` returns incorrect value (#2138 ) ## Issue Addressed `Epoch::end_slot()` returns incorrect value when the epoch is the last epoch which can be represented by u64. ```rust let slots_per_epoch = 32; // The last epoch which can be represented by u64. let epoch = Epoch::new(u64::max_value() / slots_per_epoch); println!("{}", epoch.end_slot(slots_per_epoch)); // Slot(18446744073709551614) // -> correctly, the result should be `Slot(18446744073709551615)`. ```	2021-01-19 03:50:06 +00:00
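One way such an off-by-one can arise is sketched below, using bare `u64` values and saturating arithmetic. Both functions are hypothetical and are not claimed to match the actual `Epoch::end_slot` implementation before or after the fix; they only show why going through "first slot of the next epoch" loses a slot at the top of the `u64` range.

```rust
/// Overflow-prone form: computes the first slot of the next epoch, then
/// subtracts one. For the last representable epoch the multiplication
/// saturates at `u64::MAX`, so the result comes out one slot too low.
fn end_slot_via_next_epoch(epoch: u64, slots_per_epoch: u64) -> u64 {
    epoch
        .saturating_add(1)
        .saturating_mul(slots_per_epoch)
        .saturating_sub(1)
}

/// Direct form: first slot of this epoch plus `slots_per_epoch - 1`, which
/// never needs an intermediate value beyond the final result.
fn end_slot_direct(epoch: u64, slots_per_epoch: u64) -> u64 {
    epoch
        .saturating_mul(slots_per_epoch)
        .saturating_add(slots_per_epoch.saturating_sub(1))
}

fn main() {
    let slots_per_epoch = 32;
    // The last epoch which can be represented by u64.
    let epoch = u64::MAX / slots_per_epoch;
    println!("via next epoch: {}", end_slot_via_next_epoch(epoch, slots_per_epoch));
    println!("direct:         {}", end_slot_direct(epoch, slots_per_epoch));
}
```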
Akihito Nakano	a8d040c821	Fix timing issue in obtaining the Fork (#2158 ) ## Issue Addressed Related PR: https://github.com/sigp/lighthouse/pull/2137#issuecomment-754712492 The Fork is required for VC to perform signing. Currently, it is not guaranteed that the Fork has been obtained at the point of signing, as the Fork is obtained after ForkService starts. We will see the [error](`851a4dca3c/validator_client/src/validator_store.rs (L127)`) if VC could not perform the signing due to the timing issue. > Unable to get Fork for signing ## Proposed Changes Obtain the Fork on `init_from_beacon_node` to fix the timing issue.	2021-01-19 02:54:18 +00:00
realbigsean	908c8eadf3	remove protected environment (#2135 ) ## Issue Addressed N/A ## Proposed Changes Remove Github Action environments ## Additional Info N/A Co-authored-by: realbigsean <seananderson33@gmail.com>	2021-01-19 01:29:06 +00:00
realbigsean	7a71977987	Clippy 1.49.0 updates and dht persistence test fix (#2156 ) ## Issue Addressed `test_dht_persistence` failing ## Proposed Changes Bind `NetworkService::start` to an underscore-prefixed variable rather than `_`. `_` was causing it to be dropped immediately. This was failing 5/100 times before this update, but I haven't been able to get it to fail after updating it. Co-authored-by: realbigsean <seananderson33@gmail.com>	2021-01-19 00:34:28 +00:00
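The difference between `_` and an underscore-prefixed binding can be shown with a small standalone sketch. The `Guard` type and `drop_order` function are hypothetical and unrelated to `NetworkService`; they only record when `Drop` runs.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Hypothetical guard that records its own drop in a shared event log.
struct Guard {
    name: &'static str,
    log: Rc<RefCell<Vec<String>>>,
}

impl Drop for Guard {
    fn drop(&mut self) {
        self.log.borrow_mut().push(format!("drop {}", self.name));
    }
}

/// Returns the order of events when one guard is bound to `_` and another to
/// an underscore-prefixed name.
fn drop_order() -> Vec<String> {
    let log: Rc<RefCell<Vec<String>>> = Rc::new(RefCell::new(Vec::new()));
    {
        // `_` is a pattern that binds nothing, so the value is dropped at the
        // end of this statement.
        let _ = Guard { name: "wildcard", log: Rc::clone(&log) };
        log.borrow_mut().push("after wildcard".to_string());

        // `_kept` is a real binding, so the value lives until the end of the scope.
        let _kept = Guard { name: "named", log: Rc::clone(&log) };
        log.borrow_mut().push("after named".to_string());
    }
    let events = log.borrow().clone();
    events
}

fn main() {
    for event in drop_order() {
        println!("{}", event);
    }
}
```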
Akihito Nakano	e5b1a37110	[simulator] Fix race condition when creating LocalBeaconNode (#2137 ) ## Issue Addressed We have a race condition when counting the number of beacon nodes. The user could end up seeing a duplicated service name (node_N). ## Proposed Changes I have updated to acquire write lock before counting the number of beacon nodes.	2021-01-14 00:04:18 +00:00
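The fix described above, taking the write lock before counting, can be sketched with the standard library alone. The `add_node` function and the `Vec<String>` of node names are hypothetical simplifications of the simulator's state.

```rust
use std::sync::{Arc, RwLock};
use std::thread;

/// Registers a new node and returns its service name. The count and the push
/// happen under the same write lock, so two callers can never observe the
/// same count and produce the same `node_N` name.
fn add_node(nodes: &RwLock<Vec<String>>) -> String {
    let mut guard = nodes.write().expect("lock should not be poisoned");
    let name = format!("node_{}", guard.len());
    guard.push(name.clone());
    name
}

fn main() {
    let nodes: Arc<RwLock<Vec<String>>> = Arc::new(RwLock::new(Vec::new()));
    let handles: Vec<thread::JoinHandle<String>> = (0..4)
        .map(|_| {
            let nodes = Arc::clone(&nodes);
            thread::spawn(move || add_node(&nodes))
        })
        .collect();
    for handle in handles {
        println!("{}", handle.join().expect("thread should not panic"));
    }
}
```

Counting under a read lock and then taking a separate write lock to insert would reopen the window in which two threads see the same count.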
Pawan Dhananjay	28238d97b1	Disconnect from peers quicker on internet issues (#2147 ) ## Issue Addressed Fixes #2146 ## Proposed Changes Change ping timeout errors to return `LowToleranceErrors` so that we disconnect faster on internet failures/changes.	2021-01-13 08:09:10 +00:00
realbigsean	14df5d5c32	Use cross in linux x86 64 release flow (#2136 ) ## Issue Addressed Resolves #2120 ## Proposed Changes This updates github actions to use `cross` when compiling linux x86_64 binaries. ## Additional Info I think we could alternatively be explicit with the version of macOS or ubuntu we are running actions on and that could solve #2120. I'm not sure which method is preferred here though. Github actions supports Ubuntu 16.04 Co-authored-by: realbigsean <seananderson33@gmail.com>	2021-01-12 06:38:22 +00:00
Paul Hauner	1d535659d6	Add docs about redundancy (#2142 ) ## Issue Addressed - Resolves #2140 ## Proposed Changes Adds some documentation on the topic of "redundancy". ## Additional Info NA	2021-01-12 00:26:22 +00:00
realbigsean	423dea169c	update smallvec (#2152 ) ## Issue Addressed `cargo audit` is failing because of a potential for an overflow in the version of `smallvec` we're using ## Proposed Changes Update to the latest version of `smallvec`, which has the fix Co-authored-by: realbigsean <seananderson33@gmail.com>	2021-01-11 23:32:11 +00:00
Arthur Woimbée	851a4dca3c	replace tempdir by tempfile (#2143 ) ## Issue Addressed Fixes #2141 Remove [tempdir](https://docs.rs/tempdir/0.3.7/tempdir/) in favor of [tempfile](https://docs.rs/tempfile/3.1.0/tempfile/). ## Proposed Changes `tempfile` has a slightly different api that makes creating temp folders with a name prefix a chore (`tempdir::TempDir::new("toto")` => `tempfile::Builder::new().prefix("toto").tempdir()`). So I removed temp folder name prefix where I deemed it not useful. Otherwise, the functionality is the same.	2021-01-06 06:36:11 +00:00
Age Manning	7e4b190df0	Reduce ping interval (#2132 ) ## Issue Addressed #2123 ## Description Reduces the TCP ping interval to increase our responsiveness to peer liveness changes.	2021-01-06 04:35:52 +00:00
Paul Hauner	c2eac8e5bd	Remove duplicate log in BN fallback (#2116 ) ## Issue Addressed NA ## Proposed Changes - Removes a duplicated log in the fallback code for the VC. - Updates the text in the remaining de-duped log. ## Additional Info Example ``` Dec 23 05:19:54.003 WARN Beacon node is syncing endpoint: http://xxxx:5052/, head_slot: 88224, sync_distance: 161774 Dec 23 05:19:54.003 WARN Beacon node is not synced endpoint: http://xxxxx:5052/ ```	2021-01-06 03:01:48 +00:00
realbigsean	588b90157d	Ssz state api endpoint (#2111 ) ## Issue Addressed Catching up to a recently merged API spec PR: https://github.com/ethereum/eth2.0-APIs/pull/119 ## Proposed Changes - Return an SSZ beacon state on `/eth/v1/debug/beacon/states/{stateId}` when passed this header: `accept: application/octet-stream`. - Requests to this endpoint with no `accept` header, or with an `accept` header whose value is `application/json` or `/`, will result in a JSON response ## Additional Info Co-authored-by: realbigsean <seananderson33@gmail.com>	2021-01-06 03:01:46 +00:00
Samuel E. Moelius	939fa717fd	`test_decode_malicious_status_message` improvements (#2104 ) ## Issue Addressed None ## Proposed Changes * Correct typo in one comment, elaborate some others. * Add asserts to ensure comments match code. * Eliminate one unnecessary `clone`. ## Additional Info None	2021-01-06 01:10:26 +00:00
Samuel E. Moelius	0245ddd37b	Fix typo in `ssz_snappy.rs` comment (#2103 ) ## Issue Addressed None ## Proposed Changes Correct a typo in `ssz_snappy.rs`. ## Additional Info Pedantry at its finest.	2021-01-06 01:10:24 +00:00
Paul Hauner	f183af20e3	Version v1.0.6 (#2126 ) ## Issue Addressed NA ## Proposed Changes - Bump versions - Run `cargo update` ## Additional Info NA	2020-12-28 23:38:02 +00:00
Pawan Dhananjay	32a60578fe	Remove default beacon node value from clap (#2121 ) ## Issue Addressed Fixes #2118 ## Proposed Changes Removes the default value in clap for `--beacon-nodes`. This was causing issues with cli picking `--beacon-nodes` default even when not specified and overriding `--beacon-node`. Seems like it was more evident with docker setups because it doesn't use the default `http://localhost:5052` option. Edit: we already set the default to `http://localhost:5052` here so this shouldn't break any existing setups. `9ed65a64f8/validator_client/src/config.rs (L58)` ## Additional info Tested this with docker-compose and binaries. Works as expected in both cases.	2020-12-28 08:23:59 +00:00
Michael Sproul	43ac3f7209	Fix slasher database schema migration to v2 (#2125 ) ## Issue Addressed Closes #2119 ## Proposed Changes Update the slasher schema version to v2 for the breaking changes to the config introduced in #2079. Implement a migration from v1 to v2 so that users can seamlessly upgrade from any version of Lighthouse <=1.0.5. Users who deleted their database for v1.0.5 can upgrade to a release including this patch without any manual intervention. Similarly, any users still on v1.0.4 or earlier can now upgrade without having to drop their database.	2020-12-28 05:09:19 +00:00
Akihito Nakano	78d17c3255	Tweak error messages for ease of investigation (#2122 ) ## Proposed Changes <!-- Please list or describe the changes introduced by this PR. --> Tweaked the error message for ease of investigation as `Failed to update eth1 cache` is used in multiple places. 😃	2020-12-28 01:25:33 +00:00
Paul Hauner	9ed65a64f8	Version v1.0.5 (#2117 ) ## Issue Addressed NA ## Proposed Changes - Bump versions to `v1.0.5` - Run `cargo update` ## Additional Info NA	2020-12-23 18:52:48 +00:00
Michael Sproul	c5f03f7d56	Tidy slasher logs for known slashings (#2108 ) ## Proposed Changes This quiets the slasher logs when ingesting slashings that are already known. Previously we would log an `ERRO` when a slashing was rediscovered locally but had already been submitted on-chain. This is to be expected from time to time, as different users' slashers will run at different times, and it's likely that slashings will make it on-chain before all users have detected them locally.	2020-12-23 07:53:38 +00:00
Age Manning	2931b05582	Update libp2p (#2101 ) This is a little bit of a tip-of-the-iceberg PR. It houses a lot of code changes in the libp2p dependency. This needs a bit of thorough testing before merging. The primary code changes are: - General libp2p dependency update - Gossipsub refactor to shift compression into gossipsub providing performance improvements and improved API for handling compression Co-authored-by: Paul Hauner <paul@paulhauner.com>	2020-12-23 07:53:36 +00:00
realbigsean	b5e81eb6b2	add automated release workflow (#2077 ) ## Issue Addressed Resolves #1674 ## Proposed Changes - Whenever a tag is pushed with the prefix `v` this workflow is triggered - creates portable and non-portable binaries for linux x86_64, linux aarch64, macOS - an attempt at using github actions caching - signs each binary using GPG - auto-generates full changelog based on commit messages since the last release - creates a draft release - hot new formatting (preview [here](https://github.com/realbigsean/lighthouse/releases/tag/v0.9.23)) - has been taking around 35 minutes ## Additional Info TODOs: - Figure out how we should automate dockerhub's version tag. - It'd be quickest just to tag `latest`, but we'd need to make sure the docker workflow completes before this starts - we do the same cross-compile in the `docker` workflow, we could try to use the same binary - integrate a similar flow for unstable binaries (`-rc` tag?) - improve caching, potentially use sccache - if we start using a self-hosted runner this'll require some re-working Need to add the following secrets to Github: - `GPG_PASSPHRASE` - ~~`GPG_PUBLIC_KEY`~~ hard-coded this, because it was tough to manage as a secret - `GPG_SIGNING_KEY` Co-authored-by: realbigsean <seananderson33@gmail.com>	2020-12-23 07:53:34 +00:00
Samuel E. Moelius	3381266998	Eliminate uses of `expect` in `ssz_snappy.rs` (#2105 ) ## Issue Addressed None ## Proposed Changes Eliminate three uses of `expect` in `ssz_snappy.rs`. ## Additional Info None	2020-12-22 02:28:37 +00:00
Pawan Dhananjay	166f617b19	Add docs for `/lighthouse/validators/keystore` api (#2071 ) ## Issue Addressed Resolves #2061 Resolves #2066 ## Proposed Changes Document the `/lighthouse/validators/keystore` validator api method. The newly generated/imported keystore is always added to the key cache from this function call `65dcdc361b/validator_client/src/validator_store.rs (L105-L109)` which eventually invokes `KeyCache::add` here if enabled `65dcdc361b/validator_client/src/initialized_validators.rs (L192)`	2020-12-21 07:43:04 +00:00
Michael Sproul	e5bf2576f1	Optimise tree hash caching for block production (#2106 ) ## Proposed Changes `@potuz` on the Eth R&D Discord observed that Lighthouse blocks on Pyrmont were always arriving at other nodes after at least 1 second. Part of this could be due to processing and slow propagation, but metrics also revealed that the Lighthouse nodes were usually taking 400-600ms to even just produce a block before broadcasting it. I tracked the slowness down to the lack of a pre-built tree hash cache (THC) on the states being used for block production. This was due to using the head state for block production, which lacks a THC in order to keep fork choice fast (cloning a THC takes at least 30ms for 100k validators). This PR modifies block production to clone a state from the snapshot cache rather than the head, which speeds things up by 200-400ms by avoiding the tree hash cache rebuild. In practice this seems to have cut block production time down to 300ms or less. Ideally we could _remove_ the snapshot from the cache (and save the 30ms), but it is required for when we re-process the block after signing it with the validator client. ## Alternatives I experimented with 2 alternatives to this approach, before deciding on it: * Alternative 1: ensure the `head` has a tree hash cache. This is too slow, as it imposes a +30ms hit on fork choice, which currently takes ~5ms (with occasional spikes). * Alternative 2: use `Arc<BeaconSnapshot>` in the snapshot cache and share snapshots between the cache and the `head`. This made fork choice blazing fast (1ms), and block production the same as in this PR, but had a negative impact on block processing which I don't think is worth it. It ended up being necessary to clone the full state from the snapshot cache during block production, imposing the +30ms penalty there _as well_ as in block production. In contrast, the approach in this PR should only impact block production, and it improves it!
Yay for pareto improvements 🎉 ## Additional Info This commit (ac59dfa) is currently running on all the Lighthouse Pyrmont nodes, and I've added a dashboard to the Pyrmont grafana instance with the metrics. In future work we should optimise the attestation packing, which consumes around 30-60ms and is now a substantial contributor to the total.	2020-12-21 06:29:39 +00:00
Paul Hauner	a62dc65ca4	BN Fallback v2 (#2080 ) ## Issue Addressed - Resolves #1883 ## Proposed Changes This follows on from @blacktemplar's work in #2018. - Allows the VC to connect to multiple BN for redundancy. - Update the simulator so some nodes always need to rely on their fallback. - Adds some extra deprecation warnings for `--eth1-endpoint` - Pass `SignatureBytes` as a reference instead of by value. ## Additional Info NA Co-authored-by: blacktemplar <blacktemplar@a1.net>	2020-12-18 09:17:03 +00:00
Pawan Dhananjay	f998eff7ce	Subnet discovery fixes (#2095 ) ## Issue Addressed N/A ## Proposed Changes Fixes multiple issues related to the discovery of subnet peers. 1. Subnet discovery retries after yielding no results 2. Metadata updates if a peer sends older metadata 3. peerdb stores the peer subscriptions from gossipsub	2020-12-17 00:39:15 +00:00


4959 Commits