lighthouse

Author	SHA1	Message	Date
Akihito Nakano	3d07934ca0	Fix: `end_slot` returns incorrect value (#2138 ) ## Issue Addressed `Epoch::end_slot()` returns incorrect value when the epoch is the last epoch which can be represented by u64. ```rust let slots_per_epoch = 32; // The last epoch which can be represented by u64. let epoch = Epoch::new(u64::max_value() / slots_per_epoch); println!("{}", epoch.end_slot(slots_per_epoch)); // Slot(18446744073709551614) // -> correctly, the result should be `Slot(18446744073709551615)`. ```	2021-01-19 03:50:06 +00:00
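The off-by-one described in this commit can be reproduced with a minimal sketch. The `Slot` and `Epoch` types below are hypothetical stand-ins, not the Lighthouse definitions, and `end_slot_naive` is only an assumed reconstruction of how a saturating "start of next epoch minus one" computation could yield `18446744073709551614`; the actual pre-fix code may have differed.

```rust
/// Hypothetical stand-in for a slot newtype (not the Lighthouse type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Slot(pub u64);

/// Hypothetical stand-in for an epoch newtype (not the Lighthouse type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Epoch(pub u64);

impl Epoch {
    /// First slot of the epoch, saturating instead of overflowing.
    pub fn start_slot(self, slots_per_epoch: u64) -> Slot {
        Slot(self.0.saturating_mul(slots_per_epoch))
    }

    /// Assumed buggy variant: "start of the next epoch, minus one".
    /// For the last representable epoch the next start saturates to
    /// u64::MAX, so subtracting one lands one slot too early.
    pub fn end_slot_naive(self, slots_per_epoch: u64) -> Slot {
        let next_start = Epoch(self.0.saturating_add(1)).start_slot(slots_per_epoch);
        Slot(next_start.0.saturating_sub(1))
    }

    /// Sketch of a fix: add `slots_per_epoch - 1` to this epoch's own
    /// start slot, so no intermediate value exceeds the true result.
    pub fn end_slot(self, slots_per_epoch: u64) -> Slot {
        let start = self.start_slot(slots_per_epoch);
        Slot(start.0.saturating_add(slots_per_epoch.saturating_sub(1)))
    }
}
```

With `slots_per_epoch = 32`, `Epoch(u64::MAX / 32).end_slot(32)` evaluates to `Slot(18446744073709551615)`, the value the commit message gives as correct, while the naive variant returns `Slot(18446744073709551614)`.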
Akihito Nakano	a8d040c821	Fix timing issue in obtaining the Fork (#2158 ) ## Issue Addressed Related PR: https://github.com/sigp/lighthouse/pull/2137#issuecomment-754712492 The Fork is required for VC to perform signing. Currently, it is not guaranteed that the Fork has been obtained at the point of signing, as the Fork is obtained after ForkService starts. We will see the [error](`851a4dca3c/validator_client/src/validator_store.rs (L127)`) if VC could not perform the signing due to the timing issue. > Unable to get Fork for signing ## Proposed Changes Obtain the Fork on `init_from_beacon_node` to fix the timing issue.	2021-01-19 02:54:18 +00:00
realbigsean	908c8eadf3	remove protected environment (#2135 ) ## Issue Addressed N/A ## Proposed Changes Remove Github Action environments ## Additional Info N/A Co-authored-by: realbigsean <seananderson33@gmail.com>	2021-01-19 01:29:06 +00:00
realbigsean	7a71977987	Clippy 1.49.0 updates and dht persistence test fix (#2156 ) ## Issue Addressed `test_dht_persistence` failing ## Proposed Changes Bind `NetworkService::start` to an underscore prefixed variable rather than `_`. `_` was causing it to be dropped immediately This was failing 5/100 times before this update, but I haven't been able to get it to fail after updating it Co-authored-by: realbigsean <seananderson33@gmail.com>	2021-01-19 00:34:28 +00:00
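The difference between binding to `_` and to an underscore-prefixed name is a general Rust rule and can be shown without any networking code. The sketch below uses a hypothetical `Service` type with a `Drop` impl that records when it is destroyed; it is not `NetworkService`, and the function names are made up for illustration.

```rust
use std::cell::RefCell;
use std::rc::Rc;

type Log = Rc<RefCell<Vec<&'static str>>>;

/// Hypothetical service handle that records its own destruction.
struct Service {
    log: Log,
}

impl Drop for Service {
    fn drop(&mut self) {
        self.log.borrow_mut().push("dropped");
    }
}

fn start(log: &Log) -> Service {
    Service { log: Rc::clone(log) }
}

/// `let _ = ...` does not bind the value, so the temporary is dropped
/// at the end of the statement, before the rest of the body runs.
pub fn bind_to_underscore() -> Vec<&'static str> {
    let log: Log = Rc::new(RefCell::new(Vec::new()));
    let _ = start(&log);
    log.borrow_mut().push("body");
    let events = log.borrow().clone();
    events
}

/// `let _service = ...` is a real binding, so the value lives until
/// the end of the enclosing scope.
pub fn bind_to_named() -> Vec<&'static str> {
    let log: Log = Rc::new(RefCell::new(Vec::new()));
    let _service = start(&log);
    log.borrow_mut().push("body");
    let events = log.borrow().clone();
    events
}
```

`bind_to_underscore()` returns `["dropped", "body"]`, while `bind_to_named()` returns `["body"]` because the service is still alive when the log is read.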
Akihito Nakano	e5b1a37110	[simulator] Fix race condition when creating LocalBeaconNode (#2137 ) ## Issue Addressed We have a race condition when counting the number of beacon nodes. The user could end up seeing a duplicated service name (node_N). ## Proposed Changes I have updated to acquire write lock before counting the number of beacon nodes.	2021-01-14 00:04:18 +00:00
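A minimal sketch of the locking pattern described in this commit, assuming a shared list of nodes behind a lock. `Network`, `add_node`, and `spawn_nodes` are hypothetical names, and `std::sync::RwLock` is used here only to keep the example dependency-free; the simulator's own types and lock implementation may differ.

```rust
use std::sync::{Arc, RwLock};
use std::thread;

/// Hypothetical container for the simulator's nodes.
pub struct Network {
    nodes: RwLock<Vec<String>>,
}

impl Network {
    pub fn new() -> Self {
        Network { nodes: RwLock::new(Vec::new()) }
    }

    /// Take the write lock first, then count and insert while holding it.
    /// Counting under a read lock and inserting under a separate write
    /// lock would let two threads observe the same count and pick the
    /// same `node_N` name.
    pub fn add_node(&self) -> String {
        let mut nodes = self.nodes.write().expect("lock poisoned");
        let name = format!("node_{}", nodes.len());
        nodes.push(name.clone());
        name
    }
}

/// Spawn `n` threads that each register one node; return the sorted names.
pub fn spawn_nodes(n: usize) -> Vec<String> {
    let network = Arc::new(Network::new());
    let handles: Vec<_> = (0..n)
        .map(|_| {
            let network = Arc::clone(&network);
            thread::spawn(move || network.add_node())
        })
        .collect();
    let mut names: Vec<String> = handles
        .into_iter()
        .map(|h| h.join().expect("thread panicked"))
        .collect();
    names.sort();
    names
}
```

Because the count and the insert happen under one write lock, every thread sees a distinct length and the generated names never collide.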
Pawan Dhananjay	28238d97b1	Disconnect from peers quicker on internet issues (#2147 ) ## Issue Addressed Fixes #2146 ## Proposed Changes Change ping timeout errors to return `LowToleranceErrors` so that we disconnect faster on internet failures/changes.	2021-01-13 08:09:10 +00:00
realbigsean	14df5d5c32	Use cross in linux x86 64 release flow (#2136 ) ## Issue Addressed Resolves #2120 ## Proposed Changes This updates github actions to use `cross` when compiling linux x86_64 binaries. ## Additional Info I think we could alternatively be explicit with the version of macOS or ubuntu we are running actions on and that could solve #2120. I'm not sure which method is preferred here though. Github actions supports Ubuntu 16.04 Co-authored-by: realbigsean <seananderson33@gmail.com>	2021-01-12 06:38:22 +00:00
Paul Hauner	1d535659d6	Add docs about redundancy (#2142 ) ## Issue Addressed - Resolves #2140 ## Proposed Changes Adds some documentation on the topic of "redundancy". ## Additional Info NA	2021-01-12 00:26:22 +00:00
realbigsean	423dea169c	update smallvec (#2152 ) ## Issue Addressed `cargo audit` is failing because of a potential for an overflow in the version of `smallvec` we're using ## Proposed Changes Update to the latest version of `smallvec`, which has the fix Co-authored-by: realbigsean <seananderson33@gmail.com>	2021-01-11 23:32:11 +00:00
Arthur Woimbée	851a4dca3c	replace tempdir by tempfile (#2143 ) ## Issue Addressed Fixes #2141 Remove [tempdir](https://docs.rs/tempdir/0.3.7/tempdir/) in favor of [tempfile](https://docs.rs/tempfile/3.1.0/tempfile/). ## Proposed Changes `tempfile` has a slightly different api that makes creating temp folders with a name prefix a chore (`tempdir::TempDir::new("toto")` => `tempfile::Builder::new().prefix("toto").tempdir()`). So I removed temp folder name prefix where I deemed it not useful. Otherwise, the functionality is the same.	2021-01-06 06:36:11 +00:00
Age Manning	7e4b190df0	Reduce ping interval (#2132 ) ## Issue Addressed #2123 ## Description Reduces the TCP ping interval to increase our responsiveness to peer liveness changes.	2021-01-06 04:35:52 +00:00
Paul Hauner	c2eac8e5bd	Remove duplicate log in BN fallback (#2116 ) ## Issue Addressed NA ## Proposed Changes - Removes a duplicated log in the fallback code for the VC. - Updates the text in the remaining de-duped log. ## Additional Info Example ``` Dec 23 05:19:54.003 WARN Beacon node is syncing endpoint: http://xxxx:5052/, head_slot: 88224, sync_distance: 161774 Dec 23 05:19:54.003 WARN Beacon node is not synced endpoint: http://xxxxx:5052/ ```	2021-01-06 03:01:48 +00:00
realbigsean	588b90157d	Ssz state api endpoint (#2111 ) ## Issue Addressed Catching up to a recently merged API spec PR: https://github.com/ethereum/eth2.0-APIs/pull/119 ## Proposed Changes - Return an SSZ beacon state on `/eth/v1/debug/beacon/states/{stateId}` when passed this header: `accept: application/octet-stream`. - Requests to this endpoint with no `accept` header, or with an `accept` header whose value is `application/json` or `/`, will result in a JSON response ## Additional Info Co-authored-by: realbigsean <seananderson33@gmail.com>	2021-01-06 03:01:46 +00:00
Samuel E. Moelius	939fa717fd	`test_decode_malicious_status_message` improvements (#2104 ) ## Issue Addressed None ## Proposed Changes * Correct typo in one comment, elaborate some others. * Add asserts to ensure comments match code. * Eliminate one unnecessary `clone`. ## Additional Info None	2021-01-06 01:10:26 +00:00
Samuel E. Moelius	0245ddd37b	Fix typo in `ssz_snappy.rs` comment (#2103 ) ## Issue Addressed None ## Proposed Changes Correct a typo in `ssz_snappy.rs`. ## Additional Info Pedantry at its finest.	2021-01-06 01:10:24 +00:00
Paul Hauner	f183af20e3	Version v1.0.6 (#2126 ) ## Issue Addressed NA ## Proposed Changes - Bump versions - Run `cargo update` ## Additional Info NA	2020-12-28 23:38:02 +00:00
Pawan Dhananjay	32a60578fe	Remove default beacon node value from clap (#2121 ) ## Issue Addressed Fixes #2118 ## Proposed Changes Removes the default value in clap for `--beacon-nodes`. This was causing issues with cli picking `--beacon-nodes` default even when not specified and overriding `--beacon-node`. Seems like it was more evident with docker setups because it doesn't use the default `http://localhost:5052` option. Edit: we already set the default to `http://localhost:5052` here so this shouldn't break any existing setups. `9ed65a64f8/validator_client/src/config.rs (L58)` ## Additional info Tested this with docker-compose and binaries. Works as expected in both cases.	2020-12-28 08:23:59 +00:00
Michael Sproul	43ac3f7209	Fix slasher database schema migration to v2 (#2125 ) ## Issue Addressed Closes #2119 ## Proposed Changes Update the slasher schema version to v2 for the breaking changes to the config introduced in #2079. Implement a migration from v1 to v2 so that users can seamlessly upgrade from any version of Lighthouse <=1.0.5. Users who deleted their database for v1.0.5 can upgrade to a release including this patch without any manual intervention. Similarly, any users still on v1.0.4 or earlier can now upgrade without having to drop their database.	2020-12-28 05:09:19 +00:00
Akihito Nakano	78d17c3255	Tweak error messages for ease of investigation (#2122 ) ## Proposed Changes <!-- Please list or describe the changes introduced by this PR. --> Tweaked the error message for ease of investigation as `Failed to update eth1 cache` is used in multiple places. 😃	2020-12-28 01:25:33 +00:00
Paul Hauner	9ed65a64f8	Version v1.0.5 (#2117 ) ## Issue Addressed NA ## Proposed Changes - Bump versions to `v1.0.5` - Run `cargo update` ## Additional Info NA	2020-12-23 18:52:48 +00:00
Michael Sproul	c5f03f7d56	Tidy slasher logs for known slashings (#2108 ) ## Proposed Changes This quiets the slasher logs when ingesting slashings that are already known. Previously we would log an `ERRO` when a slashing was rediscovered locally but had already been submitted on-chain. This is to be expected from time to time, as different users' slashers will run at different times, and it's likely that slashings will make it on-chain before all users have detected them locally.	2020-12-23 07:53:38 +00:00
Age Manning	2931b05582	Update libp2p (#2101 ) This is a little bit of a tip-of-the-iceberg PR. It houses a lot of code changes in the libp2p dependency. This needs a bit of thorough testing before merging. The primary code changes are: - General libp2p dependency update - Gossipsub refactor to shift compression into gossipsub providing performance improvements and improved API for handling compression Co-authored-by: Paul Hauner <paul@paulhauner.com>	2020-12-23 07:53:36 +00:00
realbigsean	b5e81eb6b2	add automated release workflow (#2077 ) ## Issue Addressed Resolves #1674 ## Proposed Changes - Whenever a tag is pushed with the prefix `v` this workflow is triggered - creates portable and non-portable binaries for linux x86_64, linux aarch64, macOS - an attempt at using github actions caching - signs each binary using GPG - auto-generates full changelog based on commit messages since the last release - creates a draft release - hot new formatting (preview [here](https://github.com/realbigsean/lighthouse/releases/tag/v0.9.23)) - has been taking around 35 minutes ## Additional Info TODOs: - Figure out how we should automate dockerhub's version tag. - It'd be quickest just to tag `latest`, but we'd need to make sure the docker workflow completes before this starts - we do the same cross-compile in the `docker` workflow, we could try to use the same binary - integrate a similar flow for unstable binaries (`-rc` tag?) - improve caching, potentially use sccache - if we start using a self-hosted runner this'll require some re-working Need to add the following secrets to Github: - `GPG_PASSPHRASE` - ~~`GPG_PUBLIC_KEY`~~ hard-coded this, because it was tough to manage as a secret - `GPG_SIGNING_KEY` Co-authored-by: realbigsean <seananderson33@gmail.com>	2020-12-23 07:53:34 +00:00
Samuel E. Moelius	3381266998	Eliminate uses of `expect` in `ssz_snappy.rs` (#2105 ) ## Issue Addressed None ## Proposed Changes Eliminate three uses of `expect` in `ssz_snappy.rs`. ## Additional Info None	2020-12-22 02:28:37 +00:00
Pawan Dhananjay	166f617b19	Add docs for `/lighthouse/validators/keystore` api (#2071 ) ## Issue Addressed Resolves #2061 Resolves #2066 ## Proposed Changes Document the `/lighthouse/validators/keystore` validator api method. The newly generated/imported keystore is always added to the key cache from this function call `65dcdc361b/validator_client/src/validator_store.rs (L105-L109)` which eventually invokes `KeyCache::add` here if enabled `65dcdc361b/validator_client/src/initialized_validators.rs (L192)`	2020-12-21 07:43:04 +00:00
Michael Sproul	e5bf2576f1	Optimise tree hash caching for block production (#2106 ) ## Proposed Changes `@potuz` on the Eth R&D Discord observed that Lighthouse blocks on Pyrmont were always arriving at other nodes after at least 1 second. Part of this could be due to processing and slow propagation, but metrics also revealed that the Lighthouse nodes were usually taking 400-600ms to even just produce a block before broadcasting it. I tracked the slowness down to the lack of a pre-built tree hash cache (THC) on the states being used for block production. This was due to using the head state for block production, which lacks a THC in order to keep fork choice fast (cloning a THC takes at least 30ms for 100k validators). This PR modifies block production to clone a state from the snapshot cache rather than the head, which speeds things up by 200-400ms by avoiding the tree hash cache rebuild. In practice this seems to have cut block production time down to 300ms or less. Ideally we could _remove_ the snapshot from the cache (and save the 30ms), but it is required for when we re-process the block after signing it with the validator client. ## Alternatives I experimented with 2 alternatives to this approach, before deciding on it: * Alternative 1: ensure the `head` has a tree hash cache. This is too slow, as it imposes a +30ms hit on fork choice, which currently takes ~5ms (with occasional spikes). * Alternative 2: use `Arc<BeaconSnapshot>` in the snapshot cache and share snapshots between the cache and the `head`. This made fork choice blazing fast (1ms), and block production the same as in this PR, but had a negative impact on block processing which I don't think is worth it. It ended up being necessary to clone the full state from the snapshot cache during block production, imposing the +30ms penalty there _as well_ as in block production. In contrast, the approach in this PR should only impact block production, and it improves it! 
Yay for pareto improvements 🎉 ## Additional Info This commit (ac59dfa) is currently running on all the Lighthouse Pyrmont nodes, and I've added a dashboard to the Pyrmont grafana instance with the metrics. In future work we should optimise the attestation packing, which consumes around 30-60ms and is now a substantial contributor to the total.	2020-12-21 06:29:39 +00:00
Paul Hauner	a62dc65ca4	BN Fallback v2 (#2080 ) ## Issue Addressed - Resolves #1883 ## Proposed Changes This follows on from @blacktemplar's work in #2018. - Allows the VC to connect to multiple BN for redundancy. - Update the simulator so some nodes always need to rely on their fallback. - Adds some extra deprecation warnings for `--eth1-endpoint` - Pass `SignatureBytes` as a reference instead of by value. ## Additional Info NA Co-authored-by: blacktemplar <blacktemplar@a1.net>	2020-12-18 09:17:03 +00:00
Pawan Dhananjay	f998eff7ce	Subnet discovery fixes (#2095 ) ## Issue Addressed N/A ## Proposed Changes Fixes multiple issues related to discovering of subnet peers. 1. Subnet discovery retries after yielding no results 2. Metadata updates if peer send older metadata 3. peerdb stores the peer subscriptions from gossipsub	2020-12-17 00:39:15 +00:00
realbigsean	ca08fc7831	Revert "add caching to test suite (#2089 )" (#2098 ) ## Issue Addressed N/A ## Proposed Changes I didn't realize the `PORTABLE` env variable is only picked up by `install` in the `Makefile` so we are still getting `SIGILL`s: https://github.com/sigp/lighthouse/runs/1565004525?check_suite_focus=true ## Additional Info Co-authored-by: realbigsean <seananderson33@gmail.com>	2020-12-16 23:29:07 +00:00
blacktemplar	3fcc517993	Fix Syncing Simulator (#2049 ) ## Issue Addressed NA ## Proposed Changes Fixes problems with slot times below 1 second which got revealed by running the syncing simulator with the default speedup time.	2020-12-16 05:37:38 +00:00
Michael Sproul	da1c5fe69d	Delete uncompressed genesis states (#2092 ) ## Issue Addressed Replaces #2091 ## Proposed Changes * Delete the uncompressed genesis states from `eth2_network_config` after they were merged accidentally in #2029. * Tweak the build script to not overwrite `genesis.ssz` on every build, which caused spurious rebuilds.	2020-12-16 03:44:05 +00:00
realbigsean	80f47fcfff	add caching to test suite (#2089 ) ## Issue Addressed N/A ## Proposed Changes Add some caching to the test suite and to the aarch64 cross-compile in the docker build. ## Additional Info Cache hits only occur if the Cargo.lock file is unchanged, Github Actions runner OS matches, and the cache is "in scope". Some documentation on github actions cache scoping is here: https://docs.github.com/en/free-pro-team@latest/actions/guides/caching-dependencies-to-speed-up-workflows#matching-a-cache-key I'm not sure how frequently we'll get cache hits, I imagine only on smaller PR's or updates to the same PR. And there is a cache size limit that we may end up reaching quickly. But Github actions handles evictions if we go over that limit. Not sure how much of an impact this will end up having but I don't really see a downside to trying it out. Co-authored-by: realbigsean <seananderson33@gmail.com>	2020-12-16 03:44:03 +00:00
Michael Sproul	0c529b8d52	Add slasher broadcast (#2079 ) ## Issue Addressed Closes #2048 ## Proposed Changes * Broadcast slashings when the `--slasher-broadcast` flag is provided. * In the process of implementing this I refactored the slasher service into its own crate so that it could access the network code without creating a circular dependency. I moved the responsibility for putting slashings into the op pool into the service as well, as it makes sense for it to handle the whole slashing lifecycle.	2020-12-16 03:44:01 +00:00
Pawan Dhananjay	63eeb14a81	Improve eth1 fallback logging (#2096 ) ## Issue Addressed N/A ## Proposed Changes There seemed to be confusion among discord users on the eth1 fallback logging ``` WARN Error connecting to eth1 node. Trying fallback ..., endpoint: http://127.0.0.1:8545/, service: eth1_rpc ``` The assumption users seem to be making here is that it is trying the fallback and fallback=endpoint in the log. This PR improves the logging to be like ``` WARN Error connecting to eth1 node endpoint, endpoint: http://127.0.0.1:8545/, action: trying fallbacks, service: eth1_rpc ``` I think this is a bit more clear that the endpoint that failed is the one in the log.	2020-12-16 02:39:09 +00:00
divma	11c299cbf6	impl Resource Unavailable RPC error (#2072 ) ## Issue Addressed Related to #1891, The error is not in the spec yet (see ethereum/eth2.0-specs#2131) ## Proposed Changes Implement the proposed error, banning peers that send it ## Additional Info NA	2020-12-15 00:17:32 +00:00
blacktemplar	701843aaa0	Update dependencies (#2084 ) ## Issue Addressed Partially addresses dependencies mentioned in issue #1712. ## Proposed Changes Updates dependencies (including an update avoiding a vulnerability) + add tokio compatibility to `remote_signer_test`	2020-12-14 02:28:19 +00:00
realbigsean	c1e27f4c89	Improve docker auto builds (#2078 ) ## Issue Addressed N/A ## Proposed Changes - hardcode `ubuntu-18.04` -- I don't think this was causing us issues, but github actions is in the process of migrating `ubuntu-latest` from Ubuntu 18 -> 20.. so just in case - different source of emulation dependencies -> https://github.com/tonistiigi/binfmt - this one is explicitly referenced in the `buildx` github docs - install emulation dependencies and run `docker buildx` in the same `run` command - enable `buildx` with `DOCKER_CLI_EXPERIMENTAL: enabled` rather than re-building it ## Additional Info N/A Co-authored-by: realbigsean <seananderson33@gmail.com>	2020-12-11 00:19:35 +00:00
Michael Sproul	1abc70e815	Version v1.0.4 (#2073 ) ## Proposed Changes Run cargo update and bump version in prep for v1.0.4 release ## Additional Info Planning to merge this commit to `unstable`, test on Pyrmont and canary nodes, then push to `stable`.	2020-12-10 04:01:40 +00:00
Age Manning	dfb588e521	Softer penalties for missing blocks (#2075 ) ## Issue Addressed Users are reporting errors for sending attestations to peers. If the clock sync is a little out or we receive attestations before blocks, peers are being too harshly penalized. They can get scored many times per missing block and we typically need these peers on subnets. ## Proposed Changes This removes the penalization for missing blocks with attestations. The penalty should be handled when #635 gets built as it will allow us to group attestations per missing block and penalize once.	2020-12-10 00:40:12 +00:00
realbigsean	adbd49ddc6	Multiarch docker GitHub actions (#2065 ) ## Issue Addressed Resolves #1512 ## Proposed Changes - Adds a new docker Github Actions workflow - Removes the Dockerhub hook - Adds a new Dockerfile for use with pre-existing cross-compiled binaries - on pushes to `unstable` - builds an ARM64 image and tags it `latest-arm64-unstable` - builds an AMD64 image and tags it `latest-amd64-unstable` - builds a multiarch image by creating a manifest list referencing the prior two images and tags it `latest-unstable` - on pushes to `stable` - builds an ARM64 image and tags it `latest-arm64` - builds an AMD64 image and tags it `latest-amd64` - builds a multiarch image by creating a manifest list referencing the prior two images and tags it `latest` ## Additional Info - for ARM64, first `cross` is used to cross compile the `lighthouse` and `lcli` binaries, then `docker buildx` is installed to actually build the docker image for the correct target platform. The image build pretty much just copies the binaries from local into the docker image (thanks @michaelsproul :) ) - The AMD64 and ARM64 builds run in parallel, in total it's been taking around 45mins on a local runner - This PR does not cover version tags on docker images at the moment Co-authored-by: realbigsean <seananderson33@gmail.com>	2020-12-09 06:06:37 +00:00
Michael Sproul	aa45fa3ff7	Revert fork choice if disk write fails (#2068 ) ## Issue Addressed Closes #2028 Replaces #2059 ## Proposed Changes If writing to the database fails while importing a block, revert fork choice to the last version stored on disk. This prevents fork choice from being ahead of the blocks on disk. Having fork choice ahead is particularly bad if it is later successfully written to disk, because it renders the database corrupt (see #2028). ## Additional Info * This mitigation might fail if the head+fork choice haven't been persisted yet, which can only happen at first startup (see #2067) * This relies on it being OK for the head tracker to be ahead of fork choice. I figure this is tolerable because blocks only get added to the head tracker after successfully being written on disk _and_ to fork choice, so even if fork choice reverts a little bit, when the pruning algorithm runs, those blocks will still be on disk and OK to prune. The pruning algorithm also doesn't rely on heads being unique, technically it's OK for multiple blocks from the same linear chain segment to be present in the head tracker. This begs the question of #1785 (i.e. things would be simpler with the head tracker out of the way). Alternatively, this PR could just revert the head tracker as well (I'll look into this tomorrow).	2020-12-09 05:10:34 +00:00
Michael Sproul	82753f842d	Improve compile time (#1989 ) ## Issue Addressed Closes #1264 ## Proposed Changes * Milagro BLS: tweak the feature flags so that Milagro doesn't get compiled if we're using BLST. Profiling showed that it was consuming about 1 minute of CPU time out of 60 minutes of CPU time (real time ~15 mins). A 1.6% saving. * Reduce monomorphization: compiling for 3 different `EthSpec` types causes a heck of a lot of generic functions to be instantiated (monomorphized). Removing 2 of 3 cuts the LLVM+linking step from around 250 seconds to 180 seconds, a saving of 70 seconds (real time!). This applies only to `make` and not the CI build, because we test with the minimal spec on CI. * Update `web3` crate to v0.13. This is perhaps the most controversial change, because it requires axing some deposit contract tools from `lcli`. I suspect these tools weren't used much anyway, and could be maintained separately, but I'm also happy to revert this change. However, it does save us a lot of compile time. With #1839, we now have 3 versions of Tokio (and all of Tokio's deps). This change brings us down to 2 versions, but 1 should be achievable once web3 (and reqwest) move to Tokio 0.3. * Remove `lcli` from the Docker image. It's a dev tool and can be built from the repo if required.	2020-12-09 01:34:58 +00:00
Age Manning	4f85371ce8	Downgrades a valid log (#2057 ) ## Issue Addressed #2046 ## Proposed Changes The log was originally intended to verify the correct logic and ordering of events when scoring peers. The queued tasks can be structured in such a way that peers can be banned after they are disconnected. Therefore the error log is now downgraded to debug log.	2020-12-08 10:48:45 +00:00
divma	57489e620f	fix default network handling (#2029 ) ## Issue Addressed #1992 and #1987, and also to be considered a continuation of #1751 ## Proposed Changes many changed files but most are renaming to align the code with the semantics of `--network` - remove the `--network` default value (in clap) and instead set it after checking the `network` and `testnet-dir` flags - move `eth2_testnet_config` crate to `eth2_network_config` - move `Eth2TestnetConfig` to `Eth2NetworkConfig` - move `DEFAULT_HARDCODED_TESTNET` to `DEFAULT_HARDCODED_NETWORK` - `beacon_node`s `get_eth2_testnet_config` loads the `DEFAULT_HARDCODED_NETWORK` if there is no network nor testnet provided - `boot_node`s config loads the config same as the `beacon_node`, it was using the configuration only for preconfigured networks (That code is ~1year old so I assume it was not intended) - removed a one year old comment stating we should try to emulate `https://github.com/eth2-clients/eth2-testnets/tree/master/nimbus/testnet1` it looks outdated (?) - remove `lighthouse`s `load_testnet_config` in favor of `get_eth2_network_config` to centralize that logic (It had differences) - some spelling ## Additional Info Both the command of #1992 and the scripts of #1987 seem to work fine, same as `bn` and `vc`	2020-12-08 05:41:10 +00:00
divma	f3200784b4	More metrics + RPC tweaks (#2041 ) ## Issue Addressed NA ## Proposed Changes This was mostly done to find the reason why LH was dropping peers from Nimbus. It proved to be useful so I think it's worth it. But there is also some functional stuff here - Add metrics for rpc errors per client, error type and direction - Add metrics for downscoring events per source type, client and penalty type - Add metrics for gossip validation results per client for non-accepted messages - Make the RPC handler return errors and requests/responses in the order we see them - Allow a small burst for the Ping rate limit, from 1 every 5 seconds to 2 every 10 seconds - Send rate limiting errors with a particular code and use that same code to identify them. I picked something different to 128 since that is most likely what other clients are using for their own errors - Remove some unused code in the `PeerAction` and the rpc handler - Remove the unused variant `RateLimited`. This was never produced directly, since the only way to get the request's protocol is via the handler. The handler upon receiving from LH a response with an error (rate limited in this case) emits this event with the missing info (It was always like this, just pointing out that we do downscore rate limiting errors regardless of the change) Metrics for Nimbus looked like this: Downscoring events: `increase(libp2p_peer_actions_per_client{client="Nimbus"}[5m])` ![image](https://user-images.githubusercontent.com/26765164/101210880-862bf280-3676-11eb-94c0-399f0bf5aa2e.png) RPC Errors: `increase(libp2p_rpc_errors_per_client{client="Nimbus"}[5m])` ![image](https://user-images.githubusercontent.com/26765164/101210997-ba071800-3676-11eb-847a-f32405ede002.png) Unaccepted gossip message: `increase(gossipsub_unaccepted_messages_per_client{client="Nimbus"}[5m])` ![image](https://user-images.githubusercontent.com/26765164/101211124-f470b500-3676-11eb-9459-132ecff058ec.png)	2020-12-08 03:55:50 +00:00
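To illustrate why "2 every 10 seconds" permits a small burst while keeping the same long-run rate as "1 every 5 seconds", here is a sketch of a generic cell rate limiter driven by explicit timestamps. `RateLimiter`, its fields, and `allows` are hypothetical names chosen for this example; this is not a claim about how the Lighthouse RPC rate limiter is implemented.

```rust
/// Hypothetical limiter for a quota of `max_tokens` requests per `period_ms`.
pub struct RateLimiter {
    /// Time "cost" of a single request in milliseconds.
    interval_ms: u64,
    /// How far ahead of the current time the schedule may run.
    tolerance_ms: u64,
    /// Theoretical arrival time of the next conforming request.
    tat_ms: u64,
}

impl RateLimiter {
    pub fn new(max_tokens: u64, period_ms: u64) -> Self {
        assert!(max_tokens > 0, "quota must allow at least one request");
        RateLimiter {
            interval_ms: period_ms / max_tokens,
            tolerance_ms: period_ms,
            tat_ms: 0,
        }
    }

    /// Returns true and records the request if it fits within the quota
    /// at time `now_ms`; returns false and leaves state untouched otherwise.
    pub fn allows(&mut self, now_ms: u64) -> bool {
        let new_tat = self.tat_ms.max(now_ms) + self.interval_ms;
        if new_tat - now_ms <= self.tolerance_ms {
            self.tat_ms = new_tat;
            true
        } else {
            false
        }
    }
}
```

Under this model `RateLimiter::new(1, 5_000)` rejects a second ping that arrives one second after the first, whereas `RateLimiter::new(2, 10_000)` accepts it and only rejects a third ping in quick succession.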
blacktemplar	a28e8decbf	update dependencies (#2032 ) ## Issue Addressed NA ## Proposed Changes Updates out of date dependencies. ## Additional Info See also https://github.com/sigp/lighthouse/issues/1712 for a list of dependencies that are still out of date and the reasons.	2020-12-07 08:20:33 +00:00
realbigsean	9c915349d4	Remove audit ignore ws server (#2051 ) ## Issue Addressed Closes #1669 ## Proposed Changes Remove cargo audit ignore for ws server related vuln now that the ws server has been removed ## Additional Info N/A Co-authored-by: realbigsean <seananderson33@gmail.com>	2020-12-06 23:35:51 +00:00
Rémy Roy	0f5f3b522e	Fix default values and --network flag in Voluntary exits book page (#2056 ) ## Issue Addressed None yet reported. ## Proposed Changes Fix the old flag in the Voluntary exits book page to use the new `--network` flag. Also fix the default value for that flag.	2020-12-06 22:16:05 +00:00
Michael Sproul	c1ec386d18	Pass failed gossip blocks to the slasher (#2047 ) ## Issue Addressed Closes #2042 ## Proposed Changes Pass blocks that fail gossip verification to the slasher. Blocks that are successfully verified are not passed immediately, but will be passed as part of full block verification.	2020-12-04 05:03:30 +00:00
Pawan Dhananjay	7933596c89	Add a purge-eth1-cache cli option (#2039 ) ## Issue Some eth1 clients are missing deposit logs on mainnet for multiple reasons (not fully synced, eth1 client issues) because of which we are getting `FailedToInsertDeposit` errors. Ideally, LH should pick up where it left off after pointing it to a nice eth1 client endpoint (which has all deposits). However, I have seen instances where LH keeps getting `FailedToInsertDeposit` even after switching to a good endpoint. Only deleting the beacon directory (which also wipes the eth1 cache) and resyncing the eth1 caches seems to be the solution. This wouldn't be great for mainnet if you have to sync your beacon node again as well. ## Proposed Changes Add a `--purge-eth1-db` option which just wipes the eth1 cache and doesn't touch the rest of the beacon db. Still need to investigate if and why LH isn't picking up where it left off for the deposit logs sync, but I think it would be good to have an option to just delete eth1 caches regardless.	2020-12-04 05:03:28 +00:00

3887 Commits