lighthouse

Author	SHA1	Message	Date
Pawan Dhananjay	95a362213d	Fix local testnet scripts (#2229 ) ## Issue Addressed Resolves #2094 ## Proposed Changes Fixes scripts for creating local testnets. Adds an option in `lighthouse boot_node` to run with a previously generated enr.	2021-03-30 05:17:58 +00:00
Paul Hauner	9eb1945136	v1.2.2 (#2287 ) ## Issue Addressed NA ## Proposed Changes - Bump versions ## Additional Info NA	2021-03-30 04:07:03 +00:00
Paul Hauner	3d239b85ac	Allow for a clock disparity on the duties endpoints (#2283 ) ## Issue Addressed Resolves #2280 ## Proposed Changes Allows for API consumers to call the proposer/attester duties endpoints [`MAXIMUM_GOSSIP_CLOCK_DISPARITY`](`b34a79dc0b/beacon_node/beacon_chain/src/beacon_chain.rs (L99-L102)`) earlier than the current epoch. For additional reasoning, see https://github.com/sigp/lighthouse/issues/2280#issuecomment-805358897. ## Additional Info NA	2021-03-29 23:42:35 +00:00
Paul Hauner	03cefd0065	Expand observed attestations capacity (#2266 ) ## Issue Addressed NA ## Proposed Changes I noticed the following error on one of our nodes: ``` Mar 18 00:03:35 ip-xxxx lighthouse-bn[333503]: Mar 18 00:03:35.103 ERRO Unable to validate aggregate error: ObservedAttestersError(EpochTooLow { epoch: Epoch(23961), lowest_permissible_epoch: Epoch(23962) }), peer_id: 16Uiu2HAm5GL5KzPLhvfg9MBBFSpBqTVGRFSiTg285oezzWcZzwEv ``` The slot during this log was 766,815 (the last slot of the epoch). I believe this is due to an off-by-one error in `observed_attesters` where we were failing to provide enough capacity to store observations from the previous, current and next epochs. See code comments for further reasoning. Here's a link to the spec: https://github.com/ethereum/eth2.0-specs/blob/v1.0.1/specs/phase0/p2p-interface.md#beacon_aggregate_and_proof ## Additional Info NA	2021-03-29 23:42:34 +00:00
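The off-by-one described in the row above can be sketched in a few lines. This is only an illustration, not Lighthouse's `observed_attesters` code: the helper names `lowest_permissible_epoch` and `is_permissible` are hypothetical, and the sketch assumes a container that retains `capacity` consecutive epochs ending at the highest epoch it has seen (here the next epoch, 23963).

```rust
/// Hypothetical helper (not the Lighthouse API): the lowest epoch still
/// retained by a container that keeps `capacity` consecutive epochs ending
/// at `highest_epoch`.
fn lowest_permissible_epoch(highest_epoch: u64, capacity: u64) -> u64 {
    highest_epoch.saturating_sub(capacity.saturating_sub(1))
}

/// Returns `true` if an observation for `epoch` would still be accepted.
fn is_permissible(epoch: u64, highest_epoch: u64, capacity: u64) -> bool {
    epoch >= lowest_permissible_epoch(highest_epoch, capacity)
}
```

Under these assumptions, a capacity of 2 with a highest epoch of 23963 gives a lowest permissible epoch of 23962, so an aggregate for epoch 23961 is rejected as in the log; a capacity of 3 (previous, current and next epochs) accepts it.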
Michael Sproul	f9d60f5436	VC: accept unknown fields in chain spec (#2277 ) ## Issue Addressed Closes #2274 ## Proposed Changes * Modify the `YamlConfig` to collect unknown fields into an `extra_fields` map, instead of failing hard. * Log a debug message if there are extra fields returned to the VC from one of its BNs. This restores Lighthouse's compatibility with Teku beacon nodes (and therefore Infura)	2021-03-26 04:53:57 +00:00
Paul Hauner	b34a79dc0b	v1.2.1 (#2263 ) ## Issue Addressed NA ## Proposed Changes - Bump version. - Add some new ENR for Prater - Afri: https://github.com/eth2-clients/eth2-testnets/pull/42 - Prysm: https://github.com/eth2-clients/eth2-testnets/pull/43 - Apply the fixes from #2181 to the no-eth1-sim to try fix CI issues. ## Additional Info NA	2021-03-18 04:20:46 +00:00
Paul Hauner	015ab7d0a7	Optimize validator duties (#2243 ) ## Issue Addressed Closes #2052 ## Proposed Changes - Refactor the attester/proposer duties endpoints in the BN - Performance improvements - Fixes some potential inconsistencies with the dependent root fields. - Removes `http_api::beacon_proposer_cache` and just uses the one on the `BeaconChain` instead. - Move the code for the proposer/attester duties endpoints into separate files, for readability. - Refactor the `DutiesService` in the VC - Required to reduce the delay on broadcasting new blocks. - Gets rid of the `ValidatorDuty` shim struct that came about when we adopted the standard API. - Separate block/attestation duty tasks so that they don't block each other when one is slow. - In the VC, use `PublicKeyBytes` to represent validators instead of `PublicKey`. `PublicKey` is a legit crypto object whilst `PublicKeyBytes` is just a byte-array, it's much faster to clone/hash `PublicKeyBytes` and this change has had a significant impact on runtimes. - Unfortunately this has created lots of dust changes. - In the BN, store `PublicKeyBytes` in the `beacon_proposer_cache` and allow access to them. The HTTP API always sends `PublicKeyBytes` over the wire and the conversion from `PublicKey` -> `PublicKeyBytes` is non-trivial, especially when queries have 100s/1000s of validators (like Pyrmont). - Add the `state_processing::state_advance` mod which dedups a lot of the "apply `n` skip slots to the state" code. - This also fixes a bug with some functions which were failing to include a state root as per [this comment](`072695284f/consensus/state_processing/src/state_advance.rs (L69-L74)`). I couldn't find any instance of this bug that resulted in anything more severe than keying a shuffling cache by the wrong block root. - Swap the VC block service to use `mpsc` from `tokio` instead of `futures`. This is consistent with the rest of the code base. 
~~This PR reduces the size of the codebase 🎉~~ It used to reduce the size of the code base before I added more comments. ## Observations on Pyrmont - Proposer duties times down from peaks of 450ms to consistent <1ms. - Current epoch attester duties times down from >1s peaks to a consistent 20-30ms. - Block production down from +600ms to 100-200ms. ## Additional Info - ~~Blocked on #2241~~ - ~~Blocked on #2234~~ ## TODO - [x] ~~Refactor this into some smaller PRs?~~ Leaving this as-is for now. - [x] Address `per_slot_processing` roots. - [x] Investigate slow next epoch times. Not getting added to cache on block processing? - [x] Consider [this](`072695284f/beacon_node/store/src/hot_cold_store.rs (L811-L812)`) in the scenario of replacing the state roots Co-authored-by: pawan <pawandhananjay@gmail.com> Co-authored-by: Michael Sproul <michael@sigmaprime.io>	2021-03-17 05:09:57 +00:00
Michael Sproul	3919737978	Release v1.2.0 (#2249 ) ## Proposed Changes Release v1.2.0 unchanged from the release candidate.	2021-03-10 01:28:32 +00:00
Michael Sproul	770a2ca030	Fix proposer cache priming upon state advance (#2252 ) ## Proposed Changes While investigating an incorrect head + target vote for the epoch boundary block 708544, I noticed that the state advance failed to prime the proposer cache, as per these logs: ``` Mar 09 21:42:47.448 DEBG Subscribing to subnet target_slot: 708544, subnet: Y, service: attestation_service Mar 09 21:49:08.063 DEBG Advanced head state one slot current_slot: 708543, state_slot: 708544, head_root: 0xaf5e69de09f384ee3b4fb501458b7000c53bb6758a48817894ec3d2b030e3e6f, service: state_advance Mar 09 21:49:08.063 DEBG Completed state advance initial_slot: 708543, advanced_slot: 708544, head_root: 0xaf5e69de09f384ee3b4fb501458b7000c53bb6758a48817894ec3d2b030e3e6f, service: state_advance Mar 09 21:49:14.787 DEBG Proposer shuffling cache miss block_slot: 708544, block_root: 0x9b14bf68667ab1d9c35e6fd2c95ff5d609aa9e8cf08e0071988ae4aa00b9f9fe, parent_slot: 708543, parent_root: 0xaf5e69de09f384ee3b4fb501458b7000c53bb6758a48817894ec3d2b030e3e6f, service: beacon Mar 09 21:49:14.800 DEBG Successfully processed gossip block root: 0x9b14bf68667ab1d9c35e6fd2c95ff5d609aa9e8cf08e0071988ae4aa00b9f9fe, slot: 708544, graffiti: , service: beacon Mar 09 21:49:14.800 INFO New block received hash: 0x9b14…f9fe, slot: 708544 Mar 09 21:49:14.984 DEBG Head beacon block slot: 708544, root: 0x9b14…f9fe, finalized_epoch: 22140, finalized_root: 0x28ec…29a7, justified_epoch: 22141, justified_root: 0x59db…e451, service: beacon Mar 09 21:49:15.055 INFO Unaggregated attestation validator: XXXXX, src: api, slot: 708544, epoch: 22142, delay_ms: 53, index: Y, head: 0xaf5e69de09f384ee3b4fb501458b7000c53bb6758a48817894ec3d2b030e3e6f, service: val_mon Mar 09 21:49:17.001 DEBG Slot timer sync_state: Synced, current_slot: 708544, head_slot: 708544, head_block: 0x9b14…f9fe, finalized_epoch: 22140, finalized_root: 0x28ec…29a7, peers: 55, service: slot_notifier ``` The reason for this is that the condition was backwards, 
so that whole block of code was unreachable. Looking at the attestations for the block included in the block after, we can see that lots of validators missed it. Some of them may be Lighthouse v1.1.1-v1.2.0-rc.0, but it's probable that they would have missed even with the proposer cache primed, given how late the block 708544 arrived (the cache miss occurred 3.787s after the slot start): https://beaconcha.in/block/708545#attestations	2021-03-10 00:20:50 +00:00
Michael Sproul	786e25ea08	Release candidate v1.2.0-rc.0 (#2248 ) Prepare for v1.2.0 with this release candidate. To be merged after #2247 and #2246 Co-authored-by: Age Manning <Age@AgeManning.com>	2021-03-08 06:27:50 +00:00
Age Manning	babd153352	Prevent adding and dialing bootnodes when discovery is disabled (#2247 ) This is a small PR which prevents unwanted bootnodes from being added to the DHT and being dialed when the `--disable-discovery` flag is set. The main reason one would want to disable discovery is to connect to a fixed set of peers. Currently, regardless of what the user does, Lighthouse will populate its DHT with previously known peers and also fill it with the spec's bootnodes. It will then dial the bootnodes that are capable of being dialed. This prevents testing with a fixed peer list. This PR prevents these excess nodes from being added and dialed if the user has set `--disable-discovery`.	2021-03-08 06:27:49 +00:00
Paul Hauner	e4eb0eb168	Use advanced state for block production (#2241 ) ## Issue Addressed NA ## Proposed Changes - Use the pre-states from #2174 during block production. - Running this on Pyrmont shows block production times dropping from ~550ms to ~150ms. - Create `crit` and `warn` logs when a block is published to the API later than we expect. - On mainnet we are issuing a warn if the block is published more than 1s later than the slot start and a crit for more than 3s. - Rename some methods on the `SnapshotCache` for clarity. - Add the ability to pass the state root to `BeaconChain::produce_block_on_state` to avoid computing a state root. This is a very common LH optimization. - Add a metric that tracks how late we broadcast blocks received from the HTTP API. This is technically a duplicate of a `ValidatorMonitor` log, but I wanted to have it for the case where we aren't monitoring validators too.	2021-03-04 04:43:31 +00:00
Michael Sproul	363f15f362	Use the database to persist the pubkey cache (#2234 ) ## Issue Addressed Closes #1787 ## Proposed Changes * Abstract the `ValidatorPubkeyCache` over a "backing" which is either a file (legacy), or the database. * Implement a migration from schema v2 to schema v3, whereby the contents of the cache file are copied to the DB, and then the file is deleted. The next release to include this change must be a minor version bump, and we will need to warn users of the inability to downgrade (this is our first DB schema change since mainnet genesis). * Move the schema migration code from the `store` crate into the `beacon_chain` crate so that it can access the datadir and the `ValidatorPubkeyCache`, etc. It gets injected back into the `store` via a closure (similar to what we do in fork choice).	2021-03-04 01:25:12 +00:00
Age Manning	1c507c588e	Update to the latest libp2p (#2239 ) Updates to the latest libp2p and ignores RUSTSEC-2020-0146 from cargo-audit Co-authored-by: Michael Sproul <michael@sigmaprime.io>	2021-03-02 05:59:49 +00:00
realbigsean	ed9b245de0	update tokio-stream to 0.1.3 and use `BroadcastStream` (#2212 ) ## Issue Addressed Resolves #2189 ## Proposed Changes use tokio's `BroadcastStream` ## Additional Info N/A Co-authored-by: realbigsean <seananderson33@gmail.com>	2021-03-01 01:58:05 +00:00
Michael Sproul	2f077b11fe	Allow HTTP API to return SSZ blocks (#2209 ) ## Issue Addressed Implements https://github.com/ethereum/eth2.0-APIs/pull/125 ## Proposed Changes Optionally return SSZ bytes from the `beacon/blocks` endpoint.	2021-02-24 04:15:14 +00:00
realbigsean	5bc93869c8	Update ValidatorStatus to match the v1 API (#2149 ) ## Issue Addressed N/A ## Proposed Changes We are currently a bit off of the standard API spec because we have [this](https://hackmd.io/bQxMDRt1RbS1TLno8K4NPg?view) proposal implemented for validator status. Based on discussion [here](https://github.com/ethereum/eth2.0-APIs/pull/94), it looks like this won't be added to the spec until v2, so this PR implements [this](https://hackmd.io/ofFJ5gOmQpu1jjHilHbdQQ) validator status logic instead ## Additional Info N/A Co-authored-by: realbigsean <seananderson33@gmail.com>	2021-02-24 04:15:13 +00:00
Paul Hauner	a764c3b247	Handle early blocks (#2155 ) ## Issue Addressed NA ## Problem this PR addresses There's an issue where Lighthouse is banning a lot of peers due to the following sequence of events: 1. Gossip block 0xabc arrives ~200ms early - It is propagated across the network, with respect to [`MAXIMUM_GOSSIP_CLOCK_DISPARITY`](https://github.com/ethereum/eth2.0-specs/blob/v1.0.0/specs/phase0/p2p-interface.md#why-is-there-maximum_gossip_clock_disparity-when-validating-slot-ranges-of-messages-in-gossip-subnets). - However, it is not imported to our database since the block is early. 2. Attestations for 0xabc arrive, but the block was not imported. - The peer that sent the attestation is down-voted. - Each unknown-block attestation causes a score loss of 1, the peer is banned at -100. - When the peer is on an attestation subnet there can be hundreds of attestations, so the peer is banned quickly (before the missed block can be obtained via rpc). ## Potential solutions I can think of three solutions to this: 1. Wait for attestation-queuing (#635) to arrive and solve this. - Easy - Not immediate fix. - Whilst this would work, I don't think it's a perfect solution for this particular issue, rather (3) is better. 1. Allow importing blocks with a tolerance of `MAXIMUM_GOSSIP_CLOCK_DISPARITY`. - Easy - ~~I have implemented this, for now.~~ 1. If a block is verified for gossip propagation (i.e., signature verified) and it's within `MAXIMUM_GOSSIP_CLOCK_DISPARITY`, then queue it to be processed at the start of the appropriate slot. - More difficult - Feels like the best solution, I will try to implement this. This PR takes approach (3). ## Changes included - Implement the `block_delay_queue`, based upon a [`DelayQueue`](https://docs.rs/tokio-util/0.6.3/tokio_util/time/delay_queue/struct.DelayQueue.html) which can store blocks until it's time to import them. - Add a new `DelayedImportBlock` variant to the `beacon_processor::WorkEvent` enum to handle this new event. 
- In the `BeaconProcessor`, refactor a `tokio::select!` to a struct with an explicit `Stream` implementation. I experienced some issues with `tokio::select!` in the block delay queue and I also found it hard to debug. I think this explicit implementation is nicer and functionally equivalent (apart from the fact that `tokio::select!` randomly chooses futures to poll, whereas now we're deterministic). - Add a testing framework to the `beacon_processor` module that tests this new block delay logic. I also tested a handful of other operations in the beacon processor (attns, slashings, exits) since it was super easy to copy-pasta the code from the `http_api` tester. - To implement these tests I added the concept of an optional `work_journal_tx` to the `BeaconProcessor` which will spit out a log of events. I used this in the tests to ensure that things were happening as I expect. - The tests are a little racey, but it's hard to avoid that when testing timing-based code. If we see CI failures I can revise. I haven't observed any failures due to races on my machine or on CI yet. - To assist with testing I allowed for directly setting the time on the `ManualSlotClock`. - I gave the `beacon_processor::Worker` a `Toolbox` for two reasons; (a) it avoids changing tons of function sigs when you want to pass a new object to the worker and (b) it seemed cute.	2021-02-24 03:08:52 +00:00
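Approach (3) from the row above can be sketched as a small decision function. This is a minimal illustration under stated assumptions, not the `block_delay_queue` implementation: `EarlyBlockAction` and `classify_block` are hypothetical names, times are modelled as plain `Duration`s since genesis, and the disparity is assumed to be the spec's 500 ms.

```rust
use std::time::Duration;

// Assumed value; the spec constant is referenced in the commit message above.
const MAXIMUM_GOSSIP_CLOCK_DISPARITY: Duration = Duration::from_millis(500);

/// Hypothetical outcome type for a gossip-verified block.
#[derive(Debug, PartialEq)]
enum EarlyBlockAction {
    /// The block's slot has already started: import immediately.
    ImportNow,
    /// The block is early but within the disparity: hold it for this long.
    QueueFor(Duration),
    /// The block is earlier than the disparity allows.
    TooEarly,
}

/// `now` and `slot_start` are both measured from genesis.
fn classify_block(now: Duration, slot_start: Duration) -> EarlyBlockAction {
    if now >= slot_start {
        return EarlyBlockAction::ImportNow;
    }
    let wait = slot_start - now;
    if wait <= MAXIMUM_GOSSIP_CLOCK_DISPARITY {
        EarlyBlockAction::QueueFor(wait)
    } else {
        EarlyBlockAction::TooEarly
    }
}
```

A block arriving 200 ms before its slot would be queued for 200 ms under this sketch, so attestations that reference it arrive after it has been imported rather than counting against the sending peer.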
Paul Hauner	46920a84e8	v1.1.3 (#2217 ) ## Issue Addressed NA ## Proposed Changes Bump versions ## Additional Info NA	2021-02-22 06:21:38 +00:00
Paul Hauner	4362ea4f98	Fix false positive "State advance too slow" logs (#2218 ) ## Issue Addressed - Resolves #2214 ## Proposed Changes Fix the false positive warning log described in #2214. ## Additional Info NA	2021-02-21 23:47:53 +00:00
Paul Hauner	8949ae7c4e	Address ENR update loop (#2216 ) ## Issue Addressed - Resolves #2215 ## Proposed Changes Addresses a potential loop when the majority of peers indicate that we are contactable via an IPv6 address. See https://github.com/sigp/discv5/pull/62 for further rationale. ## Additional Info The alternative to this PR is to use `--disable-enr-auto-update` and then manually supply an `--enr-address` and `--enr-udp-port`. However, that requires the user to know their IP addresses in order for discovery to work properly. This might not be practical/achievable for some users, hence this hotfix.	2021-02-21 23:47:52 +00:00
Paul Hauner	8c6537e71d	v1.1.2 (#2213 ) ## Issue Addressed NA ## Proposed Changes Bump versions ## Additional Info NA	2021-02-19 00:49:32 +00:00
Paul Hauner	f8cc82f2b1	Switch back to warp with cors wildcard support (#2211 ) ## Issue Addressed - Resolves #2204 - Resolves #2205 ## Proposed Changes Switches to my fork of `warp` which contains support for cors wildcards: https://github.com/paulhauner/warp/tree/cors-wildcard I have a PR open on the `warp` repo but it hasn't had any interest from the maintainers as of yet: https://github.com/seanmonstar/warp/pull/726. I think running from a fork is the best we can do for now. ## Additional Info NA	2021-02-18 22:33:12 +00:00
Lion - dapplion	613382f304	Add slot offset computing to be downloaded slot (#2198 ) The current implementation assumes the range offset of slots downloaded on a batch to equal zero. This conflicts with the condition to consider this chain as synced. For finalized sync, it results in one extra batch being downloaded which can't be processed. CC @wemeetagain	2021-02-18 08:24:46 +00:00
Paul Hauner	f819ba5414	v1.1.1 (#2202 ) ## Issue Addressed NA ## Proposed Changes Bump versions	2021-02-16 00:09:02 +00:00
Pawan Dhananjay	4a357c9947	Upgrade rand_core (#2201 ) ## Issue Addressed N/A ## Proposed Changes Upgrade `rand_core` to latest version to fix https://rustsec.org/advisories/RUSTSEC-2021-0023	2021-02-15 20:34:49 +00:00
Paul Hauner	88cc222204	Advance state to next slot after importing block (#2174 ) ## Issue Addressed NA ## Proposed Changes Add an optimization to perform `per_slot_processing` from the leading-edge of block processing to the trailing-edge. Ultimately, this allows us to import the block at slot `n` faster because we used the tail-end of slot `n - 1` to perform `per_slot_processing`. Additionally, add a "block proposer cache" which allows us to cache the block proposer for some epoch. Since we're now doing trailing-edge `per_slot_processing`, we can prime this cache with the values for the next epoch before those blocks arrive (assuming those blocks don't have some weird forking). There were several ancillary changes required to achieve this: - Remove the `state_root` field of `BeaconSnapshot`, since there's no need to know it on a `pre_state` and in all other cases we can just read it from `block.state_root()`. - This caused some "dust" changes of `snapshot.beacon_state_root` to `snapshot.beacon_state_root()`, where the `BeaconSnapshot::beacon_state_root()` func just reads the state root from the block. - Rename `types::ShuffingId` to `AttestationShufflingId`. I originally did this because I added a `ProposerShufflingId` struct which turned out to be not so useful. I thought this new name was more descriptive so I kept it. - Address https://github.com/ethereum/eth2.0-specs/pull/2196 - Add a debug log when we get a block with an unknown parent. There was previously no logging around this case. - Add a function to `BeaconState` to compute all proposers for an epoch without re-computing the active indices for each slot. ## Additional Info - ~~Blocked on #2173~~ - ~~Blocked on #2179~~ That PR was wrapped into this PR. - There's potentially some places where we could avoid computing the proposer indices in `per_block_processing` but I haven't done this here. 
These would be an optimization beyond the issue at hand (improving block propagation times) and I think this PR is already doing enough. We can come back for that later. ## TODO - [x] Tidy, improve comments. - [x] ~~Try avoid computing proposer index in `per_block_processing`?~~	2021-02-15 07:17:52 +00:00
Paul Hauner	3000f3e5da	Dht persistence on drop (v2) (#2200 ) ## Issue Addressed NA ## Proposed Changes This is simply #2177 with a merge conflict fixed. Co-authored-by: realbigsean <seananderson33@gmail.com>	2021-02-15 06:09:55 +00:00
Paul Hauner	8e5c20b6d1	Update for clippy 1.50 (#2193 ) ## Issue Addressed NA ## Proposed Changes Rust 1.50 has landed 🎉 The shiny new `clippy` peers down upon us mere mortals with disgust. Brutish peasants wrapping our `usize`s in superfluous `Option`s... tsk tsk. I've performed the goat sacrifice and corrected our evil ways in this PR. Tonight we shall pray that Github Actions bestows the almighty green tick upon us. ## Additional Info NA Co-authored-by: realbigsean <seananderson33@gmail.com> Co-authored-by: Michael Sproul <michael@sigmaprime.io>	2021-02-15 00:09:12 +00:00
realbigsean	e20f64b21a	Update to tokio 1.1 (#2172 ) ## Issue Addressed resolves #2129 resolves #2099 addresses some of #1712 unblocks #2076 unblocks #2153 ## Proposed Changes - Updates all the dependencies mentioned in #2129, except for web3. They haven't merged their tokio 1.0 update because they are waiting on some dependencies of their own. Since we only use web3 in tests, I think updating it in a separate issue is fine. If they are able to merge soon though, I can update in this PR. - Updates `tokio_util` to 0.6.2 and `bytes` to 1.0.1. - We haven't made a discv5 release since merging tokio 1.0 updates so I'm using a commit rather than release atm. Edit: I think we should merge an update of `tokio_util` to 0.6.2 into discv5 before this release because it has panic fixes in `DelayQueue` --> PR in discv5: https://github.com/sigp/discv5/pull/58 ## Additional Info tokio 1.0 changes that required some changes in lighthouse: - `interval.next().await.is_some()` -> `interval.tick().await` - `sleep` future is now `!Unpin` -> https://github.com/tokio-rs/tokio/issues/3028 - `try_recv` has been temporarily removed from `mpsc` -> https://github.com/tokio-rs/tokio/issues/3350 - stream features have moved to `tokio-stream` and `broadcast::Receiver::into_stream()` has been temporarily removed -> https://github.com/tokio-rs/tokio/issues/2870 - I've copied over the `BroadcastStream` wrapper from this PR, but can update to use `tokio-stream` once it's merged https://github.com/tokio-rs/tokio/pull/3384 Co-authored-by: realbigsean <seananderson33@gmail.com>	2021-02-10 23:29:49 +00:00
Paul Hauner	e383ef3e91	Avoid temp allocations with slog (#2183 ) ## Issue Addressed Which issue # does this PR address? ## Proposed Changes Replaces use of `format!` in `slog` logging with its special no-allocation `?` and `%` shortcuts. According to a `heaptrack` analysis today over a period of about an hour, this will reduce temporary allocations by at least 4%. ## Additional Info NA	2021-02-04 07:31:47 +00:00
Paul Hauner	ff35fbb121	Add metrics for beacon block propagation (#2173 ) ## Issue Addressed NA ## Proposed Changes Adds some metrics to track delays regarding: - LH processing of blocks - delays receiving blocks from other nodes. ## Additional Info NA	2021-02-04 05:33:56 +00:00
Akihito Nakano	1a22a096c6	Fix clippy errors on tests (#2160 ) ## Issue Addressed There are some clippy errors on tests. ## Proposed Changes Enable clippy check on tests and fix the errors. 💪	2021-01-28 23:31:06 +00:00
Paul Hauner	e4b62139d7	v1.1.0 (#2168 ) ## Issue Addressed NA ## Proposed Changes - Bump version - ~~Run `cargo update`~~ ## Additional Info NA	2021-01-21 02:37:08 +00:00
Paul Hauner	2b2a358522	Detailed validator monitoring (#2151 ) ## Issue Addressed - Resolves #2064 ## Proposed Changes Adds a `ValidatorMonitor` struct which provides additional logging and Grafana metrics for specific validators. Use `lighthouse bn --validator-monitor` to automatically enable monitoring for any validator that hits the [subnet subscription](https://ethereum.github.io/eth2.0-APIs/#/Validator/prepareBeaconCommitteeSubnet) HTTP API endpoint. Also, use `lighthouse bn --validator-monitor-pubkeys` to supply a list of validators which will always be monitored. See the new docs included in this PR for more info. ## TODO - [x] Track validator balance, `slashed` status, etc. - [x] ~~Register slashings in current epoch, not offense epoch~~ - [ ] Publish Grafana dashboard, update TODO link in docs - [x] ~~#2130 is merged into this branch, resolve that~~	2021-01-20 19:19:38 +00:00
Paul Hauner	1eb0915301	Fix bug from #2163 (#2165 ) ## Issue Addressed NA ## Proposed Changes Fixes a bug that I missed during a review in #2163. I found this bug by observing that nodes were receiving far fewer attestations (~1/2 of previous). I'm not certain exactly how this mistake manifested in a reduction in attestations, but the mistake touches so much code that I think it's reasonable to declare that this is the cause of the observed issue (drop in attestations). ## Additional Info NA	2021-01-20 10:28:12 +00:00
Paul Hauner	b06559ae97	Disallow attestation production earlier than head (#2130 ) ## Issue Addressed The non-finality period on Pyrmont between epochs [`9114`](https://pyrmont.beaconcha.in/epoch/9114) and [`9182`](https://pyrmont.beaconcha.in/epoch/9182) was contributed to by all the `lighthouse_team` validators going down. The nodes saw excessive CPU and RAM usage, resulting in the system to kill the `lighthouse bn` process. The `Restart=on-failure` directive for `systemd` caused the process to bounce in ~10-30m intervals. Diagnosis with `heaptrack` showed that the `BeaconChain::produce_unaggregated_attestation` function was calling `store::beacon_state::get_full_state` and sometimes resulting in a tree hash cache allocation. These allocations were approximately the size of the hosts physical memory and still allocated when `lighthouse bn` was killed by the OS. There was no CPU analysis (e.g., `perf`), but the `BeaconChain::produce_unaggregated_attestation` is very CPU-heavy so it is reasonable to assume it is the cause of the excessive CPU usage, too. ## Proposed Changes `BeaconChain::produce_unaggregated_attestation` has two paths: 1. Fast path: attesting to the head slot or later. 2. Slow path: attesting to a slot earlier than the head block. Path (2) is the only path that calls `store::beacon_state::get_full_state`, therefore it is the path causing this excessive CPU/RAM usage. This PR removes the current functionality of path (2) and replaces it with a static error (`BeaconChainError::AttestingPriorToHead`). This change reduces the generality of `BeaconChain::produce_unaggregated_attestation` (and therefore [`/eth/v1/validator/attestation_data`](https://ethereum.github.io/eth2.0-APIs/#/Validator/produceAttestationData)), but I argue that this functionality is an edge-case and arguably a violation of the [Honest Validator spec](https://github.com/ethereum/eth2.0-specs/blob/dev/specs/phase0/validator.md). 
It's possible that a validator goes back to a prior slot to "catch up" and submit some missed attestations. This change would prevent such behaviour, returning an error. My concerns with this catch-up behaviour are that it: - Is not specified as "honest validator" attesting behaviour. - Is risky for slashing (although all validator clients should have slashing protection and will eventually fail if they do not). - Disguises clock-sync issues between a BN and VC. ## Additional Info It's likely feasible to implement path (2) if we implement some sort of caching mechanism. This would be a multi-week task and this PR gets the issue patched in the short term. I haven't created an issue to add path (2), instead I think we should implement it if we get user-demand.	2021-01-20 06:52:37 +00:00
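The two paths described in the row above reduce to a single slot comparison. The sketch below is illustrative only: `AttestationError` and `check_attestation_slot` are hypothetical stand-ins (the commit names the real error `BeaconChainError::AttestingPriorToHead`, whose exact fields are not shown here), and slots are modelled as plain `u64` values.

```rust
/// Hypothetical error type standing in for the error named in the commit.
#[derive(Debug, PartialEq)]
enum AttestationError {
    /// The requested slot is earlier than the head block's slot.
    AttestingPriorToHead { head_slot: u64, request_slot: u64 },
}

/// Fast path: attesting to the head slot or later is allowed.
/// Former slow path: attesting to an earlier slot now returns a static error
/// instead of loading a full historical state.
fn check_attestation_slot(request_slot: u64, head_slot: u64) -> Result<(), AttestationError> {
    if request_slot >= head_slot {
        Ok(())
    } else {
        Err(AttestationError::AttestingPriorToHead {
            head_slot,
            request_slot,
        })
    }
}
```

Returning early with an error keeps the expensive state load off this code path entirely, which is the trade-off the commit message argues for.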
Paul Hauner	d9f940613f	Represent slots in secs instead of millisecs (#2163 ) ## Issue Addressed NA ## Proposed Changes Copied from #2083, changes the config milliseconds_per_slot to seconds_per_slot to avoid errors when slot duration is not a multiple of a second. To avoid deserializing old serialized data (with milliseconds instead of seconds) the Serialize and Deserialize derive got removed from the Spec struct (isn't currently used anyway). This PR replaces #2083 for the purpose of fixing a merge conflict without requiring the input of @blacktemplar. ## Additional Info NA Co-authored-by: blacktemplar <blacktemplar@a1.net>	2021-01-19 09:39:51 +00:00
Paul Hauner	805e152f66	Simplify enum -> str with strum (#2164 ) ## Issue Addressed NA ## Proposed Changes As per #2100, uses derives from the strum library to implement AsRef<str> and AsStaticRef to easily get str values from enums without creating new Strings. Furthermore unifies all attestation error counters into one IntCounterVec vector. These works are originally by @blacktemplar, I've just created this PR so I can resolve some merge conflicts. ## Additional Info NA Co-authored-by: blacktemplar <blacktemplar@a1.net>	2021-01-19 06:33:58 +00:00
realbigsean	7a71977987	Clippy 1.49.0 updates and dht persistence test fix (#2156 ) ## Issue Addressed `test_dht_persistence` failing ## Proposed Changes Bind `NetworkService::start` to an underscore prefixed variable rather than `_`. `_` was causing it to be dropped immediately This was failing 5/100 times before this update, but I haven't been able to get it to fail after updating it Co-authored-by: realbigsean <seananderson33@gmail.com>	2021-01-19 00:34:28 +00:00
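The difference between `_` and an underscore-prefixed binding mentioned in the row above is easy to demonstrate in isolation. This sketch uses a hypothetical `Service` type with a `Drop` impl that records when it is dropped; it stands in for the value returned by `NetworkService::start` and is not taken from the Lighthouse code.

```rust
use std::cell::RefCell;
use std::rc::Rc;

type Log = Rc<RefCell<Vec<&'static str>>>;

/// Hypothetical stand-in for a service handle that shuts down when dropped.
struct Service {
    log: Log,
}

impl Drop for Service {
    fn drop(&mut self) {
        self.log.borrow_mut().push("dropped");
    }
}

fn start(log: &Log) -> Service {
    Service { log: Rc::clone(log) }
}

/// `let _ = ...` does not bind the value, so it is dropped immediately.
fn run_with_underscore() -> Vec<&'static str> {
    let log: Log = Rc::new(RefCell::new(Vec::new()));
    let _ = start(&log);
    log.borrow_mut().push("test body");
    let snapshot = log.borrow().clone();
    snapshot
}

/// `let _service = ...` binds the value, so it lives until the end of scope.
fn run_with_named_binding() -> Vec<&'static str> {
    let log: Log = Rc::new(RefCell::new(Vec::new()));
    let _service = start(&log);
    log.borrow_mut().push("test body");
    let snapshot = log.borrow().clone();
    snapshot
}
```

In the first function the service is already gone when the "test body" runs; in the second it is still alive at that point, which is the behaviour the test fix relies on.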
Pawan Dhananjay	28238d97b1	Disconnect from peers quicker on internet issues (#2147 ) ## Issue Addressed Fixes #2146 ## Proposed Changes Change ping timeout errors to return `LowToleranceErrors` so that we disconnect faster on internet failures/changes.	2021-01-13 08:09:10 +00:00
realbigsean	423dea169c	update smallvec (#2152 ) ## Issue Addressed `cargo audit` is failing because of a potential for an overflow in the version of `smallvec` we're using ## Proposed Changes Update to the latest version of `smallvec`, which has the fix Co-authored-by: realbigsean <seananderson33@gmail.com>	2021-01-11 23:32:11 +00:00
Arthur Woimbée	851a4dca3c	replace tempdir by tempfile (#2143 ) ## Issue Addressed Fixes #2141 Remove [tempdir](https://docs.rs/tempdir/0.3.7/tempdir/) in favor of [tempfile](https://docs.rs/tempfile/3.1.0/tempfile/). ## Proposed Changes `tempfile` has a slightly different api that makes creating temp folders with a name prefix a chore (`tempdir::TempDir::new("toto")` => `tempfile::Builder::new().prefix("toto").tempdir()`). So I removed temp folder name prefix where I deemed it not useful. Otherwise, the functionality is the same.	2021-01-06 06:36:11 +00:00
Age Manning	7e4b190df0	Reduce ping interval (#2132 ) ## Issue Addressed #2123 ## Description Reduces the TCP ping interval to increase our responsiveness to peer liveness changes.	2021-01-06 04:35:52 +00:00
realbigsean	588b90157d	Ssz state api endpoint (#2111 ) ## Issue Addressed Catching up to a recently merged API spec PR: https://github.com/ethereum/eth2.0-APIs/pull/119 ## Proposed Changes - Return an SSZ beacon state on `/eth/v1/debug/beacon/states/{stateId}` when passed this header: `accept: application/octet-stream`. - Requests to this endpoint with no `accept` header, or with an `accept` header whose value is `application/json` or `/`, will result in a JSON response ## Additional Info Co-authored-by: realbigsean <seananderson33@gmail.com>	2021-01-06 03:01:46 +00:00
Samuel E. Moelius	939fa717fd	`test_decode_malicious_status_message` improvements (#2104 ) ## Issue Addressed None ## Proposed Changes * Correct typo in one comment, elaborate some others. * Add asserts to ensure comments match code. * Eliminate one unnecessary `clone`. ## Additional Info None	2021-01-06 01:10:26 +00:00
Samuel E. Moelius	0245ddd37b	Fix typo in `ssz_snappy.rs` comment (#2103 ) ## Issue Addressed None ## Proposed Changes Correct a typo in `ssz_snappy.rs`. ## Additional Info Pedantry at its finest.	2021-01-06 01:10:24 +00:00
Paul Hauner	f183af20e3	Version v1.0.6 (#2126 ) ## Issue Addressed NA ## Proposed Changes - Bump versions - Run `cargo update` ## Additional Info NA	2020-12-28 23:38:02 +00:00
Akihito Nakano	78d17c3255	Tweak error messages for ease of investigation (#2122 ) ## Proposed Changes <!-- Please list or describe the changes introduced by this PR. --> Tweaked the error message for ease of investigation as `Failed to update eth1 cache` is used in multiple places. 😃	2020-12-28 01:25:33 +00:00
Paul Hauner	9ed65a64f8	Version v1.0.5 (#2117 ) ## Issue Addressed NA ## Proposed Changes - Bump versions to `v1.0.5` - Run `cargo update` ## Additional Info NA	2020-12-23 18:52:48 +00:00


1537 Commits