lighthouse

Author	SHA1	Message	Date
Michael Sproul	36bd4d87f0	Update to spec v1.0.0-rc.0 and BLSv4 (#1765 ) ## Issue Addressed Closes #1504 Closes #1505 Replaces #1703 Closes #1707 ## Proposed Changes * Update BLST and Milagro to versions compatible with BLSv4 spec * Update Lighthouse to spec v1.0.0-rc.0, and update EF test vectors * Use the v1.0.0 constants for `MainnetEthSpec`. * Rename `InteropEthSpec` -> `V012LegacyEthSpec` * Change all constants to suit the mainnet `v0.12.3` specification (i.e., Medalla). * Deprecate the `--spec` flag for the `lighthouse` binary * This value is now obtained from the `config_name` field of the `YamlConfig`. * Built in testnet YAML files have been updated. * Ignore the `--spec` value, if supplied, log a warning that it will be deprecated * `lcli` still has the spec flag, that's fine because it's dev tooling. * Remove the `E: EthSpec` from `YamlConfig` * This means we need to deser the genesis `BeaconState` on-demand, but this is fine. * Swap the old "minimal", "mainnet" strings over to the new `EthSpecId` enum. * Always require a `CONFIG_NAME` field in `YamlConfig` (it used to have a default). ## Additional Info Lots of breaking changes, do not merge! ~~We will likely need a Lighthouse v0.4.0 branch, and possibly a long-term v0.3.0 branch to keep Medalla alive~~. Co-authored-by: Kirk Baird <baird.k@outlook.com> Co-authored-by: Paul Hauner <paul@paulhauner.com>	2020-10-28 22:19:38 +00:00
divma	ad846ad280	Inform peers of requests that exceed the maximum rate limit + log downgrade (#1830 ) ## Issue Addressed #1825 ## Proposed Changes Since we penalize more blocks by range requests that have large steps, it is possible to get requests that will never be processed. We were not informing peers about these requests, and we were logging a CRIT that is no longer relevant. Later we should check whether more sophisticated handling of those requests is needed.	2020-10-27 11:46:38 +00:00
Paul Hauner	92c8eba8ca	Ensure eth1 deposit/chain IDs are used from YamlConfig (#1829 ) ## Issue Addressed NA ## Proposed Changes Fixes a bug which causes the node to reject valid eth1 nodes. - Fix core bug: failure to apply `YamlConfig` values to `ChainSpec`. - Add a test to prevent regression in this specific case. - Fix an invalid log message ## Additional Info NA	2020-10-26 03:34:14 +00:00
Paul Hauner	f157d61cc7	Address clippy lints, panic in ssz_derive on overflow (#1714 ) ## Issue Addressed NA ## Proposed Changes - Panic or return error if we overflow `usize` in SSZ decoding/encoding derive macros. - I claim that the panics can only be triggered by a faulty type definition in lighthouse, they cannot be triggered externally on a validly defined struct. - Use `Ordering` instead of some `if` statements, as demanded by clippy. - Remove some old clippy `allow` that seem to no longer be required. - Add comments to interesting clippy statements that we're going to continue to ignore. - Create #1713 ## Additional Info NA	2020-10-25 23:27:39 +00:00
Paul Hauner	eba51f0973	Update testnet configs, change on-disk format (#1799 ) ## Issue Addressed - Related to #1691 ## Proposed Changes - Add `DEPOSIT_CHAIN_ID` and `DEPOSIT_NETWORK_ID` to `config.yaml`. - Pass the `DEPOSIT_NETWORK_ID` to the `eth1::Service`. - Remove the unused `MAX_EPOCHS_PER_CROSSLINK` from the `altona` and `medalla` configs (see [spec commit](`2befe90032 (diff-efb845ac2ebd4aafbc23df40f47ce25699255064e99d36d0406d0a14ca7953ec)`)). - Change from compressing the whole testnet directory, to only compressing the genesis state file. This is the only file we need to compress and not compressing the others makes them work nicely with git. - We can modify the boot nodes, configs, etc. without incurring an eternal binary-blob cost on our git history. - This change is backwards compatible (i.e., non-breaking). ## Additional Info NA	2020-10-25 22:15:46 +00:00
Age Manning	7453f39d68	Prevent unbanning of disconnected peers (#1822 ) ## Issue Addressed Further testing revealed another edge case where we attempt to unban a peer that can be in a disconnected state. Although this causes no real issue, it does log an error to the user. This PR adds a check to prevent this edge case and prevents the error being logged to the user.	2020-10-24 05:24:20 +00:00
Age Manning	a3cc1a1e0f	Call unban only when necessary (#1821 ) This PR prevents a user-facing error. It prevents optimistically unbanning a peer and instead checks the state of the peer before requesting the peer's state to be unbanned.	2020-10-24 03:24:19 +00:00
blacktemplar	1644289a08	Updates the libp2p to the second newest commit => Allow only one topic per message (#1819 ) As @AgeManning mentioned, the newest libp2p version had some problems and was downgraded again on lighthouse master. This is an intermediate version that causes no problems and only adds a small change of allowing only one topic per message.	2020-10-24 01:05:37 +00:00
Age Manning	7870b81ade	Downgrade libp2p (#1817 ) ## Description This downgrades the recent libp2p upgrade. There were issues with the RPC which prevented syncing of the chain and this upgrade needs to be further investigated.	2020-10-23 09:33:59 +00:00
Age Manning	55eee18ebb	Version bump to 0.3.1 (#1813 ) ## Description Bumps Lighthouse to version 0.3.1.	2020-10-23 04:16:36 +00:00
Age Manning	64c5899d25	Adds colour help to bn and vc subcommands (#1811 ) Adds coloured help to the bn and vc subcommands	2020-10-23 04:16:34 +00:00
Age Manning	2c7f362908	Discovery v5.1 (#1786 ) ## Overview This updates lighthouse to discovery v5.1 Note: This makes lighthouse's discovery not compatible with any previous version. Lighthouse cannot discover peers or send/receive ENR's from any previous version. This is a breaking change. This resolves #1605	2020-10-23 04:16:33 +00:00
Age Manning	ae96dab5d2	Increase UPnP logging and decrease batch sizes (#1812 ) ## Description This increases the logging of the underlying UPnP tasks to inform the user of UPnP error/success. This also decreases the batch syncing size to two epochs per batch.	2020-10-23 03:01:33 +00:00
Age Manning	c49dd94e20	Update to latest libp2p (#1810 ) ## Description Updates to the latest libp2p and includes gossipsub updates. Of particular note is the limitation of a single topic per gossipsub message. Co-authored-by: blacktemplar <blacktemplar@a1.net>	2020-10-23 03:01:31 +00:00
Michael Sproul	acd49d988d	Implement database temp states to reduce memory usage (#1798 ) ## Issue Addressed Closes #800 Closes #1713 ## Proposed Changes Implement the temporary state storage algorithm described in #800. Specifically: * Add `DBColumn::BeaconStateTemporary`, for storing 0-length temporary marker values. * Store intermediate states immediately as they are created, marked temporary. Delete the temporary flag if the block is processed successfully. * Add a garbage collection process to delete leftover temporary states on start-up. * Bump the database schema version to 2 so that a DB with temporary states can't accidentally be used with older versions of the software. The auto-migration is a no-op, but puts in place some infra that we can use for future migrations (e.g. #1784) ## Additional Info There are two known race conditions, one potentially causing permanent faults (hopefully rare), and the other insignificant. ### Race 1: Permanent state marked temporary EDIT: this has been fixed by the addition of a lock around the relevant critical section There are 2 threads that are trying to store 2 different blocks that share some intermediate states (e.g. they both skip some slots from the current head). Consider this sequence of events: 1. Thread 1 checks if state `s` already exists, and seeing that it doesn't, prepares an atomic commit of `(s, s_temporary_flag)`. 2. Thread 2 does the same, but also gets as far as committing the state txn, finishing the processing of its block, and _deleting_ the temporary flag. 3. Thread 1 is (finally) scheduled again, and marks `s` as temporary with its transaction. 4. a) The process is killed, or thread 1's block fails verification and the temp flag is not deleted. This is a permanent failure! Any attempt to load state `s` will fail... hope it isn't on the main chain! Alternatively (4b) happens... b) Thread 1 finishes, and re-deletes the temporary flag. 
In this case the failure is transient, state `s` will disappear temporarily, but will come back once thread 1 finishes running. I _hope_ that steps 1-3 only happen very rarely, and 4a even more rarely. It's hard to know. This once again begs the question of why we're using LevelDB (#483), when it clearly doesn't care about atomicity! A ham-fisted fix would be to wrap the hot and cold DBs in locks, which would bring us closer to how other DBs handle read-write transactions. E.g. [LMDB only allows one R/W transaction at a time](https://docs.rs/lmdb/0.8.0/lmdb/struct.Environment.html#method.begin_rw_txn). ### Race 2: Temporary state returned from `get_state` I don't think this race really matters, but in `load_hot_state`, if another thread stores a state between when we call `load_state_temporary_flag` and when we call `load_hot_state_summary`, then we could end up returning that state even though it's only a temporary state. I can't think of any case where this would be relevant, and I suspect if it did come up, it would be safe/recoverable (having data is safer than _not_ having data). This could be fixed by using a LevelDB read snapshot, but that would require substantial changes to how we read all our values, so I don't think it's worth it right now.	2020-10-23 01:27:51 +00:00
Age Manning	66f0cf4430	Improve peer handling (#1796 ) ## Issue Addressed Potentially resolves #1647 and sync stalls. ## Proposed Changes The handling of the state of banned peers was inadequate for the complex peerdb data structure. We store a limited number of disconnected and banned peers in the db. We were not tracking intermediate "disconnecting" states, and in some circumstances we were updating the peer state without informing the peerdb. This led to a number of inconsistencies in the peer state. Further, the peer manager could ban a peer, changing the peer's state from connected to banned. In this circumstance, if the peer then disconnected, we didn't inform the application layer, which led to applications like sync not being informed of a peer's disconnection. This could lead to sync stalling and requiring a Lighthouse restart. This PR improves the handling of peer states and interactions with the peerdb.	2020-10-23 01:27:48 +00:00
Paul Hauner	b829257cca	Ssz state (#1749 ) ## Issue Addressed NA ## Proposed Changes Adds a `lighthouse/beacon/states/:state_id/ssz` endpoint to allow us to pull the genesis state from the API. ## Additional Info NA	2020-10-22 06:05:49 +00:00
Michael Sproul	7f73dccebc	Refine op pool pruning (#1805 ) ## Issue Addressed Closes #1769 Closes #1708 ## Proposed Changes Tweaks the op pool pruning so that the attestation pool is pruned against the wall-clock epoch instead of the finalized state's epoch. This should reduce the unbounded growth that we've seen during periods without finality. Also fixes up the voluntary exit pruning as raised in #1708.	2020-10-22 04:47:29 +00:00
Paul Hauner	a3704b971e	Support pre-flight CORS check (#1772 ) ## Issue Addressed - Resolves #1766 ## Proposed Changes - Use the `warp::filters::cors` filter instead of our work-around. ## Additional Info It's not trivial to enable/disable `cors` using `warp`, since using `routes.with(cors)` changes the type of `routes`. This makes it difficult to apply/not apply cors at runtime. My solution has been to always use the `warp::filters::cors` wrapper but when cors should be disabled, just pass the HTTP server listen address as the only permissible origin.	2020-10-22 04:47:27 +00:00
realbigsean	a3552a4b70	Node endpoints (#1778 ) ## Issue Addressed `node` endpoints in #1434 ## Proposed Changes Implement these: ``` /eth/v1/node/health /eth/v1/node/peers/{peer_id} /eth/v1/node/peers ``` - Add an `Option<Enr>` to `PeerInfo` - Finish implementation of `/eth/v1/node/identity` ## Additional Info - should update the `peers` endpoints when #1764 is resolved Co-authored-by: realbigsean <seananderson33@gmail.com>	2020-10-22 02:59:42 +00:00
Daniel Schonfeld	8f86baa48d	Optimize attester slashing (#1745 ) ## Issue Addressed Closes #1548 ## Proposed Changes Optimizes attester slashing choice by choosing the ones that cover the largest number of slashed validators, with the highest effective balances ## Additional Info Initial pass, need to write a test for it	2020-10-22 01:43:54 +00:00
divma	668513b67e	Sync state adjustments (#1804 ) Check for advanced peers and the state of the chain with respect to the clock slot to decide whether or not a chain is synced or transitioning to a head sync. Also fixes a bug that prevented getting the right state while syncing heads.	2020-10-22 00:26:06 +00:00
realbigsean	628891df1d	fix genesis state root provided to HTTP server (#1783 ) ## Issue Addressed Resolves #1776 ## Proposed Changes The beacon chain builder was using the canonical head's state root for the `genesis_state_root` field. ## Additional Info	2020-10-21 23:15:30 +00:00
realbigsean	fdb9744759	use head slot instead of the target slot for the not_while_syncing fi… (#1802 ) ## Issue Addressed Resolves #1792 ## Proposed Changes Use `chain.best_slot()` instead of the sync state's target slot in the `not_while_syncing_filter` ## Additional Info N/A	2020-10-21 22:02:25 +00:00
divma	2acf75785c	More sync updates (#1791 ) ## Issue Addressed #1614 and a couple of sync-stalling problems, the most important is a cyclic dependency between the sync manager and the peer manager	2020-10-20 22:34:18 +00:00
Michael Sproul	703c33bdc7	Fix head tracker concurrency bugs (#1771 ) ## Issue Addressed Closes #1557 ## Proposed Changes Modify the pruning algorithm so that it mutates the head-tracker _before_ committing the database transaction to disk, and _only if_ all the heads to be removed are still present in the head-tracker (i.e. no concurrent mutations). In the process of writing and testing this I also had to make a few other changes: * Use internal mutability for all `BeaconChainHarness` functions (namely the RNG and the graffiti), in order to enable parallel calls (see testing section below). * Disable logging in harness tests unless the `test_logger` feature is turned on And chose to make some clean-ups: * Delete the `NullMigrator` * Remove type-based configuration for the migrator in favour of runtime config (simpler, less duplicated code) * Use the non-blocking migrator unless the blocking migrator is required. In the store tests we need the blocking migrator because some tests make asserts about the state of the DB after the migration has run. * Rename `validators_keypairs` -> `validator_keypairs` in the `BeaconChainHarness` ## Testing To confirm that the fix worked, I wrote a test using [Hiatus](https://crates.io/crates/hiatus), which can be found here: https://github.com/michaelsproul/lighthouse/tree/hiatus-issue-1557 That test can't be merged because it inserts random breakpoints everywhere, but if you check out that branch you can run the test with: ``` $ cd beacon_node/beacon_chain $ cargo test --release --test parallel_tests --features test_logger ``` It should pass, and the log output should show: ``` WARN Pruning deferred because of a concurrent mutation, message: this is expected only very rarely! ``` ## Additional Info This is a backwards-compatible change with no impact on consensus.	2020-10-19 05:58:39 +00:00
blacktemplar	6ba997b88e	add direction information to PeerInfo (#1768 ) ## Issue Addressed NA ## Proposed Changes Adds a direction field to `PeerConnectionStatus` that can be accessed by calling `is_outgoing` which will return `true` iff the peer is connected and the first connection was an outgoing one.	2020-10-16 05:24:21 +00:00
Herman Junge	d7b9d0dd9f	Implement matches! macro (#1777 ) Fix #1775	2020-10-15 21:42:43 +00:00
Pawan Dhananjay	97be2ca295	Simulator and attestation service fixes (#1747 ) ## Issue Addressed #1729 #1730 ## Proposed Changes 1. Fixes a bug in the simulator where nodes can't find each other due to 0 udp ports in their enr. 2. Fixes bugs in attestation service where we are unsubscribing from a subnet prematurely. More testing is needed for attestation service fixes.	2020-10-15 07:11:31 +00:00
blacktemplar	a0634cc64f	Gossipsub topic filters (#1767 ) ## Proposed Changes Adds a gossipsub topic filter that only allows subscribing and incoming subscriptions from valid ETH2 topics. ## Additional Info Currently the preparation of the valid topic hashes uses only the current fork id but in the future it must also use all possible future fork ids for planned forks. This has to get added when hard coded forks get implemented. DO NOT MERGE: We first need to merge the libp2p changes (see https://github.com/sigp/rust-libp2p/pull/70) so that we can refer from here to a commit hash inside the lighthouse branch.	2020-10-14 10:12:57 +00:00
blacktemplar	8248afa793	Updates the message-id according to the Networking Spec (#1752 ) ## Proposed Changes Implement the new message id function (see https://github.com/ethereum/eth2.0-specs/pull/2089) using an additional fast message id function for better performance + caching decompressed data.	2020-10-14 06:51:58 +00:00
Pawan Dhananjay	99a02fd2ab	Limit snappy input stream (#1738 ) ## Issue Addressed N/A ## Proposed Changes This PR limits the length of the stream received by the snappy decoder to be the maximum allowed size for the received rpc message type. Also adds further checks to ensure that the length specified in the rpc [encoding-dependent header](https://github.com/ethereum/eth2.0-specs/blob/dev/specs/phase0/p2p-interface.md#encoding-strategies) is within the bounds for the rpc message type being decoded.	2020-10-11 22:45:33 +00:00
Paul Hauner	0e4cc50262	Remove unused deps	2020-10-09 15:58:20 +11:00
Paul Hauner	db3e0578e9	Merge branch 'v0.3.0-staging' into v3-master	2020-10-09 15:27:08 +11:00
Paul Hauner	72cc5e35af	Bump version to v0.3.0 (#1743 ) ## Issue Addressed NA ## Proposed Changes - Bump version to v0.3.0 - Run `cargo update` ## Additional Info NA	2020-10-09 02:05:30 +00:00
Paul Hauner	da44821e39	Clean up obsolete TODOs (#1734 ) Squashed commit of the following: commit f99373cbaec9adb2bdbae3f7e903284327962083 Author: Age Manning <Age@AgeManning.com> Date: Mon Oct 5 18:44:09 2020 +1100 Clean up obsolete TODOs	2020-10-05 21:08:14 +11:00
Paul Hauner	ee7c8a0b7e	Update external deps (#1711 ) ## Issue Addressed - Resolves #1706 ## Proposed Changes Updates dependencies across the workspace. Any crate that was not able to be brought to the latest version is listed in #1712. ## Additional Info NA	2020-10-05 08:22:19 +00:00
Age Manning	240181e840	Upgrade discovery and restructure task execution (#1693 ) * Initial rebase * Remove old code * Correct release tests * Rebase commit * Remove eth2-testnet dep on eth2libp2p * Remove crates lost in rebase * Remove unused dep	2020-10-05 18:45:54 +11:00
Age Manning	bcb629564a	Improve error handling in network processing (#1654 ) * Improve error handling in network processing * Cargo fmt * Cargo fmt * Improve error handling for prior genesis * Remove dep	2020-10-05 17:34:56 +11:00
divma	113758a4f5	From panic to crit (#1726 ) ## Issue Addressed Downgrade inconsistent chain segment states from `panic` to `crit`. I don't love this solution but since range can always bounce back from any of those, we don't panic. Co-authored-by: Age Manning <Age@AgeManning.com>	2020-10-05 17:34:49 +11:00
Age Manning	a8c5af8874	Increase content-id length (#1725 ) ## Issue Addressed N/A ## Proposed Changes Increase gossipsub's content-id length to the full 32 byte hash. ## Additional Info N/A	2020-10-05 17:33:42 +11:00
divma	6997776494	Sync fixes (#1716 ) ## Issue Addressed chain state inconsistencies ## Proposed Changes - a batch can be fake-failed by Range if it needs to move a peer to another chain. The peer will still send blocks/ errors / produce timeouts for those requests, so check when we get a response from the RPC that the request id matches, instead of only the peer, since a re-request can be directed to the same peer. - if an optimistic batch succeeds, store the attempt to avoid trying it again when quickly switching chains. Also, use it only if ahead of our current target, instead of the segment's start epoch	2020-10-05 17:33:36 +11:00
Paul Hauner	e7eb99cb5e	Use Drop impl to send worker idle message (#1718 ) ## Issue Addressed NA ## Proposed Changes Uses a `Drop` implementation to help ensure that `BeaconProcessor` workers are freed. This will help guard against regression if someone happens to add an early return, and it will also help in the case of a panic. ## Additional Info NA	2020-10-05 17:33:25 +11:00
Age Manning	fe07a3c21c	Improve error handling in network processing (#1654 ) * Improve error handling in network processing * Cargo fmt * Cargo fmt * Improve error handling for prior genesis * Remove dep	2020-10-05 17:30:43 +11:00
Age Manning	47c921f326	Update libp2p (#1728 ) ## Issue Addressed N/A ## Proposed Changes Updates the libp2p dependency to the latest version ## Additional Info N/A	2020-10-05 05:16:27 +00:00
divma	b1c121b880	From panic to crit (#1726 ) ## Issue Addressed Downgrade inconsistent chain segment states from `panic` to `crit`. I don't love this solution but since range can always bounce back from any of those, we don't panic. Co-authored-by: Age Manning <Age@AgeManning.com>	2020-10-05 04:02:09 +00:00
Age Manning	6b68c628df	Increase content-id length (#1725 ) ## Issue Addressed N/A ## Proposed Changes Increase gossipsub's content-id length to the full 32 byte hash. ## Additional Info N/A	2020-10-04 23:49:16 +00:00
divma	86a18e72c4	Sync fixes (#1716 ) ## Issue Addressed chain state inconsistencies ## Proposed Changes - a batch can be fake-failed by Range if it needs to move a peer to another chain. The peer will still send blocks/ errors / produce timeouts for those requests, so check when we get a response from the RPC that the request id matches, instead of only the peer, since a re-request can be directed to the same peer. - if an optimistic batch succeeds, store the attempt to avoid trying it again when quickly switching chains. Also, use it only if ahead of our current target, instead of the segment's start epoch	2020-10-04 23:49:14 +00:00
divma	e3c7b58657	Address a couple of TODOs (#1724 ) ## Issue Addressed couple of TODOs	2020-10-04 22:50:44 +00:00
Paul Hauner	d72c026d32	Use Drop impl to send worker idle message (#1718 ) ## Issue Addressed NA ## Proposed Changes Uses a `Drop` implementation to help ensure that `BeaconProcessor` workers are freed. This will help guard against regression if someone happens to add an early return, and it will also help in the case of a panic. ## Additional Info NA	2020-10-04 21:59:20 +00:00
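
The `Drop`-based approach described in #1718 (the last entry above) can be sketched as follows. This is a minimal illustration, not Lighthouse's actual code: `WorkerEvent`, `IdleGuard`, and `run_task` are hypothetical names, and a standard-library `mpsc` channel stands in for whatever channel the `BeaconProcessor` really uses.

```rust
use std::sync::mpsc::Sender;

/// Hypothetical message a worker sends back to its manager.
#[derive(Debug, PartialEq)]
pub enum WorkerEvent {
    Idle,
}

/// Hypothetical guard held by a worker for the duration of one task.
/// The idle message is sent when the guard is dropped, which happens on
/// normal completion, on early return, and during panic unwinding.
pub struct IdleGuard {
    tx: Sender<WorkerEvent>,
}

impl IdleGuard {
    pub fn new(tx: Sender<WorkerEvent>) -> Self {
        Self { tx }
    }
}

impl Drop for IdleGuard {
    fn drop(&mut self) {
        // Ignore the send error: the receiver may already have shut down.
        let _ = self.tx.send(WorkerEvent::Idle);
    }
}

/// Hypothetical task with an early-return path. No explicit "idle" call is
/// needed on either path because the guard handles it.
pub fn run_task(tx: Sender<WorkerEvent>, fail_early: bool) -> Result<u32, &'static str> {
    let _guard = IdleGuard::new(tx);
    if fail_early {
        return Err("early return");
    }
    Ok(42)
}
```

Because the message is tied to the guard's lifetime rather than to a call at the end of the function, a later edit that adds another `return` or `?` cannot forget to free the worker.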

1399 Commits