lighthouse

Author	SHA1	Message	Date
Pawan Dhananjay	bb5285ac6d	Remove BeaconBlockAndBlobsSidecar from core topics (#4016 )	2023-02-22 09:45:38 +11:00
Paul Hauner	eed7d65ce7	Allow for withdrawals in max block size (#4011 ) * Allow for withdrawals in max block size * Ensure payload size is counted	2023-02-21 18:03:10 +11:00
Michael Sproul	066c27750a	Merge remote-tracking branch 'origin/staging' into capella-update	2023-02-17 12:05:36 +11:00
realbigsean	4d0b0f681d	merge self limiter	2023-02-15 14:25:58 -05:00
realbigsean	b805fa6279	merge with upstream	2023-02-15 14:20:12 -05:00
Michael Sproul	918b688f72	Simplify payload traits and reduce cloning (#3976 ) * Simplify payload traits and reduce cloning * Fix self limiter	2023-02-15 14:17:56 +11:00
Age Manning	8dd9249177	Enforce a timeout on peer disconnect (#3757 ) On heavily crowded networks, we are seeing many attempted connections to our node every second. Often these connections come from peers that have just been disconnected. This can be for a number of reasons including: - We have deemed them to be not as useful as other peers - They have performed poorly - They have dropped the connection with us - The connection was spontaneously lost - They were randomly removed because we have too many peers In all of these cases, if we have reached or exceeded our target peer limit, there is no desire to accept new connections immediately after the disconnect from these peers. In fact, it often costs us resources to handle the established connections and defeats some of the logic of dropping them in the first place. This PR adds a timeout that prevents recently disconnected peers from reconnecting to us. Technically we implement a ban at the swarm layer to prevent immediate reconnections for at least 10 minutes. I decided to keep this light, and use a time-based LRUCache which only gets updated during the peer manager heartbeat to prevent added stress of polling a delay map for what could be a large number of peers. This cache is bounded in time. An extra space bound could be added should people consider this a risk. Co-authored-by: Diva M <divma@protonmail.com>	2023-02-14 03:25:42 +00:00
Michael Sproul	d53ccf8fc7	Placeholder for BlobsByRange outbound rate limit	2023-02-14 12:08:14 +11:00
Michael Sproul	18c8cab4da	Merge remote-tracking branch 'origin/unstable' into capella-merge	2023-02-14 12:07:27 +11:00
realbigsean	ad9af6d8b1	complete match for `has_context_bytes`	2023-02-13 16:44:54 -05:00
Emilia Hane	43bf908e7a	Fix release tests	2023-02-10 15:34:59 +01:00
Emilia Hane	09370e70d9	Fix rebase conflicts	2023-02-10 09:41:19 +01:00
Divma	ceb986549d	Self rate limiting dev flag (#3928 ) ## Issue Addressed Adds self rate limiting options, mainly with the idea to comply with peer's rate limits in small testnets ## Proposed Changes Add a hidden flag `self-limiter`; this can take no value, or custom values to configure quotas per protocol ## Additional Info ### How to use `--self-limiter` will turn on the self rate limiter applying the same params we apply to inbound requests (requests from other peers) `--self-limiter "beacon_blocks_by_range:64/1"` will turn on the self rate limiter for ALL protocols, but change the quota for bbrange to 64 requested blocks per 1 second. `--self-limiter "beacon_blocks_by_range:64/1;ping:1/10"` same as previous one, changing the quota for ping as well. ### Caveats - The rate limiter is either on or off for all protocols. I added the custom values to be able to change the quotas per protocol so that some protocols can be given extremely loose or tight quotas. I think this should satisfy every need even if we can't technically turn off rate limits per protocol. - This reuses the rate limiter struct for the inbound requests so there is this ugly part of the code in which we need to deal with the inbound only protocols (light client stuff) if this becomes too ugly as we add lc protocols, we might want to split the rate limiters. I've checked this and looks doable with const generics to avoid so much code duplication ### Knowing if this is on ``` Feb 06 21:12:05.493 DEBG Using self rate limiting params config: OutboundRateLimiterConfig { ping: 2/10s, metadata: 1/15s, status: 5/15s, goodbye: 1/10s, blocks_by_range: 1024/10s, blocks_by_root: 128/10s }, service: libp2p_rpc, service: libp2p ```	2023-02-08 02:18:53 +00:00
Diva M	493784366f	self rate limiting	2023-02-07 13:00:35 -05:00
realbigsean	b7e20fb87a	Update beacon_node/lighthouse_network/src/rpc/protocol.rs	2023-01-27 19:03:43 +01:00
Diva M	9976d3bbbc	send stream terminators	2023-01-27 18:11:26 +01:00
realbigsean	5e8d79891b	merge conflict resolution	2023-01-25 11:10:44 +01:00
realbigsean	5b4cd997d0	Update beacon_node/lighthouse_network/src/rpc/methods.rs	2023-01-24 12:20:40 +01:00
Diva M	2d2da92132	only support 4844 rpc methods if on 4844	2023-01-24 05:15:23 -05:00
realbigsean	b658cc7aaf	simplify checking attester cache for block and blobs. use ResourceUnavailable according to the spec	2023-01-24 10:50:47 +01:00
Emilia Hane	81a754577d	fixup! Improve error handling	2023-01-21 15:47:33 +01:00
realbigsean	06f71e8cce	merge capella	2023-01-12 12:51:09 -05:00
Michael Sproul	2af8110529	Merge remote-tracking branch 'origin/unstable' into capella Fixing the conflicts involved patching up some of the `block_hash` verification, the rest will be done as part of https://github.com/sigp/lighthouse/issues/3870	2023-01-12 16:22:00 +11:00
realbigsean	438126f19a	merge upstream, fix compile errors	2023-01-11 13:52:58 -05:00
Age Manning	1d9a2022b4	Upgrade to libp2p v0.50.0 (#3764 ) I've needed to do this work in order to do some episub testing. This version of libp2p has not yet been released, so this is left as a draft for when we wish to update. Co-authored-by: Diva M <divma@protonmail.com>	2023-01-06 15:59:33 +00:00
Age Manning	4e5e7ee1fc	Restructure code for libp2p upgrade (#3850 ) Our custom RPC implementation is lagging from the libp2p v50 version. We are going to need to change a bunch of function names and would be nice to have consistent ordering of function names inside the handlers. This is a precursor to the libp2p upgrade to minimize merge conflicts in function ordering.	2023-01-05 17:18:24 +00:00
realbigsean	d8f7277beb	cleanup	2022-12-30 11:00:14 -05:00
Mark Mackey	c188cde034	merge upstream/unstable	2022-12-28 14:43:25 -06:00
realbigsean	8a70d80a2f	Revert "Revert "renames, remove , wrap BlockWrapper enum to make deconstruction private"" This reverts commit `1931a442dc`.	2022-12-28 10:31:18 -05:00
realbigsean	1931a442dc	Revert "renames, remove , wrap BlockWrapper enum to make deconstruction private" This reverts commit `5b3b34a9d7`.	2022-12-28 10:30:36 -05:00
realbigsean	5b3b34a9d7	renames, remove , wrap BlockWrapper enum to make deconstruction private	2022-12-28 10:28:45 -05:00
Divma	240854750c	cleanup: remove unused imports, unused fields (#3834 )	2022-12-23 17:16:10 -05:00
realbigsean	5db0a88d4f	fix compilation errors from merge	2022-12-23 10:27:01 -05:00
realbigsean	d504d51dd9	merge with upstream add context bytes to error log	2022-12-22 14:06:28 -05:00
realbigsean	33d01a7911	miscellaneous fixes on syncing, rpc and responding to peer's sync related requests (#3827 ) - there was a bug in responding to range blob requests where we would incorrectly label the first slot of an epoch as a non-skipped slot if it were skipped. this bug did not exist in the code for responding to block range request because the logic error was mitigated by defensive coding elsewhere - there was a bug where a block received during range sync without a corresponding blob (and vice versa) was incorrectly interpreted as a stream termination - RPC size limit fixes. - Our blob cache was deadlocking so I removed use of it for now. - Because of our change in finalized sync batch size from 2 to 1 and our transition to using exact epoch boundaries for batches (rather than one slot past the epoch boundary), we need to sync finalized sync to 2 epochs + 1 slot past our peer's finalized slot in order to finalize the chain locally. - use fork context bytes in rpc methods on both the server and client side	2022-12-21 15:50:51 -05:00
realbigsean	ff772311fa	add context bytes to blob messages, fix rpc limits, sync past finalized checkpoint during finalized sync so we can advance our own finalization, fix stream termination bug in blobs by range	2022-12-21 13:56:52 -05:00
realbigsean	a67fa516c7	don't expect context bytes for blob messages	2022-12-20 19:32:54 -05:00
realbigsean	9c46a1cb21	fix rate limits, and a couple other bugs	2022-12-20 18:56:07 -05:00
realbigsean	5de4f5b8d0	handle parent blob request edge cases correctly. fix data availability boundary check	2022-12-19 11:39:09 -05:00
realbigsean	0349b104bf	add blob rpc protocols to	2022-12-16 14:28:14 -05:00
Divma	ffbf70e2d9	Clippy lints for rust 1.66 (#3810 ) ## Issue Addressed Fixes the new clippy lints for rust 1.66 ## Proposed Changes Most of the changes come from: - [unnecessary_cast](https://rust-lang.github.io/rust-clippy/master/index.html#unnecessary_cast) - [iter_kv_map](https://rust-lang.github.io/rust-clippy/master/index.html#iter_kv_map) - [needless_borrow](https://rust-lang.github.io/rust-clippy/master/index.html#needless_borrow) ## Additional Info na	2022-12-16 04:04:00 +00:00
realbigsean	1644978cdb	fix compilation	2022-12-15 10:26:10 -05:00
realbigsean	d893706e0e	merge with capella	2022-12-15 09:33:18 -05:00
Michael Sproul	991e4094f8	Merge remote-tracking branch 'origin/unstable' into capella-update	2022-12-14 13:00:41 +11:00
ethDreamer	b1c33361ea	Fixed Clippy Complaints & Some Failing Tests (#3791 ) * Fixed Clippy Complaints & Some Failing Tests * Update Dockerfile to Rust-1.65 * EF test file renamed * Touch up comments based on feedback	2022-12-13 10:50:24 -06:00
GeemoCandama	1b28ef8a8d	Adding light_client gossip topics (#3693 ) ## Issue Addressed Implementing the light_client_gossip topics but I'm not there yet. Which issue # does this PR address? Partially #3651 ## Proposed Changes Add light client gossip topics. Please list or describe the changes introduced by this PR. I'm going to Implement light_client_finality_update and light_client_optimistic_update gossip topics. Currently I've attempted the former and I'm seeking feedback. ## Additional Info I've only implemented the light_client_finality_update topic because I wanted to make sure I was on the correct path. Also checking that the gossiped LightClientFinalityUpdate is the same as the locally constructed one is not implemented because caching the updates will make this much easier. Could someone give me some feedback on this please? Please provide any additional information. For example, future considerations or information useful for reviewers. Co-authored-by: GeemoCandama <104614073+GeemoCandama@users.noreply.github.com>	2022-12-13 06:24:51 +00:00
realbigsean	a0d4aecf30	requests block + blob always post eip4844	2022-12-07 15:30:08 -05:00
realbigsean	6c8b1b323b	merge upstream	2022-12-07 12:27:21 -05:00
ethDreamer	1a39976715	Fixed Compiler Warnings & Failing Tests (#3771 )	2022-12-03 10:42:12 +11:00
realbigsean	8102a01085	merge with upstream	2022-12-01 11:13:07 -05:00
Mark Mackey	8a04c3428e	Merged with `unstable`	2022-11-30 17:29:10 -06:00
Diva M	979a95d62f	handle unknown parents for block-blob pairs wip handle unknown parents for block-blob pairs	2022-11-30 17:21:54 -05:00
GeemoCandama	3534c85e30	Optimize finalized chain sync by skipping newPayload messages (#3738 ) ## Issue Addressed #3704 ## Proposed Changes Adds is_syncing_finalized: bool parameter for block verification functions. Sets the payload_verification_status to Optimistic if is_syncing_finalized is true. Uses SyncState in NetworkGlobals in BeaconProcessor to retrieve the syncing status. ## Additional Info I could implement FinalizedSignatureVerifiedBlock if you think it would be nicer.	2022-11-29 08:19:27 +00:00
Age Manning	2779017076	Gossipsub fast message id change (#3755 ) For improved consistency, this mixes in the topic into our fast message id for more consistent tracking of messages across topics.	2022-11-28 07:36:52 +00:00
realbigsean	3c9e1abcb7	merge upstream	2022-11-26 10:01:57 -05:00
sean	07a79c8266	add block and blobs sidecar to whitelist	2022-11-25 14:44:57 +00:00
sean	f88caa7afc	set quota for blobs by root	2022-11-25 14:30:51 +00:00
antondlr	e9bf7f7cc1	remove commas from comma-separated kv pairs (#3737 ) ## Issue Addressed Logs are in comma separated kv list, but the values sometimes contain commas, which breaks parsing	2022-11-25 07:57:10 +00:00
Giulio rebuffo	d5a2de759b	Added LightClientBootstrap V1 (#3711 ) ## Issue Addressed Partially addresses #3651 ## Proposed Changes Adds server-side support for light_client_bootstrap_v1 topic ## Additional Info This PR, creates each time a bootstrap without using cache, I do not know how necessary a cache is in this case as this topic is not supposed to be called frequently and IMHO we can just prevent abuse by using the limiter, but let me know what you think or if there is any caveat to this, or if it is necessary only for the sake of good practice. Co-authored-by: Pawan Dhananjay <pawandhananjay@gmail.com>	2022-11-25 05:19:00 +00:00
Michael Sproul	788b337951	Op pool and gossip for BLS to execution changes (#3726 )	2022-11-25 07:09:26 +11:00
realbigsean	48b2efce9f	merge with upstream	2022-11-22 18:38:30 -05:00
Michael Sproul	61b4bbf870	Fix BlocksByRoot response types (#3743 )	2022-11-22 12:29:47 -05:00
Diva M	7ed2d35424	get it to compile	2022-11-21 14:53:33 -05:00
realbigsean	e7ee79185b	add blobs cache and fix some block production	2022-11-21 14:09:06 -05:00
realbigsean	dc87156641	block and blob handling progress	2022-11-19 16:53:34 -05:00
realbigsean	45897ad4e1	remove blob wrapper	2022-11-19 15:18:42 -05:00
realbigsean	7162e5e23b	add a bunch of blob coupling boiler plate, add a blobs by root request	2022-11-15 16:43:56 -05:00
Pawan Dhananjay	857ef25d28	Add metrics for subnet queries (#3721 ) ## Issue Addressed N/A ## Proposed Changes Add metrics for peers discovered in subnet discv5 queries.	2022-11-15 13:25:38 +00:00
Michael Sproul	713b6a18d4	Simplify GossipTopic -> String conversion (#3722 ) ## Proposed Changes With a few different changes to the gossip topics in flight (light clients, Capella, 4844, etc) I think this simplification makes sense. I noticed it while plumbing through a new Capella topic.	2022-11-15 05:21:48 +00:00
Age Manning	230168deff	Health Endpoints for UI (#3668 ) This PR adds some health endpoints for the beacon node and the validator client. Specifically it adds the endpoint: `/lighthouse/ui/health` These are not entirely stable yet. But provide a base for modification for our UI. These also may have issues with various platforms and may need modification.	2022-11-15 05:21:26 +00:00
Michael Sproul	0cdd049da9	Fixes to make EF Capella tests pass (#3719 ) * Fixes to make EF Capella tests pass * Clippy for state_processing	2022-11-14 13:14:31 -06:00
realbigsean	fe04d945cc	make signed block + sidecar consensus spec	2022-11-10 14:22:30 -05:00
realbigsean	bc0af72c74	fix topic name	2022-11-07 12:36:31 -05:00
realbigsean	1aec17b09c	Merge branch 'unstable' of https://github.com/sigp/lighthouse into eip4844	2022-11-04 13:23:55 -04:00
Divma	8600645f65	Fix rust 1.65 lints (#3682 ) ## Issue Addressed New lints for rust 1.65 ## Proposed Changes Notable change is the identification of parameters that are only used in recursion ## Additional Info na	2022-11-04 07:43:43 +00:00
realbigsean	8656d23327	merge with unstable	2022-11-01 13:18:00 -04:00
Pawan Dhananjay	29f2ec46d3	Couple blocks and blobs in gossip (#3670 ) * Revert "Add more gossip verification conditions" This reverts commit `1430b561c3`. * Revert "Add todos" This reverts commit `91efb9d4c7`. * Revert "Reprocess blob sidecar messages" This reverts commit `21bf3d37cd`. * Add the coupled topic * Decode SignedBeaconBlockAndBlobsSidecar correctly * Process Block and Blobs in beacon processor * Remove extra blob publishing logic from vc * Remove blob signing in vc * Ugly hack to compile	2022-11-01 10:28:21 -04:00
Divma	46fbf5b98b	Update discv5 (#3171 ) ## Issue Addressed Updates discv5 Pending on - [x] #3547 - [x] Alex upgrades his deps ## Proposed Changes updates discv5 and the enr crate. The only relevant change would be some clear indications of ipv4 usage in lighthouse ## Additional Info Functionally, this should be equivalent to the prev version. As draft pending a discv5 release	2022-10-28 05:40:06 +00:00
realbigsean	137f230344	Capella eip 4844 cleanup (#3652 ) * add capella gossip boiler plate * get everything compiling Co-authored-by: realbigsean <sean@sigmaprime.io> Co-authored-by: Mark Mackey <mark@sigmaprime.io> * small cleanup * small cleanup * cargo fix + some test cleanup * improve block production * add fixme for potential panic Co-authored-by: Mark Mackey <mark@sigmaprime.io>	2022-10-26 15:15:26 -04:00
Divma	3a5888e53d	Ban and unban peers at the swarm level (#3653 ) ## Issue Addressed I missed this from https://github.com/sigp/lighthouse/pull/3491. peers were being banned at the behaviour level only. The identify errors are explained by this as well ## Proposed Changes Add banning and unbanning ## Additional Info Before, having tests that catch this was hard because the swarm was outside the behaviour. We could now have tests that prevent something like this in the future	2022-10-24 21:39:30 +00:00
Pawan Dhananjay	c55b28bf10	Minor fixes	2022-10-04 19:18:06 -05:00
realbigsean	7527c2b455	fix RPC limit add blob signing domain	2022-10-04 14:57:29 -04:00
realbigsean	ba16a037a3	cleanup	2022-10-04 09:34:05 -04:00
realbigsean	c0dc42ea07	cargo fmt	2022-10-04 08:21:46 -04:00
realbigsean	8d45e48775	cargo fix	2022-10-03 21:52:16 -04:00
realbigsean	e81dbbfea4	compile	2022-10-03 21:48:02 -04:00
realbigsean	88006735c4	compile	2022-10-03 10:06:04 -04:00
realbigsean	7520651515	cargo fix and some test fixes	2022-09-29 12:43:35 -04:00
realbigsean	fe6fc55449	fix compilation errors, rename capella -> shanghai, cleanup some rebase issues	2022-09-29 12:43:13 -04:00
realbigsean	3f1e5cee78	Some gossip work	2022-09-29 12:35:53 -04:00
realbigsean	4008da6c60	sync tx blobs	2022-09-29 12:32:55 -04:00
realbigsean	4cdf1b546d	add shanghai fork version and epoch	2022-09-29 12:28:58 -04:00
realbigsean	de44b300c0	add/update types	2022-09-29 12:25:56 -04:00
Age Manning	27bb9ff07d	Handle Lodestar's new agent string (#3620 ) ## Issue Addressed #3561 ## Proposed Changes Recognize Lodestars new agent string and appropriately count these peers as lodestar peers.	2022-09-29 01:50:13 +00:00
Divma	b1d2510d1b	Libp2p v0.48.0 upgrade (#3547 ) ## Issue Addressed Upgrades libp2p to v.0.47.0. This is the compilation of - [x] #3495 - [x] #3497 - [x] #3491 - [x] #3546 - [x] #3553 Co-authored-by: Age Manning <Age@AgeManning.com>	2022-09-29 01:50:11 +00:00
Marius van der Wijden	8b71b978e0	new round of hacks (config etc)	2022-09-17 23:42:49 +02:00
Marius van der Wijden	aeb52ff186	network stuff	2022-09-17 16:10:42 +02:00
Marius van der Wijden	36a0add0cd	network stuff	2022-09-17 15:23:28 +02:00
Daniel Knopik	0518665949	Merge remote-tracking branch 'fork/eip4844' into eip4844	2022-09-17 14:58:33 +02:00
Daniel Knopik	292a16a6eb	gossip boilerplate	2022-09-17 14:58:27 +02:00
Marius van der Wijden	acace8ab31	network: blobs by range message	2022-09-17 14:55:18 +02:00
Daniel Knopik	bcc738cb9d	progress on gossip stuff	2022-09-17 14:31:57 +02:00
Daniel Knopik	ca1e17b386	it compiles!	2022-09-17 12:23:03 +02:00
Divma	473abc14ca	Subscribe to subnets only when needed (#3419 ) ## Issue Addressed We currently subscribe to attestation subnets as soon as the subscription arrives (one epoch in advance), this makes it so that subscriptions for future slots are scheduled instead of done immediately. ## Proposed Changes - Schedule subscriptions to subnets for future slots. - Finish removing hashmap_delay, in favor of [delay_map](https://github.com/AgeManning/delay_map). This was the only remaining service to do this. - Subscriptions for past slots are rejected, before we would subscribe for one slot. - Add a new test for subscriptions that are not consecutive. ## Additional Info This is also an effort in making the code easier to understand	2022-09-05 00:22:48 +00:00
Pawan Dhananjay	f3439116da	Return ResourceUnavailable if we are unable to reconstruct execution payloads (#3365 ) ## Issue Addressed Resolves #3351 ## Proposed Changes Returns a `ResourceUnavailable` rpc error if we are unable to serve full payloads to blocks by root and range requests because the execution layer is not synced. ## Additional Info This PR also changes the penalties such that a `ResourceUnavailable` error is only penalized if it is an outgoing request. If we are syncing and aren't getting full block responses, then we don't have use for the peer. However, this might not be true for the incoming request case. We let the peer decide in this case if we are still useful or if we should be banned. cc @divagant-martian please let me know if i'm missing something here.	2022-07-27 03:20:00 +00:00
Justin Traglia	0f62d900fe	Fix some typos (#3376 ) ## Proposed Changes This PR fixes various minor typos in the project.	2022-07-27 00:51:06 +00:00
Akihito Nakano	98a9626ef5	Bump the MSRV to 1.62 and using `#[derive(Default)]` on enums (#3304 ) ## Issue Addressed N/A ## Proposed Changes Since Rust 1.62, we can use `#[derive(Default)]` on enums. ✨ https://blog.rust-lang.org/2022/06/30/Rust-1.62.0.html#default-enum-variants There are no changes to functionality in this PR, just replaced the `Default` trait implementation with `#[derive(Default)]`.	2022-07-15 07:31:19 +00:00
Paul Hauner	be4e261e74	Use async code when interacting with EL (#3244 ) ## Overview This rather extensive PR achieves two primary goals: 1. Uses the finalized/justified checkpoints of fork choice (FC), rather than that of the head state. 2. Refactors fork choice, block production and block processing to `async` functions. Additionally, it achieves: - Concurrent forkchoice updates to the EL and cache pruning after a new head is selected. - Concurrent "block packing" (attestations, etc) and execution payload retrieval during block production. - Concurrent per-block-processing and execution payload verification during block processing. - The `Arc`-ification of `SignedBeaconBlock` during block processing (it's never mutated, so why not?): - I had to do this to deal with sending blocks into spawned tasks. - Previously we were cloning the beacon block at least 2 times during each block processing, these clones are either removed or turned into cheaper `Arc` clones. - We were also `Box`-ing and un-`Box`-ing beacon blocks as they moved throughout the networking crate. This is not a big deal, but it's nice to avoid shifting things between the stack and heap. - Avoids cloning all the blocks in every chain segment during sync. - It also has the potential to clean up our code where we need to pass an owned block around so we can send it back in the case of an error (I didn't do much of this, my PR is already big enough 😅) - The `BeaconChain::HeadSafetyStatus` struct was removed. It was an old relic from prior merge specs. 
For motivation for this change, see https://github.com/sigp/lighthouse/pull/3244#issuecomment-1160963273 ## Changes to `canonical_head` and `fork_choice` Previously, the `BeaconChain` had two separate fields: ``` canonical_head: RwLock<Snapshot>, fork_choice: RwLock<BeaconForkChoice> ``` Now, we have grouped these values under a single struct: ``` canonical_head: CanonicalHead { cached_head: RwLock<Arc<Snapshot>>, fork_choice: RwLock<BeaconForkChoice> } ``` Apart from ergonomics, the only actual change here is wrapping the canonical head snapshot in an `Arc`. This means that we no longer need to hold the `cached_head` (`canonical_head`, in old terms) lock when we want to pull some values from it. This was done to avoid deadlock risks by preventing functions from acquiring (and holding) the `cached_head` and `fork_choice` locks simultaneously. ## Breaking Changes ### The `state` (root) field in the `finalized_checkpoint` SSE event Consider the scenario where epoch `n` is just finalized, but `start_slot(n)` is skipped. There are two state roots we might use in the `finalized_checkpoint` SSE event: 1. The state root of the finalized block, which is `get_block(finalized_checkpoint.root).state_root`. 2. The state root at slot of `start_slot(n)`, which would be the state from (1), but "skipped forward" through any skip slots. Previously, Lighthouse would choose (2). However, we can see that when [Teku generates that event](`de2b2801c8/data/beaconrestapi/src/main/java/tech/pegasys/teku/beaconrestapi/handlers/v1/events/EventSubscriptionManager.java (L171-L182)`) it uses [`getStateRootFromBlockRoot`](`de2b2801c8/data/provider/src/main/java/tech/pegasys/teku/api/ChainDataProvider.java (L336-L341)`) which uses (1). I have switched Lighthouse from (2) to (1). I think it's a somewhat arbitrary choice between the two, where (1) is easier to compute and is consistent with Teku. ## Notes for Reviewers I've renamed `BeaconChain::fork_choice` to `BeaconChain::recompute_head`. 
Doing this helped ensure I broke all previous uses of fork choice and I also find it more descriptive. It describes an action and can't be confused with trying to get a reference to the `ForkChoice` struct. I've changed the ordering of SSE events when a block is received. It used to be `[block, finalized, head]` and now it's `[block, head, finalized]`. It was easier this way and I don't think we were making any promises about SSE event ordering so it's not "breaking". I've made it so fork choice will run when it's first constructed. I did this because I wanted to have a cached version of the last call to `get_head`. Ensuring `get_head` has been run at least once means that the cached value doesn't need to be wrapped in an `Option`. This was fairly simple, it just involved passing a `slot` to the constructor so it knows when it's being run. When loading a fork choice from the store and a slot clock isn't handy I've just used the `slot` that was saved in the `fork_choice_store`. That seems like it would be a faithful representation of the slot when we saved it. I added the `genesis_time: u64` to the `BeaconChain`. It's small, constant and nice to have around. Since we're using FC for the fin/just checkpoints, we no longer get the `0x00..00` roots at genesis. You can see I had to remove a work-around in `ef-tests` here: b56be3bc2. I can't find any reason why this would be an issue, if anything I think it'll be better since the genesis-alias has caught us out a few times (0x00..00 isn't actually a real root). Edit: I did find a case where the `network` expected the 0x00..00 alias and patched it here: 3f26ac3e2. You'll notice a lot of changes in tests. Generally, tests should be functionally equivalent. Here are the things creating the most diff-noise in tests: - Changing tests to be `tokio::async` tests. - Adding `.await` to fork choice, block processing and block production functions. - Refactor of the `canonical_head` "API" provided by the `BeaconChain`. 
E.g., `chain.canonical_head.cached_head()` instead of `chain.canonical_head.read()`. - Wrapping `SignedBeaconBlock` in an `Arc`. - In the `beacon_chain/tests/block_verification`, we can't use the `lazy_static` `CHAIN_SEGMENT` variable anymore since it's generated with an async function. We just generate it in each test, not so efficient but hopefully insignificant. I had to disable `rayon` concurrent tests in the `fork_choice` tests. This is because the use of `rayon` and `block_on` was causing a panic. Co-authored-by: Mac L <mjladson@pm.me>	2022-07-03 05:36:50 +00:00
Akihito Nakano	082ed35bdc	Test the pruning of excess peers using randomly generated input (#3248 ) ## Issue Addressed https://github.com/sigp/lighthouse/issues/3092 ## Proposed Changes Added property-based tests for the pruning implementation. A randomly generated input for the test contains connection direction, subnets, and scores. ## Additional Info I left some comments on this PR, what I have tried, and [a question](https://github.com/sigp/lighthouse/pull/3248#discussion_r891981969). Co-authored-by: Diva M <divma@protonmail.com>	2022-06-25 22:22:34 +00:00
Divma	7af5742081	Deprecate step param in BlocksByRange RPC request (#3275 ) ## Issue Addressed Deprecates the step parameter in the blocks by range request ## Proposed Changes - Modifies the BlocksByRangeRequest type to remove the step parameter and everywhere we took it into account before - Adds a new type to still handle coding and decoding of requests that use the parameter ## Additional Info I went with a deprecation over the type itself so that requests received outside `lighthouse_network` don't even need to deal with this parameter. After the deprecation period just removing the Old blocks by range request should be straightforward	2022-06-22 16:23:34 +00:00
Divma	3dd50bda11	Improve substream management (#3261 ) ## Issue Addressed Which issue # does this PR address? ## Proposed Changes Please list or describe the changes introduced by this PR. ## Additional Info Please provide any additional information. For example, future considerations or information useful for reviewers.	2022-06-10 06:58:50 +00:00
Akihito Nakano	a6d2ed6119	Fix: PeerManager doesn't remove "outbound only" peers which should be pruned (#3236 ) ## Issue Addressed This is one step to address https://github.com/sigp/lighthouse/issues/3092 before introducing `quickcheck`. I noticed an issue while I was reading the pruning implementation `PeerManager::prune_excess_peers()`. If a peer meets the following conditions, the `outbound_peers_pruned` counter increases but the peer is not pushed to `peers_to_prune`. - [outbound only](`1e4ac8a4b9/beacon_node/lighthouse_network/src/peer_manager/mod.rs (L1018)`) - [min_subnet_count <= MIN_SYNC_COMMITTEE_PEERS](`1e4ac8a4b9/beacon_node/lighthouse_network/src/peer_manager/mod.rs (L1047)`) As a result, PeerManager doesn't remove "outbound" peers which should be pruned. Note: [`subnet_to_peer`](`e0d673ea86/beacon_node/lighthouse_network/src/peer_manager/mod.rs (L999)`) (HashMap) doesn't guarantee a particular order of iteration. So whether the test fails depends on the order of iteration.	2022-06-06 05:51:10 +00:00
Akihito Nakano	695f415590	Tiny improvement: PeerManager and maximum discovery query (#3182 ) ## Issue Addressed As [`Discovery` bounds the maximum discovery query](`e88b18be09/beacon_node/lighthouse_network/src/discovery/mod.rs (L328)`), `PeerManager` does not need to handle it. `e88b18be09/beacon_node/lighthouse_network/src/discovery/mod.rs (L328)`	2022-05-19 06:00:46 +00:00
François Garillot	3f9e83e840	[refactor] Refactor Option/Result combinators (#3180 ) Code simplifications using `Option`/`Result` combinators to make pattern-matches a tad simpler. Opinions on these loosely held, happy to adjust in review. Tool-aided by [comby-rust](https://github.com/huitseeker/comby-rust).	2022-05-16 01:59:47 +00:00
Pawan Dhananjay	db0beb5178	Poll shutdown timeout in rpc handler (#3153 ) ## Issue Addressed N/A ## Proposed Changes Previously, we were using `Sleep::is_elapsed()` to check if the shutdown timeout had triggered without polling the sleep. This PR polls the sleep timer.	2022-04-13 03:54:44 +00:00
Divma	580d2f7873	log upgrades + prevent dialing of disconnecting peers (#3148 ) ## Issue Addressed We still ping peers that are considered in a disconnecting state ## Proposed Changes Do not ping peers once we decide they are disconnecting Upgrade logs about ignored rpc messages ## Additional Info --	2022-04-13 03:54:43 +00:00
Pawan Dhananjay	fff4dd6311	Fix rpc limits version 2 (#3146 ) ## Issue Addressed N/A ## Proposed Changes https://github.com/sigp/lighthouse/pull/3133 changed the rpc type limits to be fork aware i.e. if our current fork based on wall clock slot is Altair, then we apply only altair rpc type limits. This is a bug because phase0 blocks can still be sent over rpc and phase 0 block minimum size is smaller than altair block minimum size. So a phase0 block with `size < SIGNED_BEACON_BLOCK_ALTAIR_MIN` will return an `InvalidData` error as it doesn't pass the rpc types bound check. This error can be seen when we try syncing pre-altair blocks with size smaller than `SIGNED_BEACON_BLOCK_ALTAIR_MIN`. This PR fixes the issue by also accounting for forks earlier than current_fork in the rpc limits calculation in the `rpc_block_limits_by_fork` function. I decided to hardcode the limits in the function because that seemed simpler than calculating previous forks based on current fork and doing a min across forks. Adding a new fork variant is simple and the limits can be easily checked in a review. Adds unit tests and modifies the syncing simulator to check the syncing from across fork boundaries. The syncing simulator's block 1 would always be of phase 0 minimum size (404 bytes) which is smaller than altair min block size (since block 1 contains no attestations).	2022-04-07 23:45:38 +00:00
Pawan Dhananjay	ab434bc075	Fix merge rpc length limits (#3133 ) ## Issue Addressed N/A ## Proposed Changes Fix the upper bound for blocks by root responses to be equal to the max merge block size instead of altair. Further make the rpc response limits fork aware.	2022-04-04 00:26:15 +00:00
Michael Sproul	41e7a07c51	Add `lighthouse db` command (#3129 ) ## Proposed Changes Add a `lighthouse db` command with three initial subcommands: - `lighthouse db version`: print the database schema version. - `lighthouse db migrate --to N`: manually upgrade (or downgrade!) the database to a different version. - `lighthouse db inspect --column C`: log the key and size in bytes of every value in a given `DBColumn`. This PR lays the groundwork for other changes, namely: - Mark's fast-deposit sync (https://github.com/sigp/lighthouse/pull/2915), for which I think we should implement a database downgrade (from v9 to v8). - My `tree-states` work, which already implements a downgrade (v10 to v8). - Standalone purge commands like `lighthouse db purge-dht` per https://github.com/sigp/lighthouse/issues/2824. ## Additional Info I updated the `strum` crate to 0.24.0, which necessitated some changes in the network code to remove calls to deprecated methods. Thanks to @winksaville for the motivation, and implementation work that I used as a source of inspiration (https://github.com/sigp/lighthouse/pull/2685).	2022-04-01 00:58:59 +00:00
Divma	4bf1af4e85	Custom RPC request management for sync (#3029 ) ## Proposed Changes Make `lighthouse_network` generic over request ids, now usable by sync	2022-03-02 22:07:17 +00:00
Age Manning	e88b18be09	Update libp2p (#3039 ) Update libp2p. This corrects some gossipsub metrics.	2022-03-02 05:09:52 +00:00
Age Manning	f3c1dde898	Filter non global ips from discovery (#3023 ) ## Issue Addressed #3006 ## Proposed Changes This PR changes the default behaviour of lighthouse to ignore discovered IPs that are not globally routable. It adds a CLI flag, --enable-local-discovery to permit the non-global IPs in discovery. NOTE: We should take care in merging this as it will break current set-ups that rely on local IP discovery. I made this the non-default behaviour because we don't really want to be wasting resources attempting to connect to non-routable addresses and we don't want to propagate these to others (on the chance we can connect to one of these local nodes), improving discovery's efficiency.	2022-03-02 03:14:27 +00:00
Age Manning	a1b730c043	Cleanup small issues (#3027 ) Downgrades some excessive networking logs and corrects some metrics.	2022-03-01 01:49:22 +00:00
Michael Sproul	5e1f8a8480	Update to Rust 1.59 and 2021 edition (#3038 ) ## Proposed Changes Lots of lint updates related to `flat_map`, `unwrap_or_else` and string patterns. I did a little more creative refactoring in the op pool, but otherwise followed Clippy's suggestions. ## Additional Info We need this PR to unblock CI.	2022-02-25 00:10:17 +00:00
Age Manning	3ebb8b0244	Improved peer management (#2993 ) ## Issue Addressed I noticed in some logs some excess and unnecessary discovery queries. What was happening was we were pruning our peers down to our outbound target and having some disconnect. When we are below this threshold we try to find more peers (even if we are at our peer limit). The request becomes futile because we have no more peer slots. This PR corrects this issue and advances the pruning mechanism to favour subnet peers. An overview of the new logic added is: - We prune peers down to a target outbound peer count which is higher than the minimum outbound peer count. - We only search for more peers if there is room to do so, and we are below the minimum outbound peer count not the target. So this gives us some buffer for peers to disconnect. The buffer is currently 10%. The modified pruning logic is documented in the code but for reference it should do the following: - Prune peers with bad scores first - If we need to prune more peers, then prune peers that are subscribed to a long-lived subnet - If we still need to prune peers, then prune peers that we have a higher density of on any given subnet which should drive for uniform peers across all subnets. This will need a bit of testing as it modifies some significant peer management behaviours in lighthouse.	2022-02-18 02:36:43 +00:00
Paul Hauner	0a6a8ea3b0	Engine API v1.0.0.alpha.6 + interop tests (#3024 ) ## Issue Addressed NA ## Proposed Changes This PR extends #3018 to address my review comments there and add automated integration tests with Geth (and other implementations, in the future). I've also de-duplicated the "unused port" logic by creating an `common/unused_port` crate. ## Additional Info I'm not sure if we want to merge this PR, or update #3018 and merge that. I don't mind, I'm primarily opening this PR to make sure CI works. Co-authored-by: Mark Mackey <mark@sigmaprime.io>	2022-02-17 21:47:06 +00:00
Divma	1306b2db96	libp2p upgrade + gossipsub interval fix (#3012 ) ## Issue Addressed Lighthouse gossiping late messages ## Proposed Changes Point LH to our fork using tokio interval, which 1) works as expected 2) is more performant than the previous version that actually worked as expected Upgrade libp2p ## Additional Info https://github.com/libp2p/rust-libp2p/issues/2497	2022-02-10 04:12:03 +00:00
Divma	36fc887a40	Gossip cache timeout adjustments (#2997 ) ## Proposed Changes - Do not retry to publish sync committee messages. - Give a more lenient timeout to slashings and exits	2022-02-07 23:25:06 +00:00
Age Manning	675c7b7e26	Correct a dial race condition (#2992 ) ## Issue Addressed On a network with few nodes, it is possible that the same node can be found from a subnet discovery and a normal peer discovery at the same time. The network behaviour loads these peers into events and processes them when it has the chance. It can happen that the same peer can enter the event queue more than once and then attempt to be dialed twice. This PR shifts the registration of nodes in the peerdb as being dialed before they enter the NetworkBehaviour queue, preventing multiple attempts of the same peer being entered into the queue and avoiding the race condition.	2022-02-07 23:25:05 +00:00
Divma	48b7c8685b	upgrade libp2p (#2933 ) ## Issue Addressed Upgrades libp2p to v.0.42.0 pre release (https://github.com/libp2p/rust-libp2p/pull/2440)	2022-02-07 23:25:03 +00:00
Divma	615695776e	Retry gossipsub messages when insufficient peers (#2964 ) ## Issue Addressed #2947 ## Proposed Changes Store messages that fail to be published due to insufficient peers for retry later. Messages expire after half an epoch and are retried if gossipsub informs us that a useful peer has connected. Currently running in Atlanta ## Additional Info If on retry sending the messages fails they will not be tried again	2022-02-03 01:12:30 +00:00
Mac L	286996b090	Fix small typo in error log (#2975 ) ## Proposed Changes Fixes a small typo I came across.	2022-01-31 22:55:07 +00:00
Age Manning	bdd70d7aef	Reduce gossip history (#2969 ) The gossipsub history was increased to a good portion of a slot from 2.1 seconds in the last release. Although it shouldn't cause too much issue, it could be related to receiving later messages than usual and interacting with our scoring system penalizing peers. For consistency, this PR reduces the time we gossip messages back to the same values of the previous release. It also adjusts the gossipsub heartbeat time for testing purposes with a developer flag but this should not affect end users.	2022-01-31 07:29:41 +00:00
Age Manning	ca29b580a2	Increase target subnet peers (#2948 ) In the latest release we decreased the target number of subnet peers. It appears this could be causing issues in some cases and so reverting it back to the previous number is wise. A larger PR that follows this will address some other related discovery issues and peer management around subnet peer discovery.	2022-01-24 12:08:00 +00:00
Age Manning	fc7a1a7dc7	Allow disconnected states to introduce new peers without warning (#2922 ) ## Issue Addressed We emit a warning to verify that all peer connection state information is consistent. A warning is given under one edge case; We try to dial a peer with peer-id X and multiaddr Y. The peer responds to multiaddr Y with a different peer-id, Z. The dialing to the peer fails, but libp2p injects the failed attempt as peer-id Z. In this instance, our PeerDB tries to add a new peer in the disconnected state under a previously unknown peer-id. This is harmless and so this PR permits this behaviour without logging a warning.	2022-01-20 09:14:25 +00:00
Age Manning	1c667ad3ca	PeerDB Status unknown bug fix (#2907 ) ## Issue Addressed The PeerDB was getting out of sync with the number of disconnected peers compared to the actual count. As this value determines how many we store in our cache, over time the cache was depleting and we were removing peers immediately resulting in errors that manifest as unknown peers for some operations. The error occurs when dialing a peer fails, we were not correctly updating the peerdb counter because the increment to the counter was placed in the wrong order and was therefore not incrementing the count. This PR corrects this.	2022-01-14 05:42:48 +00:00
Age Manning	6f4102aab6	Network performance tuning (#2608 ) There is a pretty significant tradeoff between bandwidth and speed of gossipsub messages. We can reduce our bandwidth usage considerably at the cost of minimally delaying gossipsub messages. The impact of delaying messages has not been analyzed thoroughly yet, however this PR in conjunction with some gossipsub updates show considerable bandwidth reduction. This PR allows the user to set a CLI value (`network-load`) which is an integer in the range of 1 to 5 depending on their bandwidth appetite. 1 represents the least bandwidth but slowest message receiving and 5 represents the most bandwidth and fastest received message time. For low-bandwidth users it is likely to be more efficient to use a lower value. The default is set to 3, which currently represents a reduced bandwidth usage compared to previous version of this PR. The previous lighthouse versions are equivalent to setting the `network-load` CLI to 4. This PR is awaiting a few gossipsub updates before we can get it into lighthouse.	2022-01-14 05:42:47 +00:00
Paul Hauner	aaa5344eab	Add peer score adjustment msgs (#2901 ) ## Issue Addressed N/A ## Proposed Changes This PR adds the `msg` field to `Peer score adjusted` log messages. These `msg` fields help identify why a peer was banned. Example: ``` Jan 11 04:18:48.096 DEBG Peer score adjusted score: -100.00, peer_id: 16Uiu2HAmQskxKWWGYfginwZ51n5uDbhvjHYnvASK7PZ5gBdLmzWj, msg: attn_unknown_head, service: libp2p Jan 11 04:18:48.096 DEBG Peer score adjusted score: -27.86, peer_id: 16Uiu2HAmA7cCb3MemVDbK3MHZoSb7VN3cFUG3vuSZgnGesuVhPDE, msg: sync_past_slot, service: libp2p Jan 11 04:18:48.096 DEBG Peer score adjusted score: -100.00, peer_id: 16Uiu2HAmQskxKWWGYfginwZ51n5uDbhvjHYnvASK7PZ5gBdLmzWj, msg: attn_unknown_head, service: libp2p Jan 11 04:18:48.096 DEBG Peer score adjusted score: -28.86, peer_id: 16Uiu2HAmA7cCb3MemVDbK3MHZoSb7VN3cFUG3vuSZgnGesuVhPDE, msg: sync_past_slot, service: libp2p Jan 11 04:18:48.096 DEBG Peer score adjusted score: -29.86, peer_id: 16Uiu2HAmA7cCb3MemVDbK3MHZoSb7VN3cFUG3vuSZgnGesuVhPDE, msg: sync_past_slot, service: libp2p ``` There is also a `libp2p_report_peer_msgs_total` metrics which allows us to see count of reports per `msg` tag. ## Additional Info NA	2022-01-12 05:32:14 +00:00
Age Manning	81c667b58e	Additional networking metrics (#2549 ) Adds additional metrics for network monitoring and evaluation. Co-authored-by: Mark Mackey <mark@sigmaprime.io>	2021-12-22 06:17:14 +00:00
Divma	56d596ee42	Unban peers at the swarm level when purged (#2855 ) ## Issue Addressed #2840	2021-12-20 23:45:21 +00:00
Divma	eee0260a68	do not count dialing peers in the connection limit (#2856 ) ## Issue Addressed #2841 ## Proposed Changes Not counting dialing peers while deciding if we have reached the target peers in case of outbound peers. ## Additional Info Checked this running in nodes and bandwidth looks normal, peer count looks normal too	2021-12-15 05:48:45 +00:00
Lion - dapplion	2984f4b474	Remove wrong duplicated comment (#2751 ) ## Issue Addressed Remove wrong duplicated comment. Comment was copied from ban_peer() but doesn't apply to unban_peer()	2021-12-06 05:34:15 +00:00
Pawan Dhananjay	f3c237cfa0	Restrict network limits based on merge fork epoch (#2839 )	2021-12-02 14:32:31 +11:00
Paul Hauner	ab86b42874	Kintsugi Diva comments (#2836 ) * Remove TODOs * Fix typo	2021-12-02 14:29:59 +11:00
pawan	44a7b37ce3	Increase network limits (#2796 ) Fix max packet sizes Fix max_payload_size function Add merge block test Fix max size calculation; fix up test Clear comments Add a payload_size_function Use safe arith for payload calculation Return an error if block too big in block production Separate test to check if block is over limit	2021-12-02 14:29:20 +11:00
Mark Mackey	5687c56d51	Initial merge changes Added Execution Payload from Rayonism Fork Updated new Containers to match Merge Spec Updated BeaconBlockBody for Merge Spec Completed updating BeaconState and BeaconBlockBody Modified ExecutionPayload<T> to use Transaction<T> Mostly Finished Changes for beacon-chain.md Added some things for fork-choice.md Update to match new fork-choice.md/fork.md changes ran cargo fmt Added Missing Pieces in eth2_libp2p for Merge fix ef test Various Changes to Conform Closer to Merge Spec	2021-12-02 14:26:50 +11:00
Age Manning	6625aa4afe	Status'd Peer Not Found (#2761 ) ## Issue Addressed Users are experiencing `Status'd peer not found` errors ## Proposed Changes Although I cannot reproduce this error, there is only one connection state change that is not addressed in the peer manager (that I could see). The error occurs because the number of disconnected peers in the peerdb becomes out of sync with the actual number of disconnected peers. From what I can tell almost all possible connection state changes are handled, except for the case when a disconnected peer changes to be disconnecting. This can potentially happen at the peer connection limit, where a previously connected peer switches to disconnecting. This PR decrements the disconnected counter when this event occurs and from what I can tell, covers all possible disconnection state changes in the peer manager.	2021-11-28 22:46:17 +00:00
Pawan Dhananjay	9eedb6b888	Allow additional subnet peers (#2823 ) ## Issue Addressed N/A ## Proposed Changes 1. Don't disconnect peer from dht on connection limit errors 2. Bump up `PRIORITY_PEER_EXCESS` to allow for dialing up to 60 peers by default. Co-authored-by: Diva M <divma@protonmail.com>	2021-11-25 21:27:08 +00:00
Michael Sproul	2c07a72980	Revert peer DB changes from #2724 (#2828 ) ## Proposed Changes This reverts commit `53562010ec` from PR #2724 Hopefully this will restore the reliability of the sync simulator.	2021-11-25 03:45:52 +00:00
Age Manning	0b319d4926	Inform dialing via the behaviour (#2814 ) I had this change but it seems to have been lost in chaos of network upgrades. The swarm dialing event seems to miss some cases where we dial via the behaviour. This causes an error to be logged as the peer manager doesn't know about some dialing events. This shifts the logic to the behaviour to inform the peer manager.	2021-11-19 04:42:33 +00:00

260 Commits