## Issue Addressed
Solves #3390
So after checking some logs @pawanjay176 got, we conclude that this happened because we blacklisted a chain after trying it "too much". Now here, in all occurrences it seems that "too much" means we got too many download failures. This happened very slowly, exactly because the batch is allowed to stay alive for very long times after not counting penalties when the ee is offline. The error here then was not that the batch failed because of offline ee errors, but that we blacklisted a chain because of download errors, which we can't pin on the chain but on the peer. This PR fixes that.
## Proposed Changes
Adds a missing piece of logic so that if a chain fails for errors that can't be attributed to an objectively bad behavior from the peer, it is not blacklisted. The issue at hand occurred when new peers arrived claiming a head that had wrongfully blacklisted, even if the original peers participating in the chain were not penalized.
Another notable change is that we need to consider a batch invalid if it processed correctly but its next non empty batch fails processing. Now since a batch can fail processing in non empty ways, there is no need to mark as invalid previous batches.
Improves some logging as well.
## Additional Info
We should do this regardless of pausing sync on ee offline/unsynced state. This is because I think it's almost impossible to ensure a processing result will reach in a predictable order with a synced notification from the ee. Doing this handles what I think are inevitable data races when we actually pause sync
This also fixes a return that reports which batch failed and caused us some confusion checking the logs
## Issue Addressed
Removes the await points in sync waiting for a processor response for rpc block processing. Built on top of #3029
This also handles a couple of bugs in the previous code and adds a relatively comprehensive test suite.
## Issue Addressed
Closes#1891Closes#1784
## Proposed Changes
Implement checkpoint sync for Lighthouse, enabling it to start from a weak subjectivity checkpoint.
## Additional Info
- [x] Return unavailable status for out-of-range blocks requested by peers (#2561)
- [x] Implement sync daemon for fetching historical blocks (#2561)
- [x] Verify chain hashes (either in `historical_blocks.rs` or the calling module)
- [x] Consistency check for initial block + state
- [x] Fetch the initial state and block from a beacon node HTTP endpoint
- [x] Don't crash fetching beacon states by slot from the API
- [x] Background service for state reconstruction, triggered by CLI flag or API call.
Considered out of scope for this PR:
- Drop the requirement to provide the `--checkpoint-block` (this would require some pretty heavy refactoring of block verification)
Co-authored-by: Diva M <divma@protonmail.com>
## Issue Addressed
#1856
## Proposed Changes
- For clarity, the router's processor now only decides if a peer is compatible and it disconnects it or sends it to sync accordingly. No logic here regarding how useful is the peer.
- Update peer_sync_info's rules
- Add an `IrrelevantPeer` sync status to account for incompatible peers (maybe this should be "IncompatiblePeer" now that I think about it?) this state is update upon receiving an internal goodbye in the peer manager
- Misc code cleanups
- Reduce the need to create `StatusMessage`s (and thus, `Arc` accesses )
- Add missing calls to update the global sync state
The overall effect should be:
- More peers recognized as Behind, and less as Unknown
- Peers identified as incompatible
The changes are somewhat simple but should solve two issues:
- When quickly changing between chains once and a second time back again, batchIds would collide and cause havoc.
- If we got an out of range response from a peer, sync would remain in syncing but without advancing
Changes:
- remove the batch id. Identify each batch (inside a chain) by its starting epoch. Target epochs for downloading and processing now advance by EPOCHS_PER_BATCH
- for the same reason, move the "to_be_downloaded_id" to be an epoch
- remove a sneaky line that dropped an out of range batch without downloading it
- bonus: put the chain_id in the log given to the chain. This is why explicitly logging the chain_id is removed
## Issue Addressed
NA
## Proposed Changes
Moves beacon block processing over to the newly-added `GossipProcessor`. This moves the task off the core executor onto the blocking one.
## Additional Info
- With this PR, gossip blocks are being ignored during sync.
* wip: mwake the request id optional
* make the request_id optional
* cleanup
* address clippy lints inside rpc
* WIP: Separate sent RPC events from received ones
* WIP: Separate sent RPC events from received ones
* cleanup
* Separate request ids from substream ids
* Make RPC's message handling independent of RequestIds
* Change behaviour RPC events to be more outside-crate friendly
* Propage changes across the network + router + processor
* Propage changes across the network + router + processor
* fmt
* "tiny" refactor
* more tiny refactors
* fmt eth2-libp2p
* wip: propagating changes
* wip: propagating changes
* cleaning up
* more cleanup
* fmt
* tests HOT fix
Co-authored-by: Age Manning <Age@AgeManning.com>
* Prevent duplicate parent block lookups
* Updates logic for handling re-status'd peers
* Allow block lookup if the block is close to head
* Correct ordering of sync logs
* Remove comments in block processer, clean up sim
* Upgrade the parent lookup logic
* Apply reviewer suggestions
* move the parent lookup process to a dedicated thread
* move the logic of parent lookup and range syncing to a block processor
* review suggestions
* more review suggestions
* Add small logging changes
* Process parent lookups in reverse
Co-authored-by: Age Manning <Age@AgeManning.com>
* Remove ping protocol
* Initial renaming of network services
* Correct rebasing relative to latest master
* Start updating types
* Adds HashMapDelay struct to utils
* Initial network restructure
* Network restructure. Adds new types for v0.2.0
* Removes build artefacts
* Shift validation to beacon chain
* Temporarily remove gossip validation
This is to be updated to match current optimisation efforts.
* Adds AggregateAndProof
* Begin rebuilding pubsub encoding/decoding
* Signature hacking
* Shift gossipsup decoding into eth2_libp2p
* Existing EF tests passing with fake_crypto
* Shifts block encoding/decoding into RPC
* Delete outdated API spec
* All release tests passing bar genesis state parsing
* Update and test YamlConfig
* Update to spec v0.10 compatible BLS
* Updates to BLS EF tests
* Add EF test for AggregateVerify
And delete unused hash2curve tests for uncompressed points
* Update EF tests to v0.10.1
* Use optional block root correctly in block proc
* Use genesis fork in deposit domain. All tests pass
* Fast aggregate verify test
* Update REST API docs
* Fix unused import
* Bump spec tags to v0.10.1
* Add `seconds_per_eth1_block` to chainspec
* Update to timestamp based eth1 voting scheme
* Return None from `get_votes_to_consider` if block cache is empty
* Handle overflows in `is_candidate_block`
* Revert to failing tests
* Fix eth1 data sets test
* Choose default vote according to spec
* Fix collect_valid_votes tests
* Fix `get_votes_to_consider` to choose all eligible blocks
* Uncomment winning_vote tests
* Add comments; remove unused code
* Reduce seconds_per_eth1_block for simulation
* Addressed review comments
* Add test for default vote case
* Fix logs
* Remove unused functions
* Meter default eth1 votes
* Fix comments
* Progress on attestation service
* Address review comments; remove unused dependency
* Initial work on removing libp2p lock
* Add LRU caches to store (rollup)
* Update attestation validation for DB changes (WIP)
* Initial version of should_forward_block
* Scaffold
* Progress on attestation validation
Also, consolidate prod+testing slot clocks so that they share much
of the same implementation and can both handle sub-slot time changes.
* Removes lock from libp2p service
* Completed network lock removal
* Finish(?) attestation processing
* Correct network termination future
* Add slot check to block check
* Correct fmt issues
* Remove Drop implementation for network service
* Add first attempt at attestation proc. re-write
* Add version 2 of attestation processing
* Minor fixes
* Add validator pubkey cache
* Make get_indexed_attestation take a committee
* Link signature processing into new attn verification
* First working version
* Ensure pubkey cache is updated
* Add more metrics, slight optimizations
* Clone committee cache during attestation processing
* Update shuffling cache during block processing
* Remove old commented-out code
* Fix shuffling cache insert bug
* Used indexed attestation in fork choice
* Restructure attn processing, add metrics
* Add more detailed metrics
* Tidy, fix failing tests
* Fix failing tests, tidy
* Address reviewers suggestions
* Disable/delete two outdated tests
* Modification of validator for subscriptions
* Add slot signing to validator client
* Further progress on validation subscription
* Adds necessary validator subscription functionality
* Add new Pubkeys struct to signature_sets
* Refactor with functional approach
* Update beacon chain
* Clean up validator <-> beacon node http types
* Add aggregator status to ValidatorDuty
* Impl Clone for manual slot clock
* Fix minor errors
* Further progress validator client subscription
* Initial subscription and aggregation handling
* Remove decompressed member from pubkey bytes
* Progress to modifying val client for attestation aggregation
* First draft of validator client upgrade for aggregate attestations
* Add hashmap for indices lookup
* Add state cache, remove store cache
* Only build the head committee cache
* Removes lock on a network channel
* Partially implement beacon node subscription http api
* Correct compilation issues
* Change `get_attesting_indices` to use Vec
* Fix failing test
* Partial implementation of timer
* Adds timer, removes exit_future, http api to op pool
* Partial multiple aggregate attestation handling
* Permits bulk messages accross gossipsub network channel
* Correct compile issues
* Improve gosispsub messaging and correct rest api helpers
* Added global gossipsub subscriptions
* Update validator subscriptions data structs
* Tidy
* Re-structure validator subscriptions
* Initial handling of subscriptions
* Re-structure network service
* Add pubkey cache persistence file
* Add more comments
* Integrate persistence file into builder
* Add pubkey cache tests
* Add HashSetDelay and introduce into attestation service
* Handles validator subscriptions
* Add data_dir to beacon chain builder
* Remove Option in pubkey cache persistence file
* Ensure consistency between datadir/data_dir
* Fix failing network test
* Peer subnet discovery gets queued for future subscriptions
* Reorganise attestation service functions
* Initial wiring of attestation service
* First draft of attestation service timing logic
* Correct minor typos
* Tidy
* Fix todos
* Improve tests
* Add PeerInfo to connected peers mapping
* Fix compile error
* Fix compile error from merge
* Split up block processing metrics
* Tidy
* Refactor get_pubkey_from_state
* Remove commented-out code
* Rename state_cache -> checkpoint_cache
* Rename Checkpoint -> Snapshot
* Tidy, add comments
* Tidy up find_head function
* Change some checkpoint -> snapshot
* Add tests
* Expose max_len
* Remove dead code
* Tidy
* Fix bug
* Add sync-speed metric
* Add first attempt at VerifiableBlock
* Start integrating into beacon chain
* Integrate VerifiableBlock
* Rename VerifableBlock -> PartialBlockVerification
* Add start of typed methods
* Add progress
* Add further progress
* Rename structs
* Add full block verification to block_processing.rs
* Further beacon chain integration
* Update checks for gossip
* Add todo
* Start adding segement verification
* Add passing chain segement test
* Initial integration with batch sync
* Minor changes
* Tidy, add more error checking
* Start adding chain_segment tests
* Finish invalid signature tests
* Include single and gossip verified blocks in tests
* Add gossip verification tests
* Start adding docs
* Finish adding comments to block_processing.rs
* Rename block_processing.rs -> block_verification
* Start removing old block processing code
* Fixes beacon_chain compilation
* Fix project-wide compile errors
* Remove old code
* Correct code to pass all tests
* Fix bug with beacon proposer index
* Fix shim for BlockProcessingError
* Only process one epoch at a time
* Fix loop in chain segment processing
* Correct tests from master merge
* Add caching for state.eth1_data_votes
* Add BeaconChain::validator_pubkey
* Revert "Add caching for state.eth1_data_votes"
This reverts commit cd73dcd6434fb8d8e6bf30c5356355598ea7b78e.
Co-authored-by: Grant Wuerker <gwuerker@gmail.com>
Co-authored-by: Michael Sproul <michael@sigmaprime.io>
Co-authored-by: Michael Sproul <micsproul@gmail.com>
Co-authored-by: pawan <pawandhananjay@gmail.com>
Co-authored-by: Paul Hauner <paul@paulhauner.com>
* Create libp2p instance
* Change logger to stdlog
* test_connection initial commit
* Add gossipsub test
* Delete tests in network crate
* Add test module
* Clean tests
* Remove dependency on discovery
* Working publish between 2 nodes
TODO: Publish should be called just once
* Working 2 peer gossipsub test with additional events
* Cleanup test
* Add rpc test
* Star topology discovery WIP
* build_nodes builds and connects n nodes. Increase nodes in gossipsub test
* Add unsubscribe method and expose reference to gossipsub object for gossipsub tests
* Add gossipsub message forwarding test
* Fix gossipsub forward test
* Test improvements
* Remove discovery tests
* Simplify gossipsub forward test topology
* Add helper functions for topology building
* Clean up tests
* Update naming to new network spec
* Correct ssz encoding of protocol names
* Further additions to network upgrade
* Initial network spec update WIP
* Temp commit
* Builds one side of the streamed RPC responses
* Temporary commit
* Propagates streaming changes up into message handler
* Intermediate network update
* Partial update in upgrading to the new network spec
* Update dependencies, remove redundant deps
* Correct sync manager for block stream handling
* Re-write of RPC handler, improves efficiency and corrects bugs
* Stream termination update
* Completed refactor of rpc handler
* Remove crates
* Correct compile issues associated with test merge
* Build basic tests and testing structure for eth2-libp2p
* Enhance RPC tests and add logging
* Complete RPC testing framework and STATUS test
* Decoding bug fixes, log improvements, stream test
* Clean up RPC handler logging
* Decoder bug fix, empty block stream test
* Add BlocksByRoot RPC test
* Add Goodbye RPC test
* Syncing and stream handling bug fixes and performance improvements
* Applies discv5 bug fixes
* Adds DHT IP filtering for lighthouse - currently disabled
* Adds randomized network propagation as a CLI arg
* Add disconnect functionality
* Adds attestation handling and parent lookup
* Adds RPC error handling for the sync manager
* Allow parent's blocks to be already processed
* Update github workflow
* Adds reviewer suggestions
Updates lighthouse to the latest networking spec
- Sync re-write (#496)
- Updates to the latest eth2 networking spec (#495)
- Libp2p updates and improvements