* changed name
* Fix sidecars
* update get blobs endpoint name from blobs to blob_sidecars
---------
Co-authored-by: Rahul Dogra <rahulcooldogra@gmail.com>
* some blob reprocessing work
* remove ForceBlockLookup
* reorder enum match arms in sync manager
* a lot more reprocessing work
* impl logic for triggerng blob lookups along with block lookups
* deal with rpc blobs in groups per block in the da checker. don't cache missing blob ids in the da checker.
* make single block lookup generic
* more work
* add delayed processing logic and combine some requests
* start fixing some compile errors
* fix compilation in main block lookup mod
* much work
* get things compiling
* parent blob lookups
* fix compile
* revert red/stevie changes
* fix up sync manager delay message logic
* add peer usefulness enum
* should remove lookup refactor
* consolidate retry error handling
* improve peer scoring during certain failures in parent lookups
* improve retry code
* drop parent lookup if either req has a peer disconnect during download
* refactor single block processed method
* processing peer refactor
* smol bugfix
* fix some todos
* fix lints
* fix lints
* fix compile in lookup tests
* fix lints
* fix lints
* fix existing block lookup tests
* renamings
* fix after merge
* cargo fmt
* compilation fix in beacon chain tests
* fix
* refactor lookup tests to work with multiple forks and response types
* make tests into macros
* wrap availability check error
* fix compile after merge
* add random blobs
* start fixing up lookup verify error handling
* some bug fixes and the start of deneb only tests
* make tests work for all forks
* track information about peer source
* error refactoring
* improve peer scoring
* fix test compilation
* make sure blobs are sent for processing after stream termination, delete copied tests
* add some tests and fix a bug
* smol bugfixes and moar tests
* add tests and fix some things
* compile after merge
* lots of refactoring
* retry on invalid block/blob
* merge unknown parent messages before current slot lookup
* get tests compiling
* penalize blob peer on invalid blobs
* Check disk on in-memory cache miss
* Update beacon_node/beacon_chain/src/data_availability_checker/overflow_lru_cache.rs
* Update beacon_node/network/src/sync/network_context.rs
Co-authored-by: Divma <26765164+divagant-martian@users.noreply.github.com>
* fix bug in matching blocks and blobs in range sync
* pr feedback
* fix conflicts
* upgrade logs from warn to crit when we receive incorrect response in range
* synced_and_connected_within_tolerance -> should_search_for_block
* remove todo
* add data gas used and update excess data gas to u64
* Fix Broken Overflow Tests
* payload verification with commitments
* fix merge conflicts
* restore payload file
* Restore payload file
* remove todo
* add max blob commitments per block
* c-kzg lib update
* Fix ef tests
* Abstract over minimal/mainnet spec in kzg crate
* Start integrating new KZG
* checkpoint sync without alignment
* checkpoint sync without alignment
* add import
* add import
* query for checkpoint state by slot rather than state root (teku doesn't serve by state root)
* query for checkpoint state by slot rather than state root (teku doesn't serve by state root)
* loosen check
* get state first and query by most recent block root
* Revert "loosen check"
This reverts commit 069d13dd63aa794a3505db9f17bd1a6b73f0be81.
* get state first and query by most recent block root
* merge max blobs change
* simplify delay logic
* rename unknown parent sync message variants
* rename parameter, block_slot -> slot
* add some docs to the lookup module
* use interval instead of sleep
* drop request if blocks and blobs requests both return `None` for `Id`
* clean up `find_single_lookup` logic
* add lookup source enum
* clean up `find_single_lookup` logic
* add docs to find_single_lookup_request
* move LookupSource our of param where unnecessary
* remove unnecessary todo
* query for block by `state.latest_block_header.slot`
* fix lint
* fix merge transition ef tests
* fix test
* fix test
* fix observed blob sidecars test
* Add some metrics (#33)
* fix protocol limits for blobs by root
* Update Engine API for 1:1 Structure Method
* make beacon chain tests to fix devnet 6 changes
* get ckzg working and fix some tests
* fix remaining tests
* fix lints
* Fix KZG linking issues
* remove unused dep
* lockfile
* test fixes
* remove dbgs
* remove unwrap
* cleanup tx generator
* small fixes
* fixing fixes
* more self reivew
* more self review
* refactor genesis header initialization
* refactor mock el instantiations
* fix compile
* fix network test, make sure they run for each fork
* pr feedback
* fix last test (hopefully)
---------
Co-authored-by: Pawan Dhananjay <pawandhananjay@gmail.com>
Co-authored-by: Mark Mackey <mark@sigmaprime.io>
Co-authored-by: Divma <26765164+divagant-martian@users.noreply.github.com>
Co-authored-by: Michael Sproul <michael@sigmaprime.io>
## Issue Addressed
NA
## Proposed Changes
Implements the `PrettyReqwestError` to wrap a `reqwest::Error` and give nicer `Debug` formatting. It also wraps the `Url` component in a `SensitiveUrl` to avoid leaking sensitive info in logs.
### Before
```
Reqwest(reqwest::Error { kind: Request, url: Url { scheme: "http", cannot_be_a_base: false, username: "", password: None, host: Some(Domain("localhost")), port: Some(9999), path: "/eth/v1/node/version", query: None, fragment: None }, source: hyper::Error(Connect, ConnectError("tcp connect error", Os { code: 61, kind: ConnectionRefused, message: "Connection refused" })) })
```
### After
```
HttpClient(url: http://localhost:9999/, kind: request, detail: error trying to connect: tcp connect error: Connection refused (os error 61))
```
## Additional Info
I've also renamed the `Reqwest` error enum variants to `HttpClient`, to give people a better chance at knowing what's going on. Reqwest is pretty odd and looks like a typo.
I've implemented it in the `eth2` and `execution_layer` crates. This should affect most logs in the VC and EE-related ones in the BN.
I think the last crate that could benefit from the is the `beacon_node/eth1` crate. I haven't updated it in this PR since its error type is not so amenable to it (everything goes into a `String`). I don't have a whole lot of time to jig around with that at the moment and I feel that this PR as it stands is a significant enough improvement to merge on its own. Leaving it as-is is fine for the time being and we can always come back for it later (or implement in-protocol deposits!).
## Issue Addressed
Resolves#3238
## Proposed Changes
Please list or describe the changes introduced by this PR.
## Additional Info
Please provide any additional information. For example, future considerations
or information useful for reviewers.
## Proposed Changes
Remove `max-skip-slots` checks when processing blocks.
This was legacy code which was previously used in the Medalla testnet to sync to the correct fork.
With the addition of checkpoint sync which allows us to sync to any arbitrary fork, this is no longer a necessary feature, so it has been removed for simplicity.
## Additional Notes
The CLI flag and checks for attestation processing have been retained as it still may have uses in DoS protection.
Currently, the ENR of the node may not be correctly updated when specifying ipv6 fields through the CLI if an ENR exists on disk.
This remedies a bug where we were not checking for ipv6 fields when comparing whether to use an on-disk ENR or updating based on CLI configuration parameters.
## Issue Addressed
Closes#4332
## Proposed Changes
Remove the `CountUnrealized` type, defaulting unrealized justification to _on_. This fixes the #4332 issue by ensuring that importing the same block to fork choice always results in the same outcome.
Finalized sync speed may be slightly impacted by this change, but that is deemed an acceptable trade-off until the optimisation from #4118 is implemented.
TODO:
- [x] Also check that the block isn't a duplicate before importing
* some blob reprocessing work
* remove ForceBlockLookup
* reorder enum match arms in sync manager
* a lot more reprocessing work
* impl logic for triggerng blob lookups along with block lookups
* deal with rpc blobs in groups per block in the da checker. don't cache missing blob ids in the da checker.
* make single block lookup generic
* more work
* add delayed processing logic and combine some requests
* start fixing some compile errors
* fix compilation in main block lookup mod
* much work
* get things compiling
* parent blob lookups
* fix compile
* revert red/stevie changes
* fix up sync manager delay message logic
* add peer usefulness enum
* should remove lookup refactor
* consolidate retry error handling
* improve peer scoring during certain failures in parent lookups
* improve retry code
* drop parent lookup if either req has a peer disconnect during download
* refactor single block processed method
* processing peer refactor
* smol bugfix
* fix some todos
* fix lints
* fix lints
* fix compile in lookup tests
* fix lints
* fix lints
* fix existing block lookup tests
* renamings
* fix after merge
* cargo fmt
* compilation fix in beacon chain tests
* fix
* refactor lookup tests to work with multiple forks and response types
* make tests into macros
* wrap availability check error
* fix compile after merge
* add random blobs
* start fixing up lookup verify error handling
* some bug fixes and the start of deneb only tests
* make tests work for all forks
* track information about peer source
* error refactoring
* improve peer scoring
* fix test compilation
* make sure blobs are sent for processing after stream termination, delete copied tests
* add some tests and fix a bug
* smol bugfixes and moar tests
* add tests and fix some things
* compile after merge
* lots of refactoring
* retry on invalid block/blob
* merge unknown parent messages before current slot lookup
* get tests compiling
* penalize blob peer on invalid blobs
* Check disk on in-memory cache miss
* Update beacon_node/beacon_chain/src/data_availability_checker/overflow_lru_cache.rs
* Update beacon_node/network/src/sync/network_context.rs
Co-authored-by: Divma <26765164+divagant-martian@users.noreply.github.com>
* fix bug in matching blocks and blobs in range sync
* pr feedback
* fix conflicts
* upgrade logs from warn to crit when we receive incorrect response in range
* synced_and_connected_within_tolerance -> should_search_for_block
* remove todo
* Fix Broken Overflow Tests
* fix merge conflicts
* checkpoint sync without alignment
* add import
* query for checkpoint state by slot rather than state root (teku doesn't serve by state root)
* get state first and query by most recent block root
* simplify delay logic
* rename unknown parent sync message variants
* rename parameter, block_slot -> slot
* add some docs to the lookup module
* use interval instead of sleep
* drop request if blocks and blobs requests both return `None` for `Id`
* clean up `find_single_lookup` logic
* add lookup source enum
* clean up `find_single_lookup` logic
* add docs to find_single_lookup_request
* move LookupSource our of param where unnecessary
* remove unnecessary todo
* query for block by `state.latest_block_header.slot`
* fix lint
* fix test
* fix test
* fix observed blob sidecars test
* PR updates
* use optional params instead of a closure
* create lookup and trigger request in separate method calls
* remove `LookupSource`
* make sure duplicate lookups are not dropped
---------
Co-authored-by: Pawan Dhananjay <pawandhananjay@gmail.com>
Co-authored-by: Mark Mackey <mark@sigmaprime.io>
Co-authored-by: Divma <26765164+divagant-martian@users.noreply.github.com>
## Issue Addressed
Resolves#3980. Builds on work by @GeemoCandama in #4084
## Proposed Changes
Extends the `SupportedProtocol` abstraction added in Geemo's PR and attempts to fix internal versioning of requests that are mentioned in this comment https://github.com/sigp/lighthouse/pull/4084#issuecomment-1496380033
Co-authored-by: geemo <geemo@tutanota.com>
Done in different PRs so that they can reviewed independently, as it's likely this won't be merged before I leave
Includes resolution for #4080
- [ ] #4299
- [ ] #4318
- [ ] #4320
Co-authored-by: Diva M <divma@protonmail.com>
Co-authored-by: Age Manning <Age@AgeManning.com>
## Issue Addressed
This PR addresses issue https://github.com/sigp/lighthouse/issues/4350
## Proposed Changes
This change will enable slasher broadcast in the following cases:
No flag is passed,
`--slasher-broadcast` is passed and,
`--slasher-broadcast=true` is passed.
Only when an explicit false value is passed the slasher does not broadcast.(`--slasher-broadcast=false`).
## Additional Info
TODO
- [x] Modify CLI parsing logic
- [x] Write test
Refer to #4353
Co-authored-by: Rahul Dogra <rahulcooldogra@gmail.com>
Co-authored-by: Gua00va <105484243+Gua00va@users.noreply.github.com>
## Issue Addressed
Closes#4354Closes#3987
Replaces #4305, #4283
## Proposed Changes
This switches the default slasher backend _back_ to LMDB.
If an MDBX database exists and the MDBX backend is enabled then MDBX will continue to be used. Our release binaries and Docker images will continue to include MDBX for as long as it is practical, so users of these should not notice any difference.
The main benefit is to users compiling from source and devs running tests. These users no longer have to struggle to compile MDBX and deal with the compatibility issues that arises. Similarly, devs don't need to worry about toggling feature flags in tests or risk forgetting to run the slasher tests due to backend issues.
## Issue Addressed
N/A
## Proposed Changes
This change will log the value of the relay block and the local block when the relay block is more profitable.
## Additional Info
This change will help validators understand the block selection (as it looks like the execution reward sometimes is higher that the MEV-reward).
The rationale for this change is to aid operators to better understand why a relay-block was chosen over a local block.
Looking at produced blocks (at beaconcha.in for example) it sometimes looks like the builder is making a profit just from the execution reward vs the MEV-reward, and creates the nagging question: "Could i have built this block and made that extra profit?"... The answer is probably "No, not without the extra transactions included by the relay", but by logging the value of the local block-candidate, this will no longer be an issue..
### Example (Mainnet)
https://beaconcha.in/block/17370329
MEV Block Reward: 0.17122 Ether to 0xE35bBaFa0266089f95d745d348b468622805D82B
Execution Reward: 0.17528 Ether to 0x1f9090aaE28b8a3dCeaDf281B0F12828e676c326
Difference: 0.00406 Ether
### Examples (Goerli)
https://goerli.beaconcha.in/block/9040065
MEV Block Reward: 0.56423 Ether to 0xF5794543CF6055Ae710E9c8E99E31343Cea004a8
Execution Reward: 0.56488 Ether to 0xfC0157aA4F5DB7177830ACddB3D5a9BB5BE9cc5e
Difference: 0.00065 Ether
https://goerli.beaconcha.in/block/9019921
MEV Block Reward: 1.39440 Ether to 0xF5794543CF6055Ae710E9c8E99E31343Cea004a8
Execution Reward: 1.39469 Ether to 0xfC0157aA4F5DB7177830ACddB3D5a9BB5BE9cc5e
Difference: 0.00029 Ether
https://goerli.beaconcha.in/block/9015583
MEV Block Reward: 1.04356 Ether to 0xF5794543CF6055Ae710E9c8E99E31343Cea004a8
Execution Reward: 1.04896 Ether to 0xfC0157aA4F5DB7177830ACddB3D5a9BB5BE9cc5e
Difference: 0.0054 Ether
## Issue Addressed
On deneb devnetv5, lighthouse keeps rate limiting peers which makes it harder to bootstrap new nodes as there are very few peers in the network. This PR adds an option to disable the inbound rate limiter for testnets.
Added an option to configure inbound rate limits as well.
Co-authored-by: Diva M <divma@protonmail.com>
## Proposed Changes
This is a light refactor of the execution layer's block hash calculation logic making it easier to use externally. e.g. in `eleel` (https://github.com/sigp/eleel/pull/18).
A static method is preferable to a method because the calculation doesn't actually need any data from `self`, and callers may want to compute block hashes without constructing an `ExecutionLayer` (`eleel` only constructs a simpler `Engine` struct).
This PR address the following spec change: https://github.com/ethereum/consensus-specs/pull/3312
Instead of subscribing to a long-lived subnet for every attached validator to a beacon node, all beacon nodes will subscribe to `SUBNETS_PER_NODE` long-lived subnets. This is currently set to 2 for mainnet.
This PR does not include any scoring or advanced discovery mechanisms. A future PR will improve discovery and we can implement scoring after the next hard fork when we expect all client teams and all implementations to respect this spec change.
This will be a significant change in the subnet network structure for consensus clients and we will likely have to monitor and tweak our peer management logic.
This PR adds the ability to read the Lighthouse logs from the HTTP API for both the BN and the VC.
This is done in such a way to as minimize any kind of performance hit by adding this feature.
The current design creates a tokio broadcast channel and mixes is into a form of slog drain that combines with our main global logger drain, only if the http api is enabled.
The drain gets the logs, checks the log level and drops them if they are below INFO. If they are INFO or higher, it sends them via a broadcast channel only if there are users subscribed to the HTTP API channel. If not, it drops the logs.
If there are more than one subscriber, the channel clones the log records and converts them to json in their independent HTTP API tasks.
Co-authored-by: Michael Sproul <micsproul@gmail.com>
## Issue Addressed
NA
## Proposed Changes
Adds metrics to track validators that are submitting equivocating (but not slashable) sync messages. This follows on from some research we've been doing in a separate fork of LH.
## Additional Info
@jimmygchen and @michaelsproul have already run their eyes over this so it should be easy to get into v4.2.0, IMO.
## Issue Addressed
#4281
## Proposed Changes
- Change `ShufflingCache` implementation from using `LruCache` to a custom cache that removes entry with lowest epoch instead of oldest insertion time.
- Protect the "enshrined" head shufflings when inserting new committee cache entries. The shuffling ids matching the head's previous, current, and future epochs will never be ejected from the cache during `Self::insert_cache_item`.
## Additional Info
There is a bonus point on shuffling preferences in the issue description that hasn't been implemented yet, as I haven't figured out a good way to do this:
> However I'm not convinced since there are some complexities around tie-breaking when two entries have the same epoch. Perhaps preferring entries in the canonical chain is best?
We should be able to check if a block is on the canonical chain by:
```rust
canonical_head
.fork_choice_read_lock()
.contains_block(root)
```
However we need to interleave the shuffling and fork choice locks, which may cause deadlocks if we're not careful (mentioned by @paulhauner). Alternatively, we could use the `state.block_roots` field of the `chain.canonical_head.snapshot.beacon_state`, which avoids deadlock but requires more work.
I'd like to get some feedback on review & testing before I dig deeper into the preferences stuff, as having the canonical head preference may already be quite useful in preventing the issue raised.
Co-authored-by: Jimmy Chen <jimmy@sigmaprime.io>
## Issue Addressed
Closes https://github.com/sigp/lighthouse/issues/4291, part of #3613.
## Proposed Changes
- Implement the `el_offline` field on `/eth/v1/node/syncing`. We set `el_offline=true` if:
- The EL's internal status is `Offline` or `AuthFailed`, _or_
- The most recent call to `newPayload` resulted in an error (more on this in a moment).
- Use the `el_offline` field in the VC to mark nodes with offline ELs as _unsynced_. These nodes will still be used, but only after synced nodes.
- Overhaul the usage of `RequireSynced` so that `::No` is used almost everywhere. The `--allow-unsynced` flag was broken and had the opposite effect to intended, so it has been deprecated.
- Add tests for the EL being offline on the upcheck call, and being offline due to the newPayload check.
## Why track `newPayload` errors?
Tracking the EL's online/offline status is too coarse-grained to be useful in practice, because:
- If the EL is timing out to some calls, it's unlikely to timeout on the `upcheck` call, which is _just_ `eth_syncing`. Every failed call is followed by an upcheck [here](693886b941/beacon_node/execution_layer/src/engines.rs (L372-L380)), which would have the effect of masking the failure and keeping the status _online_.
- The `newPayload` call is the most likely to time out. It's the call in which ELs tend to do most of their work (often 1-2 seconds), with `forkchoiceUpdated` usually returning much faster (<50ms).
- If `newPayload` is failing consistently (e.g. timing out) then this is a good indication that either the node's EL is in trouble, or the network as a whole is. In the first case validator clients _should_ prefer other BNs if they have one available. In the second case, all of their BNs will likely report `el_offline` and they'll just have to proceed with trying to use them.
## Additional Changes
- Add utility method `ForkName::latest` which is quite convenient for test writing, but probably other things too.
- Delete some stale comments from when we used to support multiple execution nodes.
## Issue Addressed
#2335
## Proposed Changes
- Remove the `lighthouse-network::tests::gossipsub_tests` module
- Remove dead code from the `lighthouse-network::tests::common` helper module (`build_full_mesh`)
## Additional Info
After discussion with both @divagant-martian and @AgeManning, these tests seem to have two main issues in that they are:
- Redundant, in that they don't test anything meaningful (due to our handling of duplicate messages)
- Out-of-place, in that it doesn't really test Lighthouse-specific functionality (rather libp2p functionality)
As such, this PR supersedes #4286.
## Issue Addressed
NA
## Proposed Changes
Adds an additional check to a feature introduced in #4179 to prevent us from re-queuing already-known blocks that could be rejected immediately.
## Additional Info
Ideally this would have been included in v4.1.0, however we came across it too late to release it safely. We decided that the safest path forward is to release *without* this check and then patch it in the next version. The lack of this check should only result in a very minor performance impact (the impact is totally negligible in my assessment).