* Low hanging fruits
* Remove unnecessary todo
I think it's fine to not handle this since the calling functions handle the error.
No specific reason imo to handle it in the function as well.
* Rename BlobError to GossipBlobError
I feel this signified better what the error is for. The BlobError was only for failures when gossip
verifying a blob. We cannot get this error when doing rpc validation
* Remove the BlockError::BlobValidation variant
This error was only there to appease gossip verification before publish.
It's unclear how to peer score this error since this cannot actually occur during any
block verification flows.
This commit introduces an additional error type, BlockContentsError, to better represent the
error.
* Add docs for peer scoring (or lack thereof) of AvailabilityCheck errors
* I do not see a non-convoluted way of doing this. Okay to have some redundant code here
* Removing this to catch the failure red-handed
* Fix compilation
* Cannot be deleted because some tests assume the trait impl
Also useful to have around for testing in the future imo
* Add some metrics and logs
* Only process `Imported` variant in sync_methods
The only additional thing for other variants that might be useful is logging. We can do that
later if required
* Convert to TryFrom
Not really sure where this would be used, but just did what the comment says.
Could consider just returning the Block variant for a deneb block in the From version
* Unlikely to change now
* This is fine as this is max_rpc_size per rpc chunk (for blobs, it would be 128kb max)
* Log count instead of individual blobs, can delete log later if it becomes too annoying.
* Add block production blob verification timer
* Extend block_streamer test to deneb
* Remove dbg statement
* Fix tests
* changed name
* Fix sidecars
* Added query type and parameter
* added query struct and function
* added method
* improved filtering method
* added blob_sidecar_list_indexed to block_id
* minor blobqueryindex fix
* function and formatting fix
* minor function and naming fix
* minor changes
## Issue Addressed
NA
## Proposed Changes
Carries on from #4115, with the following modifications:
1. Self-hosted runners are only enabled if `github.repository == sigp/lighthouse`.
    - This allows forks to still have Github-hosted CI.
    - This gives us a method to switch back to Github-runners if we have extended downtime on self-hosted.
1. Does not remove any existing dependency builds for Github-hosted runners (e.g., installing the latest Rust).
1. Adds the `WATCH_HOST` environment variable which defines where we expect to find the postgres db in the `watch` tests. This should be set to `host.docker.internal` for the tests to pass on self-hosted runners.
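As a rough illustration of how a test might consume `WATCH_HOST`, the sketch below resolves the postgres host from the environment and falls back to a default when the variable is unset. The function names and the `localhost` fallback are assumptions for illustration only; they are not taken from the `watch` crate.

```rust
use std::env;

/// Hypothetical fallback used when `WATCH_HOST` is unset (an assumption,
/// not necessarily what the `watch` tests use).
const DEFAULT_WATCH_HOST: &str = "localhost";

/// Resolve the postgres host from an optional `WATCH_HOST` value.
/// Empty or whitespace-only values are treated as unset.
fn resolve_watch_host(value: Option<String>) -> String {
    value
        .filter(|v| !v.trim().is_empty())
        .unwrap_or_else(|| DEFAULT_WATCH_HOST.to_string())
}

/// Read `WATCH_HOST` from the process environment.
fn watch_host_from_env() -> String {
    resolve_watch_host(env::var("WATCH_HOST").ok())
}
```

On a self-hosted runner, exporting `WATCH_HOST=host.docker.internal` before running the tests would make `watch_host_from_env()` return that value.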
## Additional Info
NA
Co-authored-by: antondlr <anton@delaruelle.net>
## Issue Addressed
n/a Noticed this while working on something else
## Proposed Changes
- leverage the appropriate types to avoid a bunch of `unwrap` and errors
## Additional Info
n/a
## Issue Addressed
Speed up CI by installing Foundry with a GitHub Action instead of building Anvil from source.
Building Anvil from source on GitHub-hosted runners currently takes about 10 minutes. Installing it with the `foundry-toolchain` action takes only about 2 seconds.
## Issue Addressed
N/A
## Proposed Changes
Add lints for rust 1.71
[3789134](3789134ae2) is probably the one that needs the most attention as it changes beacon state code. I changed the `is_in_inactivity_leak` function to return an `ArithError`, as not all consumers of that function work well with a `BeaconState::Error`.
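To illustrate the shape of that change, here is a minimal, self-contained sketch of a leak check that reports only an arithmetic error. The `ArithError` enum, the function signature and the threshold constant are simplified stand-ins for illustration, not the actual Lighthouse definitions.

```rust
/// Illustrative stand-in for an arithmetic error type.
#[derive(Debug, PartialEq)]
enum ArithError {
    Overflow,
}

/// Hypothetical threshold; in practice this would come from the chain spec.
const MIN_EPOCHS_TO_INACTIVITY_PENALTY: u64 = 4;

/// Returns whether the finality delay exceeds the threshold.
/// The only failure mode is arithmetic, so callers that cannot handle a
/// broader state error type are not forced to convert one.
fn is_in_inactivity_leak(
    previous_epoch: u64,
    finalized_epoch: u64,
) -> Result<bool, ArithError> {
    let finality_delay = previous_epoch
        .checked_sub(finalized_epoch)
        .ok_or(ArithError::Overflow)?;
    Ok(finality_delay > MIN_EPOCHS_TO_INACTIVITY_PENALTY)
}
```

A caller with a richer error type can still use `?` on the result by implementing `From<ArithError>` for its own error.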
## Issue Addressed
This PR attempts to work around the recent frequent eth1 simulator failures caused by missing eth logs from Anvil.
> FailedToInsertDeposit(NonConsecutive { log_index: 1, expected: 0 })
This usually occurs at the beginning of the tests. Once this log shows up, the test is guaranteed to time out after a few hours, which is currently causing our CI to fail quite frequently.
Example failure here: https://github.com/sigp/lighthouse/actions/runs/5525760195/jobs/10079736914
## Proposed Changes
The quick fix applied here adds a timeout to node startup and restarts the node again.
- Add a 60-second timeout to beacon node startup in eth1 simulator tests. It takes ~10 seconds on my machine, but could take longer on CI runners.
- Wrap the startup code in a retry function that allows for 3 retries before returning an error.
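The timeout-plus-retry pattern described above can be sketched with the standard library alone. This is not the simulator's code (which is presumably async); `run_with_timeout` and `retry_with_timeout` are hypothetical helpers, and the sketch uses a helper thread and a channel in place of an async timeout.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

/// Run `task` on a helper thread and give up waiting after `timeout`.
fn run_with_timeout<T, F>(task: F, timeout: Duration) -> Result<T, String>
where
    T: Send + 'static,
    F: FnOnce() -> Result<T, String> + Send + 'static,
{
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        // The receiver may already be gone if we timed out; ignore that.
        let _ = tx.send(task());
    });
    match rx.recv_timeout(timeout) {
        Ok(result) => result,
        Err(mpsc::RecvTimeoutError::Timeout) => {
            Err(format!("timed out after {:?}", timeout))
        }
        Err(mpsc::RecvTimeoutError::Disconnected) => {
            Err("task ended without a result".to_string())
        }
    }
}

/// Make one attempt, then up to `max_retries` further attempts on failure.
/// `make_attempt` builds a fresh startup task for every attempt.
fn retry_with_timeout<T, F, G>(
    max_retries: usize,
    timeout: Duration,
    mut make_attempt: G,
) -> Result<T, String>
where
    T: Send + 'static,
    F: FnOnce() -> Result<T, String> + Send + 'static,
    G: FnMut() -> F,
{
    let mut last_error = String::from("no attempts made");
    for attempt in 0..=max_retries {
        match run_with_timeout(make_attempt(), timeout) {
            Ok(value) => return Ok(value),
            Err(e) => last_error = format!("attempt {} failed: {}", attempt + 1, e),
        }
    }
    Err(last_error)
}
```

With the values from this PR, the call would look like `retry_with_timeout(3, Duration::from_secs(60), ...)`. Note that in this thread-based sketch a timed-out attempt is abandoned rather than cancelled, so it keeps running in the background.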
## Additional Info
We should probably raise an issue on the Anvil GitHub repo so this can be further investigated.
## Issue Addressed
Addresses an issue where CI could fail due to a nonexistent-file error:
```
Run ./clean.sh
rm: cannot remove '/home/runner/.lighthouse/local-testnet/geth_datadir4/geth/fastcache.tmp.1549331618': No such file or directory
Error: Process completed with exit code 1.
```
This seems to happen quite frequently now. I'm not sure exactly why, but perhaps it's worth trying to suppress the prompt?
https://github.com/sigp/lighthouse/actions/runs/5455027574/jobs/9925916159?pr=4463
## Proposed Changes
Replace `wget` in the EF-tests makefile with `curl`.
On macOS `curl` is pre-installed, and I found myself making this change to avoid installing `wget`.
The `-L` flag is used to follow redirects which is useful if a repo gets renamed, and more similar to `wget`'s default behaviour.
## Issue Addressed
Fixes occasional compilation errors with mev-rs (see #4456).
## Proposed Changes
- Update `mev-rs` to the latest version, which allows us to remove hacky `[patch]` sections
- Update the `axum` version used in `watch` so LH only uses a single version
## Issue Addressed
#4488
## Proposed Changes
- Remove all instances of the `required` modifier where we have a default value specified for a subcommand
## Additional Info
N/A
## Issue Addressed
When trying to run `eth1-sim` locally, the simulator doesn't start for me and panics due to duplicate arg names for `proposer-nodes` (it uses the same arg names as `nodes`). Not sure why this isn't failing on CI but is failing on my machine 🤔
```
thread 'main' panicked at 'Argument short must be unique
thread 'main' panicked at 'Argument long must be unique
```
## Issue Addressed
Fix an issue observed by `@zlan` on Discord where Lighthouse would sometimes return this error when looking up states via the API:
> {"code":500,"message":"UNHANDLED_ERROR: ForkChoiceError(MissingProtoArrayBlock(0xc9cf1495421b6ef3215d82253b388d77321176a1dcef0db0e71a0cd0ffc8cdb7))","stacktraces":[]}
## Proposed Changes
The error stems from a faulty assumption in the HTTP API logic: that any state in the hot database must have its block in fork choice. This isn't true because the state's hot database may update much less frequently than the fork choice store, e.g. if reconstructing states (where freezer migration pauses), or if the freezer migration runs slowly. There could also be a race between loading the hot state and checking fork choice, e.g. even if the finalization migration of DB+fork choice were atomic, the update could happen between the 1st and 2nd calls.
To address this I've changed the HTTP API logic to use the finalized block's execution status as a fallback where it is safe to do so. In the case where a block is non-canonical and prior to finalization (permanently orphaned) we default `execution_optimistic` to `true`.
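The fallback can be pictured roughly as follows. This is a simplified sketch, not the actual HTTP API code: `BlockLookup` and its fields are hypothetical, and it assumes that "safe to do so" means the block is canonical but has already been pruned from fork choice.

```rust
/// Hypothetical summary of what the API knows about the block behind a state.
struct BlockLookup {
    /// Optimistic status from fork choice, if the block is still present there.
    fork_choice_status: Option<bool>,
    /// Whether the block is on the canonical chain.
    is_canonical: bool,
    /// Optimistic status of the current finalized block.
    finalized_block_is_optimistic: bool,
}

/// Decide the `execution_optimistic` value to report for a state.
fn execution_optimistic(lookup: &BlockLookup) -> bool {
    match lookup.fork_choice_status {
        // Normal case: fork choice still knows the block.
        Some(status) => status,
        // Missing from fork choice but canonical: fall back to the
        // finalized block's status (assumed to be the safe case).
        None if lookup.is_canonical => lookup.finalized_block_is_optimistic,
        // Non-canonical and prior to finalization (permanently orphaned).
        None => true,
    }
}
```

The key point is that a missing fork choice entry is no longer treated as an unhandled error; it is mapped to a conservative answer instead.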
## Additional Info
I've also added a new CLI flag to reduce the frequency of the finalization migration as this is useful for several purposes:
- Spacing out database writes (less frequent, larger batches)
- Keeping a limited chain history with high availability, e.g. the last month in the hot database.
This new flag made it _substantially_ easier to test this change. It was extracted from `tree-states` (where it's called `--db-migration-period`), which is why this PR also carries the `tree-states` label.
## Issue Addressed
#4494
## Proposed Changes
- Remove explicit re-exports of various types to appease the new compiler lint
## Additional Info
It seems `warn(hidden_glob_reexports)` is the main culprit.
## Proposed Changes
* Add `lcli state-root` command for computing the hash tree root of a `BeaconState`.
* Add a `--network` flag which can be used instead of `--testnet-dir` to set the network, e.g. Mainnet, Goerli, Gnosis.
* Use the new network flag in `transition-blocks`, `skip-slots`, and `block-root`, which previously only supported mainnet.
* **BREAKING CHANGE** Remove the default value of `~/.lighthouse/testnet` from `--testnet-dir`. This may have made sense in previous versions where `lcli` was more testnet focussed, but IMO it is an unnecessary complication and foot-gun today.
*Replaces #4434. It is identical, but this PR has a smaller diff due to a curated commit history.*
## Issue Addressed
NA
## Proposed Changes
This PR moves the scheduling logic for the `BeaconProcessor` into a new crate in `beacon_node/beacon_processor`. Previously it existed in the `beacon_node/network` crate.
This addresses a circular-dependency problem where it's not possible to use the `BeaconProcessor` from the `beacon_chain` crate. The `network` crate depends on the `beacon_chain` crate (`network -> beacon_chain`), but importing the `BeaconProcessor` into the `beacon_chain` crate would create a circular dependency of `beacon_chain -> network`.
The `BeaconProcessor` was designed to provide queuing and prioritized scheduling for messages from the network. It has proven to be quite valuable and I believe we'd make Lighthouse more stable and effective by using it elsewhere. In particular, I think we should use the `BeaconProcessor` for:
1. HTTP API requests.
1. Scheduled tasks in the `BeaconChain` (e.g., state advance).
Using the `BeaconProcessor` for these tasks would help prevent the BN from becoming overwhelmed and would also help it to prioritize operations (e.g., choosing to process blocks from gossip before responding to low-priority HTTP API requests).
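To make the prioritization idea concrete, here is a toy scheduler with two bounded queues in which gossip blocks are always drained before API requests. It is only a sketch of the concept: the `Work` enum, the `Scheduler` type and the queue bound are hypothetical and far simpler than the real `BeaconProcessor`.

```rust
use std::collections::VecDeque;

/// Hypothetical work items; the payload is just an id for illustration.
#[derive(Debug, PartialEq)]
enum Work {
    GossipBlock(u64),
    ApiRequest(u64),
}

/// Toy scheduler with one FIFO queue per work type.
#[derive(Default)]
struct Scheduler {
    gossip_blocks: VecDeque<Work>,
    api_requests: VecDeque<Work>,
}

impl Scheduler {
    /// Hypothetical per-queue bound so a flood of work cannot grow memory
    /// without limit.
    const MAX_QUEUE_LEN: usize = 1024;

    /// Queue an item, handing it back to the caller if its queue is full.
    fn push(&mut self, work: Work) -> Result<(), Work> {
        let queue = match work {
            Work::GossipBlock(_) => &mut self.gossip_blocks,
            Work::ApiRequest(_) => &mut self.api_requests,
        };
        if queue.len() >= Self::MAX_QUEUE_LEN {
            return Err(work);
        }
        queue.push_back(work);
        Ok(())
    }

    /// Pick the next item: gossip blocks first, then API requests.
    fn next(&mut self) -> Option<Work> {
        self.gossip_blocks
            .pop_front()
            .or_else(|| self.api_requests.pop_front())
    }
}
```

Bounded queues plus a fixed priority order are what let a node shed low-priority load under pressure instead of falling behind on block processing.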
## Additional Info
This PR is intended to have zero impact on runtime behaviour. It aims to simply separate the *scheduling* code (i.e., the `BeaconProcessor`) from the *business logic* in the `network` crate (i.e., the `Worker` impls). Future PRs (see #4462) can build upon this work to actually use the `BeaconProcessor` for more operations.
I've gone to some effort to use `git mv` to make the diff look more like "file was moved and modified" rather than "file was deleted and a new one added". This should reduce review burden and help maintain commit attribution.