Attempting to improve our CI speeds, as it's recently been a pain point.
Major changes:
- Use a GitHub Action to pull stable/nightly Rust rather than building it each run
- Shift test suite to `nextest` https://github.com/nextest-rs/nextest for CI
UPDATE:
So I've iterated on some changes, and although I think it's still not optimal, I think this is a good base to start from. Some extra things in this PR:
- Shifted where we pull Rust from. We're now using https://github.com/moonrepo/setup-rust. It has some interesting caches built in, but I was not seeing the gains that Jimmy managed to get. Either way, it can pull Rust, cargo fmt, clippy and cargo nextest all in < 5s, so I think it's worthwhile.
- I've grouped a few of the check-like tests into a single test called `code-test`. Although running them in parallel on GitHub runners may be faster, it just seems wasteful. There were 4-5 tests where we would pull Lighthouse, compile it, then run an action like clippy, cargo-audit or fmt. I've grouped these into a single action, so we only compile Lighthouse once and then run the checks in each step. This avoids compiling Lighthouse about 5 times.
- I've made the doppelganger tests run on our local machines to avoid pulling Foundry, building, and making lcli, which are all now baked into the images.
- We use sccache and do not compile Lighthouse incrementally
Misc bonus things:
- Cargo update
- Fix web3 signer openssl keys, which is required after a cargo update
- Use mock_instant in an LRU cache test to avoid a non-deterministic test
- Remove race condition in building web3signer tests
There are still some things we could improve on, such as downloading the EF tests and the web3-signer binary every run, but I've left these out of scope for this PR. I think the above are meaningful improvements.
Co-authored-by: Paul Hauner <paul@paulhauner.com>
Co-authored-by: realbigsean <seananderson33@gmail.com>
Co-authored-by: antondlr <anton@delaruelle.net>
## Issue Addressed
#4675
## Proposed Changes
- Update local ENR (**only port numbers**) with local addresses received from libp2p (via `SwarmEvent::NewListenAddr`)
- Only use the zero port for CLI tests
## Additional Info
### See Also ###
- #4705
- #4402
- #4745
* Start testing blob pruning
* Get rid of unnecessary orphaned blob column
* Make random blob tests deterministic
* Test for pruning being blocked by finality
* Fix bugs and test fork boundary
* A few more tweaks to pruning conditions
* Tweak oldest_blob_slot semantics
* Test margin pruning
* Clean up some terminology and lints
* Schema migrations for v18
* Remove FIXME
* Prune blobs on finalization not every slot
* Fix more bugs + tests
* Address review comments
## Issue Addressed
Synchronize dependencies and edition on the workspace `Cargo.toml`
## Proposed Changes
With https://github.com/rust-lang/cargo/issues/8415 merged, it's now possible to synchronize details like the metadata and dependencies in the workspace `Cargo.toml`.
By aligning the dependencies that are shared between multiple crates in the workspace `Cargo.toml`, it's easier to avoid duplicate versions of the same dependency and therefore reduce compile times.
## Additional Info
This PR also removes the no longer required direct dependency on the `serde_derive` crate.
Should be reviewed after https://github.com/sigp/lighthouse/pull/4639 gets merged.
closes https://github.com/sigp/lighthouse/issues/4651
Co-authored-by: Michael Sproul <michael@sigmaprime.io>
Co-authored-by: Michael Sproul <micsproul@gmail.com>
## Issue Addressed
#4635
## Proposed Changes
Wrap the `SignedVoluntaryExit` object in a `GenericResponse` container, adding an additional `data` layer, to ensure compliance with the key manager API specification.
The new response would look like this:
```json
{"data":{"message":{"epoch":"196868","validator_index":"505597"},"signature":"0xhexsig"}}
```
This is a backward incompatible change and will affect Siren as well.
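To illustrate the shape of the change, here is a minimal, dependency-free sketch of wrapping a payload in a generic `data` envelope. The `ToJson` trait and the hand-written formatting are assumptions made only to keep the example self-contained; they are not the serialization code used in Lighthouse, and the struct fields are reduced to what the example response shows.

```rust
/// Hypothetical helper trait, used here only to avoid external dependencies.
trait ToJson {
    fn to_json(&self) -> String;
}

struct VoluntaryExit {
    epoch: u64,
    validator_index: u64,
}

struct SignedVoluntaryExit {
    message: VoluntaryExit,
    signature: String,
}

/// Envelope that adds the extra `data` layer around any response body.
struct GenericResponse<T> {
    data: T,
}

impl ToJson for VoluntaryExit {
    fn to_json(&self) -> String {
        // Integers are rendered as quoted strings, matching the example response.
        format!(
            r#"{{"epoch":"{}","validator_index":"{}"}}"#,
            self.epoch, self.validator_index
        )
    }
}

impl ToJson for SignedVoluntaryExit {
    fn to_json(&self) -> String {
        format!(
            r#"{{"message":{},"signature":"{}"}}"#,
            self.message.to_json(),
            self.signature
        )
    }
}

impl<T: ToJson> ToJson for GenericResponse<T> {
    fn to_json(&self) -> String {
        format!(r#"{{"data":{}}}"#, self.data.to_json())
    }
}

fn sample_response() -> GenericResponse<SignedVoluntaryExit> {
    GenericResponse {
        data: SignedVoluntaryExit {
            message: VoluntaryExit { epoch: 196868, validator_index: 505597 },
            signature: "0xhexsig".to_string(),
        },
    }
}

fn main() {
    println!("{}", sample_response().to_json());
}
```

Clients that previously read `message` and `signature` from the top level would need to read them from under `data` instead.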
## Issue Addressed
NA
## Proposed Changes
Bumps `quinn-proto` to address a QUIC-related vulnerability: https://rustsec.org/advisories/RUSTSEC-2023-0063
Fixes a `cargo audit` failure.
## Additional Info
NA
## Issue Addressed
Closes #4751
## Proposed Changes
Prevent `state_root_at_slot` and `block_root_at_slot` from erroring out due to a call to `self.slot()?` that fails before genesis. This fixes pre-genesis queries for:
- block at slot 0
- block by genesis block root
- state at slot 0
- state by genesis state root
- state at `finalized` tag
- state at `justified` tag
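As a rough illustration of the failure mode, the sketch below uses a toy `Chain` type whose clock has no slot before genesis. All names here (`Chain`, `ChainError`, `block_root_at_slot_strict`, `lookup`) are hypothetical stand-ins, and falling back to the genesis slot is only one way such a fix could look; the actual change in this PR may differ.

```rust
/// Hypothetical error type standing in for the real chain error.
#[derive(Debug, PartialEq)]
enum ChainError {
    ClockBeforeGenesis,
}

const GENESIS_SLOT: u64 = 0;

/// Toy chain: `current_slot` is `None` until genesis has occurred,
/// and `block_roots[i]` is the block root at slot `i`.
struct Chain {
    current_slot: Option<u64>,
    block_roots: Vec<[u8; 32]>,
}

impl Chain {
    fn slot(&self) -> Result<u64, ChainError> {
        self.current_slot.ok_or(ChainError::ClockBeforeGenesis)
    }

    /// Problematic shape: `?` turns every pre-genesis query into an error.
    fn block_root_at_slot_strict(&self, slot: u64) -> Result<Option<[u8; 32]>, ChainError> {
        let current = self.slot()?;
        Ok(self.lookup(slot, current))
    }

    /// Tolerant shape: before genesis, behave as if the current slot is genesis.
    fn block_root_at_slot(&self, slot: u64) -> Option<[u8; 32]> {
        let current = self.slot().unwrap_or(GENESIS_SLOT);
        self.lookup(slot, current)
    }

    fn lookup(&self, slot: u64, current: u64) -> Option<[u8; 32]> {
        if slot > current {
            return None;
        }
        usize::try_from(slot)
            .ok()
            .and_then(|i| self.block_roots.get(i))
            .copied()
    }
}

fn main() {
    let chain = Chain { current_slot: None, block_roots: vec![[0xaa; 32]] };
    println!("strict:   {:?}", chain.block_root_at_slot_strict(0).is_ok());
    println!("tolerant: {:?}", chain.block_root_at_slot(0).is_some());
}
```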
## Issue Addressed
#4531
## Proposed Changes
Add SSZ support to the following block production endpoints:
GET /eth/v2/validator/blocks/{slot}
GET /eth/v1/validator/blinded_blocks/{slot}
## Additional Info
I updated a few existing tests to use SSZ instead of writing completely new tests.
## Issue Addressed
#4738
## Proposed Changes
See the above issue for details. Went with option #2 to use the async reqwest client in `Eth2NetworkConfig` and propagate the async-ness.
## Issue Addressed
We're OOM'ing on Docker builds on the Deneb branch https://github.com/sigp/lighthouse/issues/3929
Are we OK to self-host automated Docker builds?
Co-authored-by: realbigsean <seananderson33@gmail.com>
Co-authored-by: realbigsean <sean@sigmaprime.io>
Co-authored-by: antondlr <anton@delaruelle.net>
Fix local testnet to generate keys in the correct folders when `BN_COUNT` and `VC_COUNT` don't match.
The current script places the generated validator keys in validator folders based on the `BN_COUNT` config, e.g. `node_1/validators`, `node_2/validators`, etc. We should be using `VC_COUNT` here instead, otherwise the number of validator clients may not match the number of directories generated, which would result in either:
1. a VC not having any keys (when `BN_COUNT` < `VC_COUNT`)
2. a validator key directory not being used (when `BN_COUNT` > `VC_COUNT`).
There is an issue with the file `scripts/local_testnet/start_local_testnet.sh` - when we use a non-default `$SPEC-PRESET` in `vars.env` it runs into an error:
```
executing: ./setup.sh >> /home/ck/.lighthouse/local-testnet/testnet/setup.log
parse error: Invalid numeric literal at line 1, column 7
```
@jimmygchen found the issue and the updated script includes the flag `--spec $SPEC-PRESET`
## Proposed Changes
This PR adds more logging prior to genesis, particularly on networks that start with execution enabled.
There are new checks using `eth_getBlockByHash/Number` to verify that the genesis state's `latest_execution_payload_header` matches the execution node's genesis block.
The first commit also runs the merge-readiness/Capella-readiness checks prior to genesis. This has two effects:
- Give more information on the execution node's status and its readiness for genesis.
- Prevent the `el_offline` status from being set on `/eth/v1/node/syncing`, which previously caused the VC to complain loudly.
I would like to include this for the Holesky reboot. It would have caught the misconfig that doomed the first Holesky.
## Additional Info
- Geth doesn't serve payload bodies prior to genesis, which is why we use the legacy methods. I haven't checked with other ELs yet.
- Currently this is logging errors with _Capella_ genesis states generated by `ethereum-genesis-generator` because the `withdrawals_root` is not set correctly (it is 0x0). This is not a blocker for Holesky, as it starts from Bellatrix (Pari is investigating).
## Issue Addressed
I went through the code base and looked for places where we acquire fork choice locks (after the deadlock bug was found and fixed in #4687), and discovered an instance where we re-acquire a lock immediately after dropping it. This shouldn't cause a deadlock like the other issue, but it is slightly less efficient.
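The pattern looks roughly like the sketch below. `ForkChoice` and its fields are hypothetical stand-ins, and `std::sync::RwLock` is used only to keep the example dependency-free; the real code may use a different lock type.

```rust
use std::sync::RwLock;

/// Hypothetical stand-in for the fork choice state behind the lock.
struct ForkChoice {
    head_slot: u64,
    finalized_slot: u64,
}

/// Less efficient: the first guard is dropped at the end of its statement,
/// then the lock is acquired a second time. A writer could also slip in
/// between the two reads, so the two values may come from different states.
fn read_twice(fc: &RwLock<ForkChoice>) -> (u64, u64) {
    let head = fc.read().unwrap().head_slot;
    let finalized = fc.read().unwrap().finalized_slot;
    (head, finalized)
}

/// Preferred: acquire once and read everything needed under the same guard.
fn read_once(fc: &RwLock<ForkChoice>) -> (u64, u64) {
    let guard = fc.read().unwrap();
    (guard.head_slot, guard.finalized_slot)
}

fn main() {
    let fc = RwLock::new(ForkChoice { head_slot: 10, finalized_slot: 8 });
    println!("{:?} {:?}", read_twice(&fc), read_once(&fc));
}
```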
## Issue Addressed
CI is plagued by `AddrAlreadyInUse` failures, which are caused by race conditions in allocating free ports.
This PR removes all usages of the `unused_port` crate for Lighthouse's HTTP API, in favour of passing `:0` as the listen address. As a result, the listen address isn't known ahead of time and must be read from the listening socket after it binds. This requires tying some self-referential knots, which is a little disruptive, but hopefully doesn't clash too much with Deneb 🤞
There are still a few usages of `unused_tcp4_port` left in cases where we start external processes, like the `watch` Postgres DB, Anvil, Geth, Nethermind, etc. Removing these usages is non-trivial because it's hard to read the port back from an external process after starting it with `--port 0`. We might be able to do something on Linux where we read from `/proc/`, but I'll leave that for future work.
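For reference, the `:0` approach can be sketched with the standard library alone. `bind_ephemeral` is a hypothetical helper name, and this uses a plain `TcpListener` rather than Lighthouse's HTTP server:

```rust
use std::io;
use std::net::{SocketAddr, TcpListener};

/// Bind to an OS-assigned port and return the listener together with the
/// address it actually received.
fn bind_ephemeral() -> io::Result<(TcpListener, SocketAddr)> {
    // Port 0 asks the OS to pick a free port at bind time, so there is no
    // window between "find a free port" and "bind to it" for a race.
    let listener = TcpListener::bind("127.0.0.1:0")?;
    // The real port is only known after binding, so read it from the socket.
    let addr = listener.local_addr()?;
    Ok((listener, addr))
}

fn main() -> io::Result<()> {
    let (_a, addr_a) = bind_ephemeral()?;
    let (_b, addr_b) = bind_ephemeral()?;
    println!("listening on {addr_a} and {addr_b}");
    Ok(())
}
```

Anything that needs the address (clients, logs, config handed to other components) has to receive it after the bind, which is where the knot-tying mentioned above comes from.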
* Implement `SignedBlockContent` decoding and fix bug in `SignedBlockContent::new`
* Update Cargo.lock file
* Use `make_genesis_spec` to simplify test setup.
* Fix syntax errors.
## Issue Addressed
On a new network, a user might need to import validators before genesis has occurred.
## Proposed Changes
Starts the validator client HTTP API before waiting for genesis.
## Additional Info
cc @antondlr
Web3Signer now requires Java runtime v17, see [v23.8.0 release](https://github.com/Consensys/web3signer/releases/tag/23.8.0).
We have some Web3Signer tests that require a compatible Java runtime to be installed on dev machines. This PR updates the `setup` documentation in the Lighthouse book, and also fixes a small typo.
* move length update outside of if let in LRU cache
* add comment and use hex for G1_POINT_AT_INFINITY
* remove some misleading comments from `ssz_snappy`
* make sure we can't overflow on blobs by range requests with large counts
* downgrade gossip verification internal availability check error
* change blob rpc responses from BlockingFnWithManualSendOnIdle to BlockingFn
* remove unnecessary collect in blobs by range response
* add a comment to blobs by range response start slot logic
* typo persist_data_availabilty_checker -> persist_data_availability_checker
* unify cheap_state_advance_to_obtain_committees
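On the blobs by range overflow item above: a minimal sketch of the general technique, assuming the end of the requested range is computed as `start_slot + count` on `u64` values. The function names are hypothetical and the actual fix may use a different guard.

```rust
/// Exclusive end slot of a by-range request, or `None` if `start_slot + count`
/// does not fit in a `u64`.
fn end_slot_checked(start_slot: u64, count: u64) -> Option<u64> {
    start_slot.checked_add(count)
}

/// Alternative: clamp to `u64::MAX` instead of rejecting the request.
fn end_slot_saturating(start_slot: u64, count: u64) -> u64 {
    start_slot.saturating_add(count)
}

fn main() {
    println!("{:?}", end_slot_checked(5, 3));
    println!("{:?}", end_slot_checked(u64::MAX - 1, 10));
    println!("{}", end_slot_saturating(u64::MAX - 1, 10) == u64::MAX);
}
```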
## Issue Addressed
#4402
## Proposed Changes
This PR adds QUIC support to Lighthouse. As this is not officially spec'd, it will only work for Lighthouse <-> Lighthouse connections. We attempt a QUIC connection (if the node advertises it) and if it fails we fall back to TCP.
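The dial order described above can be sketched as follows. This models only the decision logic; `dial`, `Transport`, and the closure parameters are hypothetical and are not the libp2p or Lighthouse API.

```rust
#[derive(Debug, PartialEq)]
enum Transport {
    Quic,
    Tcp,
}

/// Try QUIC first when the peer advertises it, otherwise (or on failure) use TCP.
/// The closures stand in for the real connection attempts.
fn dial<Q, T>(peer_advertises_quic: bool, try_quic: Q, try_tcp: T) -> Result<Transport, String>
where
    Q: FnOnce() -> Result<(), String>,
    T: FnOnce() -> Result<(), String>,
{
    // `&&` short-circuits, so QUIC is never attempted for peers that do not advertise it.
    if peer_advertises_quic && try_quic().is_ok() {
        return Ok(Transport::Quic);
    }
    try_tcp().map(|()| Transport::Tcp)
}

fn main() {
    let result = dial(true, || Err("quic handshake timed out".to_string()), || Ok(()));
    println!("{:?}", result);
}
```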
This should be a backwards compatible modification. We want to test this functionality on live networks to observe any improvements in bandwidth/latency.
NOTE: This also removes the websockets transport as I believe no one is really using it. It should be mentioned in our release however.
Co-authored-by: João Oliveira <hello@jxs.pt>
* increase the max topic subscriptions #4581
* make the max_subscription limitation based off constants / configuration
* format
* wording & add deneb topic array
* reduce max_subscriptions_per_request to 2x
* format
* update comment
* Update comments and small cleanup.
* Deserialize into `SsePayloadAttributesV3` for Deneb fork. Update `SignedBlockContents::blobs_cloned` to return blobs for `BlindedBlockAndBlobSidecars`.
* Improve code readability and error handling when converting blinded block into full block.