* add metrics layer
* add metrics
* simplify getting the target
* make clippy happy
* fix typos
* unify deps under workspace
* make import statement shorter, fix typos
* enable warn by default, mark flag as deprecated
* do not exit on error when initializing logging fails
* revert exit on error
* adjust bootnode logging
* add logging layer
* non blocking file writer
* non blocking file writer
* add tracing visitor
* use target as is by default
* make libp2p events register correctly
* adjust repilcated cli help
* refactor tracing layer
* linting
* filesize
* log gossipsub, dont filter by log level
* turn on debug logs by default, remove deprecation warning
* truncate file, add timestamp, add unit test
* suppress output (#5)
* use tracing appender
* cleanup
* Add a task to remove old log files and upgrade to warn level
* Add the time feature for tokio
* Udeps and fmt
---------
Co-authored-by: Diva M <divma@protonmail.com>
Co-authored-by: Divma <26765164+divagant-martian@users.noreply.github.com>
Co-authored-by: Age Manning <Age@AgeManning.com>
* add metrics layer
* add metrics
* simplify getting the target
* make clippy happy
* fix typos
* unify deps under workspace
* make import statement shorter, fix typos
* enable warn by default, mark flag as deprecated
* do not exit on error when initializing logging fails
* revert exit on error
* adjust bootnode logging
* use target as is by default
* make libp2p events register correctly
* adjust repilcated cli help
* turn on debug logs by default, remove deprecation warning
* suppress output (#5)
---------
Co-authored-by: Age Manning <Age@AgeManning.com>
* switch libp2p source to sigp fork
* Shift the connection closing inside RPC behaviour
* Tag specific commits
* Add slow peer scoring
* Fix test
* Use default yamux config
* Pin discv5 to our libp2p fork and cargo update
* Upgrade libp2p to enable yamux gains
* Add a comment specifying the branch being used
* cleanup build output from within container
(prevents CI warnings related to fs permissions)
* Remove revision tags add branches for testing, will revert back once we're happy
* Update to latest rust-libp2p version
* Pin forks
* Update cargo.lock
* Re-pin to panic-free rust
---------
Co-authored-by: Age Manning <Age@AgeManning.com>
Co-authored-by: Pawan Dhananjay <pawandhananjay@gmail.com>
Co-authored-by: antondlr <anton@delaruelle.net>
Co-authored-by: Michael Sproul <michael@sigmaprime.io>
* Initial attempt to upgrade hashbrown to latest to get rid of the crate warnings.
* Replace `.expect()` usage with use of `const`.
* Update ahash 0.7 as well
* Remove unsafe code.
* Update `lru` to 0.12 and fix release test errors.
* Set non-blocking socket
* Bump testcontainers to 0.15.
* Fix lint
---------
Co-authored-by: Michael Sproul <michael@sigmaprime.io>
* update libp2p and address compiler errors
* remove bandwidth logging from transport
* use libp2p registry
* make clippy happy
* use rust 1.73
* correct rpc keep alive
* remove comments and obsolte code
* remove libp2p prefix
* make clippy happy
* use quic under facade
* remove fast msg id
* bubble up close statements
* fix wrong comment
* Track time to early attester cache
* Add debug log for early attester cache
* Add mock-el command to lcli
* Revert beacon chain changes
* Tidy, fix compilation errors
* Add required flag
* Default to SYNCING response
* Hide flag
* multiple broadcast flags
* rewrite with single --broadcast option
* satisfy cargo fmt
* shorten sync-committee-messages
* fix a doc comment and a test
* use strum
* Add broadcast test to simulator
* bring --disable-run-on-all flag back with deprecation notice
## Issue Addressed
Closes#4817.
## Proposed Changes
- Fill in the linear block roots array between 0 and the slot of the first block (e.g. slots 0 and 1 on Holesky).
- Backport the `--freezer`, `--skip` and `--limit` options for `lighthouse db inspect` from tree-states. This allows us to easily view the database corruption of 4817 using `lighthouse db inspect --network holesky --freezer --column bbr --output values --limit 2`.
- Backport the `iter_column_from` change and `MemoryStore` overhaul from tree-states. These are required to enable `lighthouse db inspect`.
- Rework `freezer_upper_limit` to allow state lookups for slots below the `state_lower_limit`. Currently state lookups will fail until state reconstruction completes entirely.
There is a new regression test for the main bug, but no test for the `freezer_upper_limit` fix because we don't currently support running state reconstruction partially (see #3026). This will be fixed once we merge `tree-states`! In lieu of an automated test, I've tested manually on a Holesky node while it was reconstructing.
## Additional Info
Users who backfilled Holesky to slot 0 (e.g. using `--reconstruct-historic-states`) need to either:
- Re-sync from genesis.
- Re-sync using checkpoint sync and the changes from this PR.
Due to the recency of the Holesky genesis, writing a custom pass to fix up broken databases (which would require its own thorough testing) was deemed unnecessary. This is the primary reason for this PR being marked `backwards-incompat`.
This will create few conflicts with Deneb, which I've already resolved on `tree-states-deneb` and will be happy to backport to Deneb once this PR is merged to unstable.
## Issue Addressed
Makes lighthouse compliant with new kzg changes in https://github.com/ethereum/consensus-specs/releases/tag/v1.4.0-beta.3
## Proposed Changes
1. Adds new official trusted setup
2. Refactors kzg to match upstream changes in https://github.com/ethereum/c-kzg-4844/pull/377
3. Updates pre-generated `BlobBundle` to work with official trusted setup. ~~Using json here instead of ssz to account for different value of `MaxBlobCommitmentsPerBlock` in minimal and mainnet. By using json, we can just use one pre generated bundle for both minimal and mainnet. Size of 2 separate ssz bundles is approximately equal to one json bundle cc @jimmygchen~~
Dunno what I was doing, ssz works without any issues
4. Stores trusted_setup as just bytes in eth2_network_config so that we don't have kzg dependency in that lib and in lcli.
Co-authored-by: realbigsean <seananderson33@gmail.com>
Co-authored-by: realbigsean <seananderson33@GMAIL.com>
## Issue Addressed
updates libp2p to the latest version and uses the new `SwarmBuilder`. Superseeds https://github.com/sigp/lighthouse/pull/4695/
CC @mxinden I don't think we can use both `bandwidth_loggers` with the new syntax right?
## Proposed Changes
Add `compare_fields(as_iter)` as a field attribute to `compare_fields_derive`. This allows any iterable type to be compared in the same as a slice (by index).
This is forwards-compatible with tree-states types like `List` and `Vector` which can not be cast to slices.
## Issue Addressed
Addresses #4778, and potentially fixes the flaky deneb builder test `builder_works_post_deneb`.
The [deneb builder test](c5c84f1213/beacon_node/http_api/tests/tests.rs (L5371)) has been quite flaky on our CI (`release-tests`) since it was introduced. I'm guessing that it might be timing out on the builder `get_header` call (1 second), and therefore the local payload is used, while the test expects builder payload to be used.
On my machine the [`get_header` ](c5c84f1213/beacon_node/execution_layer/src/test_utils/mock_builder.rs (L367)) call takes about 550ms, which could easily go over 1s on slower environments (our windows CI runner is much slower than the ubuntu one).
I did a profile on the test and it showed that `blob_to_kzg_commiment` and `compute_kzg_proof` was taking a large chunk of time, so perhaps pre-generating the blobs could help stablise this test.
## Proposed Changes
Pre-generate blobs bundle for Mainnet and Minimal presets.
Before the change `get_header` took about **550ms**, and it's now reduced to **50-55ms** after the change. If timeout was indeed the cause of the flaky test, this fix should stablise it. This also brings the flaky `builder_works_post_deneb` test time from 50s to 10s. (8s if we only use a single blob)
* use workspace deps in kzg crate
* delete unused blobs dp path field
* full match on fork name in engine api get payload v3
* only accept v3 payloads on get payload v3 endpoint in mock el
* remove FIXMEs related to merge transition tests
* move static tx to test utils
* default max_per_epoch_activation_churn_limit to mainnet value
* remove unnecessary async
* remove comment
* use task executor in `blob_sidecars` endpoint
## Proposed Changes
- only use LH types to avoid build issues
- use warp instead of axum for the server to avoid importing the dep
## Additional Info
- wondering if we can move the `execution_layer/test_utils` to its own crate and import it as a dev dependency
- this would be made easier by separating out our engine API types into their own crate so we can use them in the test crate
- or maybe we can look into using reth types for the engine api if they are in their own crate
Co-authored-by: realbigsean <seananderson33@gmail.com>
* add processing and processed caching to the DA checker
* move processing cache out of critical cache
* get it compiling
* fix lints
* add docs to `AvailabilityView`
* some self review
* fix lints
* fix beacon chain tests
* cargo fmt
* make availability view easier to implement, start on testing
* move child component cache and finish test
* cargo fix
* cargo fix
* cargo fix
* fmt and lint
* make blob commitments not optional, rename some caches, add missing blobs struct
* Update beacon_node/beacon_chain/src/data_availability_checker/processing_cache.rs
Co-authored-by: ethDreamer <37123614+ethDreamer@users.noreply.github.com>
* marks review feedback and other general cleanup
* cargo fix
* improve availability view docs
* some renames
* some renames and docs
* fix should delay lookup logic
* get rid of some wrapper methods
* fix up single lookup changes
* add a couple docs
* add single blob merge method and improve process_... docs
* update some names
* lints
* fix merge
* remove blob indices from lookup creation log
* remove blob indices from lookup creation log
* delayed lookup logging improvement
* check fork choice before doing any blob processing
* remove unused dep
* Update beacon_node/beacon_chain/src/data_availability_checker/availability_view.rs
Co-authored-by: Michael Sproul <micsproul@gmail.com>
* Update beacon_node/beacon_chain/src/data_availability_checker/availability_view.rs
Co-authored-by: Michael Sproul <micsproul@gmail.com>
* Update beacon_node/beacon_chain/src/data_availability_checker/availability_view.rs
Co-authored-by: Michael Sproul <micsproul@gmail.com>
* Update beacon_node/beacon_chain/src/data_availability_checker/availability_view.rs
Co-authored-by: Michael Sproul <micsproul@gmail.com>
* Update beacon_node/network/src/sync/block_lookups/delayed_lookup.rs
Co-authored-by: Michael Sproul <micsproul@gmail.com>
* remove duplicate deps
* use gen range in random blobs geneartor
* rename processing cache fields
* require block root in rpc block construction and check block root consistency
* send peers as vec in single message
* spawn delayed lookup service from network beacon processor
* fix tests
---------
Co-authored-by: ethDreamer <37123614+ethDreamer@users.noreply.github.com>
Co-authored-by: Michael Sproul <micsproul@gmail.com>
Attempting to improve our CI speeds as its recently been a pain point.
Major changes:
- Use a github action to pull stable/nightly rust rather than building it each run
- Shift test suite to `nexttest` https://github.com/nextest-rs/nextest for CI
UPDATE:
So I've iterated on some changes, and although I think its still not optimal I think this is a good base to start from. Some extra things in this PR:
- Shifted where we pull rust from. We're now using this thing: https://github.com/moonrepo/setup-rust . It's got some interesting cache's built in, but was not seeing the gains that Jimmy managed to get. In either case tho, it can pull rust, cargofmt, clippy, cargo nexttest all in < 5s. So I think it's worthwhile.
- I've grouped a few of the check-like tests into a single test called `code-test`. Although we were using github runners in parallel which may be faster, it just seems wasteful. There were like 4-5 tests, where we would pull lighthouse, compile it, then run an action, like clippy, cargo-audit or fmt. I've grouped these into a single action, so we only compile lighthouse once, then in each step we run the checks. This avoids compiling lighthouse like 5 times.
- Ive made doppelganger tests run on our local machines to avoid pulling foundry, building and making lcli which are all now baked into the images.
- We have sccache and do not incremental compile lighthouse
Misc bonus things:
- Cargo update
- Fix web3 signer openssl keys which is required after a cargo update
- Use mock_instant in an LRU cache test to avoid non-deterministic test
- Remove race condition in building web3signer tests
There's still some things we could improve on. Such as downloading the EF tests every run and the web3-signer binary, but I've left these to be out of scope of this PR. I think the above are meaningful improvements.
Co-authored-by: Paul Hauner <paul@paulhauner.com>
Co-authored-by: realbigsean <seananderson33@gmail.com>
Co-authored-by: antondlr <anton@delaruelle.net>
## Issue Addressed
Synchronize dependencies and edition on the workspace `Cargo.toml`
## Proposed Changes
with https://github.com/rust-lang/cargo/issues/8415 merged it's now possible to synchronize details on the workspace `Cargo.toml` like the metadata and dependencies.
By only having dependencies that are shared between multiple crates aligned on the workspace `Cargo.toml` it's easier to not miss duplicate versions of the same dependency and therefore ease on the compile times.
## Additional Info
this PR also removes the no longer required direct dependency of the `serde_derive` crate.
should be reviewed after https://github.com/sigp/lighthouse/pull/4639 get's merged.
closes https://github.com/sigp/lighthouse/issues/4651
Co-authored-by: Michael Sproul <michael@sigmaprime.io>
Co-authored-by: Michael Sproul <micsproul@gmail.com>
## Issue Addressed
NA
## Proposed Changes
Bumps `quinn-proto` to address a QUIC-related vulnerability: https://rustsec.org/advisories/RUSTSEC-2023-0063
Fixes a `cargo audit` failure.
## Additional Info
NA
## Issue Addressed
#4738
## Proposed Changes
See the above issue for details. Went with option #2 to use the async reqwest client in `Eth2NetworkConfig` and propagate the async-ness.
## Issue Addressed
CI is plagued by `AddrAlreadyInUse` failures, which are caused by race conditions in allocating free ports.
This PR removes all usages of the `unused_port` crate for Lighthouse's HTTP API, in favour of passing `:0` as the listen address. As a result, the listen address isn't known ahead of time and must be read from the listening socket after it binds. This requires tying some self-referential knots, which is a little disruptive, but hopefully doesn't clash too much with Deneb 🤞
There are still a few usages of `unused_tcp4_port` left in cases where we start external processes, like the `watch` Postgres DB, Anvil, Geth, Nethermind, etc. Removing these usages is non-trivial because it's hard to read the port back from an external process after starting it with `--port 0`. We might be able to do something on Linux where we read from `/proc/`, but I'll leave that for future work.
* Implement `SignedBlockContent` decoding and fixed bug in `SignedBlockContent::new`
* Update Cargo.lock file
* Use `make_genesis_spec` to simplify test setup.
* Fix syntax errors.
## Issue Addressed
#4402
## Proposed Changes
This PR adds QUIC support to Lighthouse. As this is not officially spec'd this will only work between lighthouse <-> lighthouse connections. We attempt a QUIC connection (if the node advertises it) and if it fails we fallback to TCP.
This should be a backwards compatible modification. We want to test this functionality on live networks to observe any improvements in bandwidth/latency.
NOTE: This also removes the websockets transport as I believe no one is really using it. It should be mentioned in our release however.
Co-authored-by: João Oliveira <hello@jxs.pt>
## Proposed Changes
New release to replace the cancelled v4.4.0 release.
This release includes the bugfix #4687 which avoids a deadlock that was present in v4.4.0.
## Additional Info
Awaiting testing over the weekend this will be merged Monday September 4th.
## Issue Addressed
Fix a bug in the storage of the linear block roots array in the freezer DB. Previously this array was always written as part of state storage (or block backfill). With state pruning enabled by #4610, these states were no longer being written and as a result neither were the block roots.
The impact is quite low, we would just log an error when trying to forwards-iterate the block roots, which for validating nodes only happens when they try to look up blocks for peers:
> Aug 25 03:42:36.980 ERRO Missing chunk in forwards iterator chunk index: 49726, service: freezer_db
Any node checkpoint synced off `unstable` is affected and has a corrupt database. If you see the log above, you need to re-sync with the fix. Nodes that haven't checkpoint synced recently should _not_ be corrupted, even if they ran the buggy version.
## Proposed Changes
- Use a `ChunkWriter` to write the block roots when states are not being stored.
- Tweak the usage of `get_latest_restore_point` so that it doesn't return a nonsense value when state pruning is enabled.
- Tweak the guarantee on the block roots array so that block roots are assumed available up to the split slot (exclusive). This is a bit nicer than relying on anything to do with the latest restore point, which is a nonsensical concept when there aren't any restore points.
## Additional Info
I'm looking forward to deleting the chunked vector code for good when we merge tree-states 😁
## Issue Addressed
NA
## Proposed Changes
Add the Holesky network config as per 36e4ff2d51/custom_config_data.
Since the genesis state is ~190MB, I've opted to *not* include it in the binary and instead download it at runtime (see #4564 for context). To download this file we have:
- A hard-coded URL for a SigP-hosted S3 bucket with the Holesky genesis state. Assuming this download works correctly, users will be none the wiser that the state wasn't included in the binary (apart from some additional logs)
- If the user provides a `--checkpoint-sync-url` flag, then LH will download the genesis state from that server rather than our S3 bucket.
- If the user provides a `--genesis-state-url` flag, then LH will download the genesis state from that server regardless of the S3 bucket or `--checkpoint-sync-url` flag.
- Whenever a genesis state is downloaded it is checked against a checksum baked into the binary.
- A genesis state will never be downloaded if it's already included in the binary.
- There is a `--genesis-state-url-timeout` flag to tweak the timeout for downloading the genesis state file.
## Log Output
Example of log output when a state is downloaded:
```bash
Aug 23 05:40:13.424 INFO Logging to file path: "/Users/paul/.lighthouse/holesky/beacon/logs/beacon.log"
Aug 23 05:40:13.425 INFO Lighthouse started version: Lighthouse/v4.3.0-bd9931f+
Aug 23 05:40:13.425 INFO Configured for network name: holesky
Aug 23 05:40:13.426 INFO Data directory initialised datadir: /Users/paul/.lighthouse/holesky
Aug 23 05:40:13.427 INFO Deposit contract address: 0x4242424242424242424242424242424242424242, deploy_block: 0
Aug 23 05:40:13.427 INFO Downloading genesis state info: this may take some time on testnets with large validator counts, timeout: 60s, server: https://sigp-public-genesis-states.s3.ap-southeast-2.amazonaws.com/
Aug 23 05:40:29.895 INFO Starting from known genesis state service: beacon
```
Example of log output when there are no URLs specified:
```
Aug 23 06:29:51.645 INFO Logging to file path: "/Users/paul/.lighthouse/goerli/beacon/logs/beacon.log"
Aug 23 06:29:51.646 INFO Lighthouse started version: Lighthouse/v4.3.0-666a39c+
Aug 23 06:29:51.646 INFO Configured for network name: goerli
Aug 23 06:29:51.647 INFO Data directory initialised datadir: /Users/paul/.lighthouse/goerli
Aug 23 06:29:51.647 INFO Deposit contract address: 0xff50ed3d0ec03ac01d4c79aad74928bff48a7b2b, deploy_block: 4367322
The genesis state is not present in the binary and there are no known download URLs. Please use --checkpoint-sync-url or --genesis-state-url.
```
## Additional Info
I tested the `--genesis-state-url` flag with all 9 Goerli checkpoint sync servers on https://eth-clients.github.io/checkpoint-sync-endpoints/ and they all worked 🎉
My IDE eagerly formatted some `Cargo.toml`. I've disabled it but I don't see the value in spending time reverting the changes that are already there.
I also added the `GenesisStateBytes` enum to avoid an unnecessary clone on the genesis state bytes baked into the binary. This is not a huge deal on Mainnet, but will become more relevant when testing with big genesis states.
When we do a fresh checkpoint sync we're downloading the genesis state to check the `genesis_validators_root` against the finalised state we receive. This is not *entirely* pointless, since we verify the checksum when we download the genesis state so we are actually guaranteeing that the finalised state is on the same network. There might be a smarter/less-download-y way to go about this, but I've run out of cycles to figure that out. Perhaps we can grab it in the next release?
## Issue Addressed
updates underlying dependencies and removes the ignored `RUSTSEC`'s for `cargo audit`.
Also switches `procinfo` to `procfs` on `eth2` to remove the `nom` warning, `procinfo` is unmaintained see [here](https://github.com/danburkert/procinfo-rs/issues/46).
## Issue Addressed
Fixes a bug in the handling of `--beacon-process-max-workers` which caused it to have no effect.
## Proposed Changes
For this PR I channeled @ethDreamer and saw deep into the faulty CLI config -- this bug is almost identical to the one Mark found and fixed in #4622.
* remove protoc and token from network tests github action
* delete unused beacon chain methods
* downgrade writing blobs to store log
* reduce diff in block import logic
* remove some todo's and deneb built in network
* remove unnecessary error, actually use some added metrics
* remove some metrics, fix missing components on publish funcitonality
* fix status tests
* rename sidecar by root to blobs by root
* clean up some metrics
* remove unnecessary feature gate from attestation subnet tests, clean up blobs by range response code
* pawan's suggestion in `protocol_info`, peer score in matching up batch sync block and blobs
* fix range tests for deneb
* pub block and blob db cache behind the same mutex
* remove unused errs and an empty file
* move sidecar trait to new file
* move types from payload to eth2 crate
* update comment and add flag value name
* make function private again, remove allow unused
* use reth rlp for tx decoding
* fix compile after merge
* rename kzg commitments
* cargo fmt
* remove unused dep
* Update beacon_node/execution_layer/src/lib.rs
Co-authored-by: Pawan Dhananjay <pawandhananjay@gmail.com>
* Update beacon_node/beacon_processor/src/lib.rs
Co-authored-by: Pawan Dhananjay <pawandhananjay@gmail.com>
* pawan's suggestiong for vec capacity
* cargo fmt
* Revert "use reth rlp for tx decoding"
This reverts commit 5181837d81c66dcca4c960a85989ac30c7f806e2.
* remove reth rlp
---------
Co-authored-by: Pawan Dhananjay <pawandhananjay@gmail.com>
* Update mock builder, mev-rs dependencies, eth2 lib to support deneb builder flow
* Replace `sharingForkTime` with `cancunTime`
* Patch `ethereum-consensus` to include some deneb-devnet-8 changes
* Add deneb builder test and fix block contents deserialization
* Fix builder bid encoding issue and passing deneb builder test \o/
* Fix test compilation
* Revert `cancunTime` change in genesis to pass doppelganger tests
## Proposed Changes
This PR updates `blst` to 0.3.11, which gives us _runtime detection of CPU features_ 🎉
Although [performance benchmarks](https://gist.github.com/michaelsproul/f759fa28dfa4003962507db34b439d6c) don't show a substantial detriment to running the `portable` build vs `modern`, in order to take things slowly I propose the following roll-out strategy:
- Keep both `modern` and `portable` builds for releases/Docker images.
- Run the `portable` build on half of SigP's infrastructure to monitor for performance deficits.
- Listen out for user issues with the `portable` builds (e.g. SIGILLs from misdetected hardware).
- Make the `portable` build the default and remove the `modern` build from our release binaries & Docker images.