Attempting to improve our CI speeds as its recently been a pain point.
Major changes:
- Use a github action to pull stable/nightly rust rather than building it each run
- Shift test suite to `nexttest` https://github.com/nextest-rs/nextest for CI
UPDATE:
So I've iterated on some changes, and although I think its still not optimal I think this is a good base to start from. Some extra things in this PR:
- Shifted where we pull rust from. We're now using this thing: https://github.com/moonrepo/setup-rust . It's got some interesting cache's built in, but was not seeing the gains that Jimmy managed to get. In either case tho, it can pull rust, cargofmt, clippy, cargo nexttest all in < 5s. So I think it's worthwhile.
- I've grouped a few of the check-like tests into a single test called `code-test`. Although we were using github runners in parallel which may be faster, it just seems wasteful. There were like 4-5 tests, where we would pull lighthouse, compile it, then run an action, like clippy, cargo-audit or fmt. I've grouped these into a single action, so we only compile lighthouse once, then in each step we run the checks. This avoids compiling lighthouse like 5 times.
- Ive made doppelganger tests run on our local machines to avoid pulling foundry, building and making lcli which are all now baked into the images.
- We have sccache and do not incremental compile lighthouse
Misc bonus things:
- Cargo update
- Fix web3 signer openssl keys which is required after a cargo update
- Use mock_instant in an LRU cache test to avoid non-deterministic test
- Remove race condition in building web3signer tests
There's still some things we could improve on. Such as downloading the EF tests every run and the web3-signer binary, but I've left these to be out of scope of this PR. I think the above are meaningful improvements.
Co-authored-by: Paul Hauner <paul@paulhauner.com>
Co-authored-by: realbigsean <seananderson33@gmail.com>
Co-authored-by: antondlr <anton@delaruelle.net>
## Proposed Changes
Another `tree-states` motivated PR, this adds `jemalloc` as the default allocator, with an option to use the system allocator by compiling with `FEATURES="" make`.
- [x] Metrics
- [x] Test on Windows
- [x] Test on macOS
- [x] Test with `musl`
- [x] Metrics dashboard on `lighthouse-metrics` (https://github.com/sigp/lighthouse-metrics/pull/37)
Co-authored-by: Michael Sproul <micsproul@gmail.com>
I've needed to do this work in order to do some episub testing.
This version of libp2p has not yet been released, so this is left as a draft for when we wish to update.
Co-authored-by: Diva M <divma@protonmail.com>
## Issue Addressed
Closes#3709
## Proposed Changes
Add the job `compile-with-beta-compiler` to `test-suite`. This job has the following steps:
1. Use `actions/checkout@v3`. (Needed to run make in a later step.)
2. Install the dependencies listed in [build from source guide](https://lighthouse-book.sigmaprime.io/installation-source.html).
3. Change the compiler to the current beta version with `rustup override`.
4. Run `make`.
## Issue Addressed
I noticed that [this build](https://github.com/sigp/lighthouse/actions/runs/3269950873/jobs/5378036501) wasn't marked failed by Bors when the `syncing-simulator-ubuntu` job failed. This is because that job is absent from the `bors.toml` config.
## Proposed Changes
Add missing jobs to Bors config so that they are required:
- `syncing-simulator-ubuntu`
- `slasher-tests`
- `disallowed-from-async-lint`
The `disallowed-from-async-lint` was previously allowed to fail because it was considered beta, but I think it's stable enough now we may as well require it.
## Issue Addressed
N/A
## Proposed Changes
Make simulator merge compatible. Adds a `--post_merge` flag to the eth1 simulator that enables a ttd and simulates the merge transition. Uses the `MockServer` in the execution layer test utils to simulate a dummy execution node.
Adds the merge transition simulation to CI.
## Proposed Changes
Set a minimum supported Rust version (MSRV) in the `Cargo.toml` for the Lighthouse binary so that attempts to compile it with an outdated compiler fail immediately with a clear error.
To ensure that the codebase builds with the MSRV I've also added a Github actions job that runs `cargo check` using the MSRV extracted from `Cargo.toml`. This will force us to keep it up to date.
I opted to use `cargo check` rather than Clippy because Clippy frequently introduces new lints that we adopt, so our MSRV for Clippy is usually the most recent Rust version, while the MSRV for building Lighthouse is older.
## Issue Addressed
As discussed on last-night's consensus call, the testnets next week will target the [Kiln Spec v2](https://hackmd.io/@n0ble/kiln-spec).
Presently, we support Kiln V1. V2 is backwards compatible, except for renaming `random` to `prev_randao` in:
- https://github.com/ethereum/execution-apis/pull/180
- https://github.com/ethereum/consensus-specs/pull/2835
With this PR we'll no longer be compatible with the existing Kintsugi and Kiln testnets, however we'll be ready for the testnets next week. I raised this breaking change in the call last night, we are all keen to move forward and break things.
We now target the [`merge-kiln-v2`](https://github.com/MariusVanDerWijden/go-ethereum/tree/merge-kiln-v2) branch for interop with Geth. This required adding the `--http.aauthport` to the tester to avoid a port conflict at startup.
### Changes to exec integration tests
There's some change in the `merge-kiln-v2` version of Geth that means it can't compile on a vanilla Github runner. Bumping the `go` version on the runner solved this issue.
Whilst addressing this, I refactored the `testing/execution_integration` crate to be a *binary* rather than a *library* with tests. This means that we don't need to run the `build.rs` and build Geth whenever someone runs `make lint` or `make test-release`. This is nice for everyday users, but it's also nice for CI so that we can have a specific runner for these tests and we don't need to ensure *all* runners support everything required to build all execution clients.
## More Info
- [x] ~~EF tests are failing since the rename has broken some tests that reference the old field name. I have been told there will be new tests released in the coming days (25/02/22 or 26/02/22).~~
## Issue Addressed
Timeouts due to Windows builds running for 2h 20m.
## Proposed Changes
* Increase Bors timeout to 3h
* Refine the target branch check so that it will pass when we make PRs to feature branches. This is just an extra change I've been meaning to sneak in for a while.
## Additional Info
* I think it would also be cool to try caching for CI again, but that's a separate issue and we'll still need the long timeout on a cache miss.
Attempt to prevent accidental merges to `stable` due to GitHub's default behaviour of opening PRs against it.
I've intentionally opened this PR against `stable` to test the functionality ;)
## Issue Addressed
N/A
## Proposed Changes
Remove the MacOs tests. They routinely fail, causing bors to retry and slowing down the whole merge process.
## Additional Info
N/A
Co-authored-by: Michael Sproul <michael@sigmaprime.io>