57dfcfd83a
## Issue Addressed
Closes #3963 (hopefully)
## Proposed Changes
Compute attestation selection proofs gradually each slot rather than in a single `join_all` at the start of each epoch. On a machine with 5k validators this replaces 5k tasks signing 5k proofs with 1 task that signs 5k/32 ~= 160 proofs each slot.
Based on testing with Goerli validators this seems to reduce the average time to produce a signature by preventing Tokio and the OS from falling over each other trying to run hundreds of threads. My testing so far has been with local keystores, which run on a dynamic pool of up to 512 OS threads because they use [`spawn_blocking`](https://docs.rs/tokio/1.11.0/tokio/task/fn.spawn_blocking.html) (and we haven't changed the default).
An earlier version of this PR hyper-optimised the time-per-signature metric to the detriment of the entire system's performance (see the reverted commits). The current PR is conservative in that it avoids touching the attestation service at all. I think there's more optimising to do here, but we can come back for that in a future PR rather than expanding the scope of this one.
The new algorithm for attestation selection proofs is:
- We sign a small batch of selection proofs each slot, for slots up to 8 slots in the future. On average we'll sign one slot's worth of proofs per slot, with an 8 slot lookahead.
- The batch is signed halfway through the slot when there is unlikely to be contention for signature production (blocks are <4s, attestations are ~4-6 seconds, aggregates are 8s+).
## Performance Data
_See first comment for updated graphs_.
Graph of median signing times before this PR:
![signing_times_median](https://user-images.githubusercontent.com/4452260/221495627-3ab3c105-319f-406e-b99d-b5913e0ded9c.png)
Graph of update attesters metric (includes selection proof signing) before this PR:
![update_attesters_store](https://user-images.githubusercontent.com/4452260/221497057-01ba40e4-8148-45f6-9e21-36a9567a631a.png)
Median signing time after this PR (prototype from 12:00, updated version from 13:30):
![signing_times_median_updated](https://user-images.githubusercontent.com/4452260/221771578-47a040cc-b832-482f-9a1a-d1bd9854e00e.png)
99th percentile on signing times (bounded attestation signing from 16:55, now removed):
![signing_times_99pc](https://user-images.githubusercontent.com/4452260/221772055-e64081a8-2220-45ba-ba6d-9d7e344a5bde.png)
Attester map update timing after this PR:
![update_attesters_store_updated](https://user-images.githubusercontent.com/4452260/221771757-c8558a48-7f4e-4bb5-9929-dee177a66c1e.png)
Selection proof signings per second change:
![signing_attempts](https://user-images.githubusercontent.com/4452260/221771855-64f5da22-1655-478d-926b-810be8a3650c.png)
## Link to late blocks
I believe this is related to the slow block signings because logs from Stakely in #3963 show these two logs almost 5 seconds apart:
> Feb 23 18:56:23.978 INFO Received unsigned block, slot: 5862880, service: block, module: validator_client::block_service:393
> Feb 23 18:56:28.552 INFO Publishing signed block, slot: 5862880, service: block, module: validator_client::block_service:416
The only thing that happens between those two logs is the signing of the block:
|
||
---|---|---|
.cargo | ||
.github | ||
account_manager | ||
beacon_node | ||
book | ||
boot_node | ||
common | ||
consensus | ||
crypto | ||
database_manager | ||
lcli | ||
lighthouse | ||
scripts | ||
slasher | ||
testing | ||
validator_client | ||
.dockerignore | ||
.editorconfig | ||
.gitignore | ||
.gitmodules | ||
bors.toml | ||
Cargo.lock | ||
Cargo.toml | ||
CONTRIBUTING.md | ||
Cross.toml | ||
Dockerfile | ||
Dockerfile.cross | ||
LICENSE | ||
Makefile | ||
README.md | ||
SECURITY.md |
Lighthouse: Ethereum consensus client
An open-source Ethereum consensus client, written in Rust and maintained by Sigma Prime.
Overview
Lighthouse is:
- Ready for use on Ethereum consensus mainnet.
- Fully open-source, licensed under Apache 2.0.
- Security-focused. Fuzzing techniques have been continuously applied and several external security reviews have been performed.
- Built in Rust, a modern language providing unique safety guarantees and excellent performance (comparable to C++).
- Funded by various organisations, including Sigma Prime, the Ethereum Foundation, ConsenSys, the Decentralization Foundation and private individuals.
- Actively involved in the specification and security analysis of the Ethereum proof-of-stake consensus specification.
Staking Deposit Contract
The Lighthouse team acknowledges
0x00000000219ab540356cBB839Cbe05303d7705Fa
as the canonical staking deposit contract address.
Documentation
The Lighthouse Book contains information for users and developers.
The Lighthouse team maintains a blog at lighthouse-blog.sigmaprime.io which contains periodical progress updates, roadmap insights and interesting findings.
Branches
Lighthouse maintains two permanent branches:
stable
: Always points to the latest stable release.- This is ideal for most users.
unstable
: Used for development, contains the latest PRs.- Developers should base their PRs on this branch.
Contributing
Lighthouse welcomes contributors.
If you are looking to contribute, please head to the Contributing section of the Lighthouse book.
Contact
The best place for discussion is the Lighthouse Discord server.
Sign up to the Lighthouse Development Updates mailing list for email notifications about releases, network status and other important information.
Encrypt sensitive messages using our PGP key.
Donations
Lighthouse is an open-source project and a public good. Funding public goods is hard and we're grateful for the donations we receive from the community via:
- Gitcoin Grants.
- Ethereum address:
0x25c4a76E7d118705e7Ea2e9b7d8C59930d8aCD3b
(donation.sigmaprime.eth).