lighthouse

Author	SHA1	Message	Date
Paul Hauner	be4e261e74	Use async code when interacting with EL (#3244 ) ## Overview This rather extensive PR achieves two primary goals: 1. Uses the finalized/justified checkpoints of fork choice (FC), rather than that of the head state. 2. Refactors fork choice, block production and block processing to `async` functions. Additionally, it achieves: - Concurrent forkchoice updates to the EL and cache pruning after a new head is selected. - Concurrent "block packing" (attestations, etc) and execution payload retrieval during block production. - Concurrent per-block-processing and execution payload verification during block processing. - The `Arc`-ification of `SignedBeaconBlock` during block processing (it's never mutated, so why not?): - I had to do this to deal with sending blocks into spawned tasks. - Previously we were cloning the beacon block at least 2 times during each block processing, these clones are either removed or turned into cheaper `Arc` clones. - We were also `Box`-ing and un-`Box`-ing beacon blocks as they moved throughout the networking crate. This is not a big deal, but it's nice to avoid shifting things between the stack and heap. - Avoids cloning all the blocks in every chain segment during sync. - It also has the potential to clean up our code where we need to pass an owned block around so we can send it back in the case of an error (I didn't do much of this, my PR is already big enough 😅) - The `BeaconChain::HeadSafetyStatus` struct was removed. It was an old relic from prior merge specs. For motivation for this change, see https://github.com/sigp/lighthouse/pull/3244#issuecomment-1160963273 ## Changes to `canonical_head` and `fork_choice` Previously, the `BeaconChain` had two separate fields: ``` canonical_head: RwLock<Snapshot>, fork_choice: RwLock<BeaconForkChoice> ``` Now, we have grouped these values under a single struct: ``` canonical_head: CanonicalHead { cached_head: RwLock<Arc<Snapshot>>, fork_choice: RwLock<BeaconForkChoice> } ``` Apart from ergonomics, the only actual change here is wrapping the canonical head snapshot in an `Arc`. This means that we no longer need to hold the `cached_head` (`canonical_head`, in old terms) lock when we want to pull some values from it. This was done to avoid deadlock risks by preventing functions from acquiring (and holding) the `cached_head` and `fork_choice` locks simultaneously. ## Breaking Changes ### The `state` (root) field in the `finalized_checkpoint` SSE event Consider the scenario where epoch `n` is just finalized, but `start_slot(n)` is skipped. There are two state roots we might in the `finalized_checkpoint` SSE event: 1. The state root of the finalized block, which is `get_block(finalized_checkpoint.root).state_root`. 4. The state root at slot of `start_slot(n)`, which would be the state from (1), but "skipped forward" through any skip slots. Previously, Lighthouse would choose (2). However, we can see that when [Teku generates that event](`de2b2801c8/data/beaconrestapi/src/main/java/tech/pegasys/teku/beaconrestapi/handlers/v1/events/EventSubscriptionManager.java (L171-L182)`) it uses [`getStateRootFromBlockRoot`](`de2b2801c8/data/provider/src/main/java/tech/pegasys/teku/api/ChainDataProvider.java (L336-L341)`) which uses (1). I have switched Lighthouse from (2) to (1). I think it's a somewhat arbitrary choice between the two, where (1) is easier to compute and is consistent with Teku. ## Notes for Reviewers I've renamed `BeaconChain::fork_choice` to `BeaconChain::recompute_head`. Doing this helped ensure I broke all previous uses of fork choice and I also find it more descriptive. It describes an action and can't be confused with trying to get a reference to the `ForkChoice` struct. I've changed the ordering of SSE events when a block is received. It used to be `[block, finalized, head]` and now it's `[block, head, finalized]`. It was easier this way and I don't think we were making any promises about SSE event ordering so it's not "breaking". I've made it so fork choice will run when it's first constructed. I did this because I wanted to have a cached version of the last call to `get_head`. Ensuring `get_head` has been run at least once means that the cached values doesn't need to wrapped in an `Option`. This was fairly simple, it just involved passing a `slot` to the constructor so it knows when it's being run. When loading a fork choice from the store and a slot clock isn't handy I've just used the `slot` that was saved in the `fork_choice_store`. That seems like it would be a faithful representation of the slot when we saved it. I added the `genesis_time: u64` to the `BeaconChain`. It's small, constant and nice to have around. Since we're using FC for the fin/just checkpoints, we no longer get the `0x00..00` roots at genesis. You can see I had to remove a work-around in `ef-tests` here: b56be3bc2. I can't find any reason why this would be an issue, if anything I think it'll be better since the genesis-alias has caught us out a few times (0x00..00 isn't actually a real root). Edit: I did find a case where the `network` expected the 0x00..00 alias and patched it here: 3f26ac3e2. You'll notice a lot of changes in tests. Generally, tests should be functionally equivalent. Here are the things creating the most diff-noise in tests: - Changing tests to be `tokio::async` tests. - Adding `.await` to fork choice, block processing and block production functions. - Refactor of the `canonical_head` "API" provided by the `BeaconChain`. E.g., `chain.canonical_head.cached_head()` instead of `chain.canonical_head.read()`. - Wrapping `SignedBeaconBlock` in an `Arc`. - In the `beacon_chain/tests/block_verification`, we can't use the `lazy_static` `CHAIN_SEGMENT` variable anymore since it's generated with an async function. We just generate it in each test, not so efficient but hopefully insignificant. I had to disable `rayon` concurrent tests in the `fork_choice` tests. This is because the use of `rayon` and `block_on` was causing a panic. Co-authored-by: Mac L <mjladson@pm.me>	2022-07-03 05:36:50 +00:00
Paul Hauner	27e83b888c	Retrospective invalidation of exec. payloads for opt. sync (#2837 ) ## Issue Addressed NA ## Proposed Changes Adds the functionality to allow blocks to be validated/invalidated after their import as per the [optimistic sync spec](https://github.com/ethereum/consensus-specs/blob/dev/sync/optimistic.md#how-to-optimistically-import-blocks). This means: - Updating `ProtoArray` to allow flipping the `execution_status` of ancestors/descendants based on payload validity updates. - Creating separation between `execution_layer` and the `beacon_chain` by creating a `PayloadStatus` struct. - Refactoring how the `execution_layer` selects a `PayloadStatus` from the multiple statuses returned from multiple EEs. - Adding testing framework for optimistic imports. - Add `ExecutionBlockHash(Hash256)` new-type struct to avoid confusion between beacon block roots and execution payload hashes. - Add `merge` to [`FORKS`](`c3a793fd73/Makefile (L17)`) in the `Makefile` to ensure we test the beacon chain with merge settings. - Fix some tests here that were failing due to a missing execution layer. ## TODO - [ ] Balance tests Co-authored-by: Mark Mackey <mark@sigmaprime.io>	2022-02-28 22:07:48 +00:00
Paul Hauner	801f6f7425	Disable autotests for beacon_chain (#2658 )	2021-12-02 14:26:52 +11:00
Paul Hauner	e2d09bb8ac	Add `BeaconChainHarness::builder` (#2707 ) ## Issue Addressed NA ## Proposed Changes This PR is near-identical to https://github.com/sigp/lighthouse/pull/2652, however it is to be merged into `unstable` instead of `merge-f2f`. Please see that PR for reasoning. I'm making this duplicate PR to merge to `unstable` in an effort to shrink the diff between `unstable` and `merge-f2f` by doing smaller, lead-up PRs. ## Additional Info NA	2021-10-14 02:58:10 +00:00
Michael Sproul	b4689e20c6	Altair consensus changes and refactors (#2279 ) ## Proposed Changes Implement the consensus changes necessary for the upcoming Altair hard fork. ## Additional Info This is quite a heavy refactor, with pivotal types like the `BeaconState` and `BeaconBlock` changing from structs to enums. This ripples through the whole codebase with field accesses changing to methods, e.g. `state.slot` => `state.slot()`. Co-authored-by: realbigsean <seananderson33@gmail.com>	2021-07-09 06:15:32 +00:00
Michael Sproul	363f15f362	Use the database to persist the pubkey cache (#2234 ) ## Issue Addressed Closes #1787 ## Proposed Changes * Abstract the `ValidatorPubkeyCache` over a "backing" which is either a file (legacy), or the database. * Implement a migration from schema v2 to schema v3, whereby the contents of the cache file are copied to the DB, and then the file is deleted. The next release to include this change must be a minor version bump, and we will need to warn users of the inability to downgrade (this is our first DB schema change since mainnet genesis). * Move the schema migration code from the `store` crate into the `beacon_chain` crate so that it can access the datadir and the `ValidatorPubkeyCache`, etc. It gets injected back into the `store` via a closure (similar to what we do in fork choice).	2021-03-04 01:25:12 +00:00
Michael Sproul	703c33bdc7	Fix head tracker concurrency bugs (#1771 ) ## Issue Addressed Closes #1557 ## Proposed Changes Modify the pruning algorithm so that it mutates the head-tracker _before_ committing the database transaction to disk, and _only if_ all the heads to be removed are still present in the head-tracker (i.e. no concurrent mutations). In the process of writing and testing this I also had to make a few other changes: * Use internal mutability for all `BeaconChainHarness` functions (namely the RNG and the graffiti), in order to enable parallel calls (see testing section below). * Disable logging in harness tests unless the `test_logger` feature is turned on And chose to make some clean-ups: * Delete the `NullMigrator` * Remove type-based configuration for the migrator in favour of runtime config (simpler, less duplicated code) * Use the non-blocking migrator unless the blocking migrator is required. In the store tests we need the blocking migrator because some tests make asserts about the state of the DB after the migration has run. * Rename `validators_keypairs` -> `validator_keypairs` in the `BeaconChainHarness` ## Testing To confirm that the fix worked, I wrote a test using [Hiatus](https://crates.io/crates/hiatus), which can be found here: https://github.com/michaelsproul/lighthouse/tree/hiatus-issue-1557 That test can't be merged because it inserts random breakpoints everywhere, but if you check out that branch you can run the test with: ``` $ cd beacon_node/beacon_chain $ cargo test --release --test parallel_tests --features test_logger ``` It should pass, and the log output should show: ``` WARN Pruning deferred because of a concurrent mutation, message: this is expected only very rarely! ``` ## Additional Info This is a backwards-compatible change with no impact on consensus.	2020-10-19 05:58:39 +00:00
Adam Szkoda	d9f4819fe0	Alternative (to BeaconChainHarness) BeaconChain testing API (#1380 ) The PR: * Adds the ability to generate a crucial test scenario that isn't possible with `BeaconChainHarness` (i.e. two blocks occupying the same slot; previously forks necessitated skipping slots): ![image](https://user-images.githubusercontent.com/165678/88195404-4bce3580-cc40-11ea-8c08-b48d2e1d5959.png) * New testing API: Instead of repeatedly calling add_block(), you generate a sorted `Vec<Slot>` and leave it up to the framework to generate blocks at those slots. * Jumping backwards to an earlier epoch is a hard error, so that tests necessarily generate blocks in a epoch-by-epoch manner. * Configures the test logger so that output is printed on the console in case a test fails. The logger also plays well with `--nocapture`, contrary to the existing testing framework * Rewrites existing fork pruning tests to use the new API * Adds a tests that triggers finalization at a non epoch boundary slot * Renamed `BeaconChainYoke` to `BeaconChainTestingRig` because the former has been too confusing * Fixed multiple tests (e.g. `block_production_different_shuffling_long`, `delete_blocks_and_states`, `shuffling_compatible_simple_fork`) that relied on a weird (and accidental) feature of the old `BeaconChainHarness` that attestations aren't produced for epochs earlier than the current one, thus masking potential bugs in test cases. Co-authored-by: Michael Sproul <michael@sigmaprime.io>	2020-08-26 09:24:55 +00:00
Paul Hauner	b73c497be2	Support multiple BLS implementations (#1335 ) ## Issue Addressed NA ## Proposed Changes - Refactor the `bls` crate to support multiple BLS "backends" (e.g., milagro, blst, etc). - Removes some duplicate, unused code in `common/rest_types/src/validator.rs`. - Removes the old "upgrade legacy keypairs" functionality (these were unencrypted keys that haven't been supported for a few testnets, no one should be using them anymore). ## Additional Info Most of the files changed are just inconsequential changes to function names. ## TODO - [x] Optimization levels - [x] Infinity point: https://github.com/supranational/blst/issues/11 - [x] Ensure milagro and blst are tested via CI - [x] What to do with unsafe code? - [x] Test infinity point in signature sets	2020-07-25 02:03:18 +00:00
Michael Sproul	bcb6afa0aa	Process exits and slashings off the network (#1253 ) * Process exits and slashings off the network * Fix rest_api tests * Add op verification tests * Add tests for pruning of slashings in the op pool * Address Paul's review comments	2020-06-18 21:06:34 +10:00

10 Commits