Commit Graph

104 Commits

Author SHA1 Message Date
Eitan Seri-Levi
521432129d Support SSZ request body for POST /beacon/blinded_blocks endpoints (v1 & v2) (#4504)
## Issue Addressed

#4262 

## Proposed Changes

add SSZ support in request body for POST /beacon/blinded_blocks endpoints (v1 & v2)

## Additional Info
2023-08-07 22:53:04 +00:00
Armağan Yıldırak
3397612160 Shift networking configuration (#4426)
## Issue Addressed
Addresses [#4401](https://github.com/sigp/lighthouse/issues/4401)

## Proposed Changes
Shift some constants into ```ChainSpec``` and remove the constant values from code space.

## Additional Info

I mostly used ```MainnetEthSpec::default_spec()``` for getting ```ChainSpec```. I wonder Did I make a mistake about that.


Co-authored-by: armaganyildirak <armaganyildirak@gmail.com>
Co-authored-by: Paul Hauner <paul@paulhauner.com>
Co-authored-by: Age Manning <Age@AgeManning.com>
Co-authored-by: Diva M <divma@protonmail.com>
2023-08-03 01:51:47 +00:00
Eitan Seri-Levi
e8c411c288 add ssz support in request body for /beacon/blocks endpoints (v1 & v2) (#4479)
## Issue Addressed

[#4457](https://github.com/sigp/lighthouse/issues/4457)

## Proposed Changes

add ssz support in request body for  /beacon/blocks endpoints (v1 & v2)


## Additional Info
2023-07-31 23:51:37 +00:00
Aoi Kurokawa
85a3340d0e Implement liveness BeaconAPI (#4343)
## Issue Addressed

#4243

## Proposed Changes

- create a new endpoint for liveness/{endpoint}

## Additional Info
This is my first PR.
2023-07-31 01:53:03 +00:00
Michael Sproul
6c375205fb Fix HTTP state API bug and add --epochs-per-migration (#4236)
## Issue Addressed

Fix an issue observed by `@zlan` on Discord where Lighthouse would sometimes return this error when looking up states via the API:

> {"code":500,"message":"UNHANDLED_ERROR: ForkChoiceError(MissingProtoArrayBlock(0xc9cf1495421b6ef3215d82253b388d77321176a1dcef0db0e71a0cd0ffc8cdb7))","stacktraces":[]}

## Proposed Changes

The error stems from a faulty assumption in the HTTP API logic: that any state in the hot database must have its block in fork choice. This isn't true because the state's hot database may update much less frequently than the fork choice store, e.g. if reconstructing states (where freezer migration pauses), or if the freezer migration runs slowly. There could also be a race between loading the hot state and checking fork choice, e.g. even if the finalization migration of DB+fork choice were atomic, the update could happen between the 1st and 2nd calls.

To address this I've changed the HTTP API logic to use the finalized block's execution status as a fallback where it is safe to do so. In the case where a block is non-canonical and prior to finalization (permanently orphaned) we default `execution_optimistic` to `true`.

## Additional Info

I've also added a new CLI flag to reduce the frequency of the finalization migration as this is useful for several purposes:

- Spacing out database writes (less frequent, larger batches)
- Keeping a limited chain history with high availability, e.g. the last month in the hot database.

This new flag made it _substantially_ easier to test this change. It was extracted from `tree-states` (where it's called `--db-migration-period`), which is why this PR also carries the `tree-states` label.
2023-07-17 00:14:12 +00:00
Jack McPherson
62c9170755 Remove hidden re-exports to appease Rust 1.73 (#4495)
## Issue Addressed

#4494 

## Proposed Changes

 - Remove explicit re-exports of various types to appease the new compiler lint

## Additional Info

It seems `warn(hidden_glob_reexports)` is the main culprit.
2023-07-12 07:06:00 +00:00
Eitan Seri-Levi
826e090f50 Update node health endpoint (#4310)
## Issue Addressed

[#4292](https://github.com/sigp/lighthouse/issues/4292)

## Proposed Changes

Updated the node health endpoint

will return a 200 status code if  `!syncing && !el_offline && !optimistic`

wil return a 206 if `(syncing || optimistic) &&  !el_offline`

will return a 503 if `el_offline`



## Additional Info
2023-06-30 01:13:04 +00:00
Jack McPherson
1aff082eea Add broadcast validation routes to Beacon Node HTTP API (#4316)
## Issue Addressed

 - #4293 
 - #4264 

## Proposed Changes

*Changes largely follow those suggested in the main issue*.

 - Add new routes to HTTP API
   - `post_beacon_blocks_v2`
   - `post_blinded_beacon_blocks_v2`
 - Add new routes to `BeaconNodeHttpClient`
   - `post_beacon_blocks_v2`
   - `post_blinded_beacon_blocks_v2`
 - Define new Eth2 common types
   - `BroadcastValidation`, enum representing the level of validation to apply to blocks prior to broadcast
   - `BroadcastValidationQuery`, the corresponding HTTP query string type for the above type
 - ~~Define `_checked` variants of both `publish_block` and `publish_blinded_block` that enforce a validation level at a type level~~
 - Add interactive tests to the `bn_http_api_tests` test target covering each validation level (to their own test module, `broadcast_validation_tests`)
   - `beacon/blocks`
       - `broadcast_validation=gossip`
         - Invalid (400)
         - Full Pass (200)
         - Partial Pass (202)
        - `broadcast_validation=consensus`
          - Invalid (400)
          - Only gossip (400)
          - Only consensus pass (i.e., equivocates) (200)
          - Full pass (200)
        - `broadcast_validation=consensus_and_equivocation`
          - Invalid (400)
          - Invalid due to early equivocation (400)
          - Only gossip (400)
          - Only consensus (400)
          - Pass (200)
   - `beacon/blinded_blocks`
       - `broadcast_validation=gossip`
         - Invalid (400)
         - Full Pass (200)
         - Partial Pass (202)
        - `broadcast_validation=consensus`
          - Invalid (400)
          - Only gossip (400)
          - ~~Only consensus pass (i.e., equivocates) (200)~~
          - Full pass (200)
        - `broadcast_validation=consensus_and_equivocation`
          - Invalid (400)
          - Invalid due to early equivocation (400)
          - Only gossip (400)
          - Only consensus (400)
          - Pass (200)
 - Add a new trait, `IntoGossipVerifiedBlock`, which allows type-level guarantees to be made as to gossip validity
 - Modify the structure of the `ObservedBlockProducers` cache from a `(slot, validator_index)` mapping to a `((slot, validator_index), block_root)` mapping
 - Modify `ObservedBlockProducers::proposer_has_been_observed` to return a `SeenBlock` rather than a boolean on success
 - Punish gossip peer (low) for submitting equivocating blocks
 - Rename `BlockError::SlashablePublish` to `BlockError::SlashableProposal`

## Additional Info

This PR contains changes that directly modify how blocks are verified within the client. For more context, consult [comments in-thread](https://github.com/sigp/lighthouse/pull/4316#discussion_r1234724202).


Co-authored-by: Michael Sproul <michael@sigmaprime.io>
2023-06-29 12:02:38 +00:00
Michael Sproul
3052db29fe Implement el_offline and use it in the VC (#4295)
## Issue Addressed

Closes https://github.com/sigp/lighthouse/issues/4291, part of #3613.

## Proposed Changes

- Implement the `el_offline` field on `/eth/v1/node/syncing`. We set `el_offline=true` if:
  - The EL's internal status is `Offline` or `AuthFailed`, _or_
  - The most recent call to `newPayload` resulted in an error (more on this in a moment).

- Use the `el_offline` field in the VC to mark nodes with offline ELs as _unsynced_. These nodes will still be used, but only after synced nodes.
- Overhaul the usage of `RequireSynced` so that `::No` is used almost everywhere. The `--allow-unsynced` flag was broken and had the opposite effect to intended, so it has been deprecated.
- Add tests for the EL being offline on the upcheck call, and being offline due to the newPayload check.


## Why track `newPayload` errors?

Tracking the EL's online/offline status is too coarse-grained to be useful in practice, because:

- If the EL is timing out to some calls, it's unlikely to timeout on the `upcheck` call, which is _just_ `eth_syncing`. Every failed call is followed by an upcheck [here](693886b941/beacon_node/execution_layer/src/engines.rs (L372-L380)), which would have the effect of masking the failure and keeping the status _online_.
- The `newPayload` call is the most likely to time out. It's the call in which ELs tend to do most of their work (often 1-2 seconds), with `forkchoiceUpdated` usually returning much faster (<50ms).
- If `newPayload` is failing consistently (e.g. timing out) then this is a good indication that either the node's EL is in trouble, or the network as a whole is. In the first case validator clients _should_ prefer other BNs if they have one available. In the second case, all of their BNs will likely report `el_offline` and they'll just have to proceed with trying to use them.

## Additional Changes

- Add utility method `ForkName::latest` which is quite convenient for test writing, but probably other things too.
- Delete some stale comments from when we used to support multiple execution nodes.
2023-05-17 05:51:56 +00:00
Mac L
3c029d48bf DB migration for fork choice cleanup (#4265)
## Issue Addressed

#4233

## Proposed Changes

Remove the `best_justified_checkpoint` from the `PersistedForkChoiceStore` type as it is now unused.
Additionally, remove the `Option`'s wrapping the `justified_checkpoint` and `finalized_checkpoint` fields on `ProtoNode` which were only present to facilitate a previous migration.

Include the necessary code to facilitate the migration to a new DB schema.
2023-05-15 02:10:42 +00:00
Michael Sproul
b90c0c3fb1 Make re-org strat more cautious and add more config (#4151)
## Proposed Changes

This change attempts to prevent failed re-orgs by:

1. Lowering the re-org cutoff from 2s to 1s. This is informed by a failed re-org attempted by @yorickdowne's node. The failed block was requested in the 1.5-2s window due to a Vouch failure, and failed to propagate to the majority of the network before the attestation deadline at 4s.
2. Allow users to adjust their re-org cutoff depending on observed network conditions and their risk profile. The static 2 second cutoff was too rigid.
3. Add a `--proposer-reorg-disallowed-offsets` flag which can be used to prohibit reorgs at certain slots. This is intended to help workaround an issue whereby reorging blocks at slot 1 are currently taking ~1.6s to propagate on gossip rather than ~500ms. This is suspected to be due to a cache miss in current versions of Prysm, which should be fixed in their next release.

## Additional Info

I'm of two minds about removing the `shuffling_stable` check which checks for blocks at slot 0 in the epoch. If we removed it users would be able to configure Lighthouse to try reorging at slot 0, which likely wouldn't work very well due to interactions with the proposer index cache. I think we could leave it for now and revisit it later.
2023-04-13 07:05:01 +00:00
Mac L
0e2e23e088 Remove the unused ExecutionOptimisticForkVersionedResponse type (#4160)
## Issue Addressed

#4146 

## Proposed Changes

Removes the `ExecutionOptimisticForkVersionedResponse` type and the associated Beacon API endpoint which is now deprecated. Also removes the test associated with the endpoint.
2023-04-12 01:48:21 +00:00
Mac L
8630ddfec4 Add beacon.watch (#3362)
> This is currently a WIP and all features are subject to alteration or removal at any time.

## Overview

The successor to #2873.

Contains the backbone of `beacon.watch` including syncing code, the initial API, and several core database tables.

See `watch/README.md` for more information, requirements and usage.
2023-04-03 05:35:11 +00:00
Daniel Ramirez Chiquillo
036b797b2c Add finalized to HTTP API responses (#3753)
## Issue Addressed

#3708 

## Proposed Changes
- Add `is_finalized_block` method to `BeaconChain` in `beacon_node/beacon_chain/src/beacon_chain.rs`.
- Add `is_finalized_state` method to `BeaconChain` in `beacon_node/beacon_chain/src/beacon_chain.rs`.
- Add `fork_and_execution_optimistic_and_finalized` in `beacon_node/http_api/src/state_id.rs`.
- Add `ExecutionOptimisticFinalizedForkVersionedResponse` type in `consensus/types/src/fork_versioned_response.rs`.
- Add `execution_optimistic_finalized_fork_versioned_response`function in  `beacon_node/http_api/src/version.rs`.
- Add `ExecutionOptimisticFinalizedResponse` type in `common/eth2/src/types.rs`.
- Add `add_execution_optimistic_finalized` method in  `common/eth2/src/types.rs`.
- Update API response methods to include finalized.
- Remove `execution_optimistic_fork_versioned_response`

Co-authored-by: Michael Sproul <michael@sigmaprime.io>
2023-03-30 06:08:37 +00:00
Christopher Chong
6bb28bc806 Add debug fork choice api (#4003)
## Issue Addressed

Which issue # does this PR address?
https://github.com/sigp/lighthouse/issues/3669

## Proposed Changes

Please list or describe the changes introduced by this PR.
- A new API to fetch fork choice data, as specified [here](https://github.com/ethereum/beacon-APIs/pull/232)
- A new integration test to test the new API

## Additional Info

Please provide any additional information. For example, future considerations
or information useful for reviewers.

- `extra_data` field specified in the beacon-API spec is not implemented, please let me know if I should instead.

Co-authored-by: Michael Sproul <micsproul@gmail.com>
2023-03-29 02:56:37 +00:00
Divma
e190ebb8a0 Support for Ipv6 (#4046)
## Issue Addressed
Add support for ipv6 and dual stack in lighthouse. 

## Proposed Changes
From an user perspective, now setting an ipv6 address, optionally configuring the ports should feel exactly the same as using an ipv4 address. If listening over both ipv4 and ipv6 then the user needs to:
- use the `--listen-address` two times (ipv4 and ipv6 addresses)
- `--port6` becomes then required
- `--discovery-port6` can now be used to additionally configure the ipv6 udp port

### Rough list of code changes
- Discovery:
  - Table filter and ip mode set to match the listening config. 
  - Ipv6 address, tcp port and udp port set in the ENR builder
  - Reported addresses now check which tcp port to give to libp2p
- LH Network Service:
  - Can listen over Ipv6, Ipv4, or both. This uses two sockets. Using mapped addresses is disabled from libp2p and it's the most compatible option.
- NetworkGlobals:
  - No longer stores udp port since was not used at all. Instead, stores the Ipv4 and Ipv6 TCP ports.
- NetworkConfig:
  - Update names to make it clear that previous udp and tcp ports in ENR were Ipv4
  - Add fields to configure Ipv6 udp and tcp ports in the ENR
  - Include advertised enr Ipv6 address.
  - Add type to model Listening address that's either Ipv4, Ipv6 or both. A listening address includes the ip, udp port and tcp port.
- UPnP:
  - Kept only for ipv4
- Cli flags:
  - `--listen-addresses` now can take up to two values
  - `--port` will apply to ipv4 or ipv6 if only one listening address is given. If two listening addresses are given it will apply only to Ipv4.
  - `--port6` New flag required when listening over ipv4 and ipv6 that applies exclusively to Ipv6.
  - `--discovery-port` will now apply to ipv4 and ipv6 if only one listening address is given.
  - `--discovery-port6` New flag to configure the individual udp port of ipv6 if listening over both ipv4 and ipv6.
  - `--enr-udp-port` Updated docs to specify that it only applies to ipv4. This is an old behaviour.
  - `--enr-udp6-port` Added to configure the enr udp6 field.
  - `--enr-tcp-port` Updated docs to specify that it only applies to ipv4. This is an old behaviour.
  - `--enr-tcp6-port` Added to configure the enr tcp6 field.
  - `--enr-addresses` now can take two values.
  - `--enr-match` updated behaviour.
- Common:
  - rename `unused_port` functions to specify that they are over ipv4.
  - add functions to get unused ports over ipv6.
- Testing binaries
  - Updated code to reflect network config changes and unused_port changes.

## Additional Info

TODOs:
- use two sockets in discovery. I'll get back to this and it's on https://github.com/sigp/discv5/pull/160
- lcli allow listening over two sockets in generate_bootnodes_enr
- add at least one smoke flag for ipv6 (I have tested this and works for me)
- update the book
2023-03-14 01:13:34 +00:00
Michael Sproul
90cef1db86 Appease Clippy 1.68 and refactor http_api (#4068)
## Proposed Changes

Two tiny updates to satisfy Clippy 1.68

Plus refactoring of the `http_api` into less complex types so the compiler can chew and digest them more easily.

Co-authored-by: Michael Sproul <michael@sigmaprime.io>
2023-03-13 01:40:03 +00:00
Michael Sproul
01556f6f01 Optimise payload attributes calculation and add SSE (#4027)
## Issue Addressed

Closes #3896
Closes #3998
Closes #3700

## Proposed Changes

- Optimise the calculation of withdrawals for payload attributes by avoiding state clones, avoiding unnecessary state advances and reading from the snapshot cache if possible.
- Use the execution layer's payload attributes cache to avoid re-calculating payload attributes. I actually implemented a new LRU cache just for withdrawals but it had the exact same key and most of the same data as the existing payload attributes cache, so I deleted it.
- Add a new SSE event that fires when payloadAttributes are calculated. This is useful for block builders, a la https://github.com/ethereum/beacon-APIs/issues/244.
- Add a new CLI flag `--always-prepare-payload` which forces payload attributes to be sent with every fcU regardless of connected proposers. This is intended for use by builders/relays.

For maximum effect, the flags I've been using to run Lighthouse in "payload builder mode" are:

```
--always-prepare-payload \
--prepare-payload-lookahead 12000 \
--suggested-fee-recipient 0x0000000000000000000000000000000000000000
```

The fee recipient is required so Lighthouse has something to pack in the payload attributes (it can be ignored by the builder). The lookahead causes fcU to be sent at the start of every slot rather than at 8s. As usual, fcU will also be sent after each change of head block. I think this combination is sufficient for builders to build on all viable heads. Often there will be two fcU (and two payload attributes) sent for the same slot: one sent at the start of the slot with the head from `n - 1` as the parent, and one sent after the block arrives with `n` as the parent.

Example usage of the new event stream:

```bash
curl -N "http://localhost:5052/eth/v1/events?topics=payload_attributes"
```

## Additional Info

- [x] Tests added by updating the proposer re-org tests. This has the benefit of testing the proposer re-org code paths with withdrawals too, confirming that the new changes don't interact poorly.
- [ ] Benchmarking with `blockdreamer` on devnet-7 showed promising results but I'm yet to do a comparison to `unstable`.


Co-authored-by: Michael Sproul <micsproul@gmail.com>
2023-03-05 23:43:30 +00:00
ethDreamer
e743d75c9b
Update Mock Builder for Post-Capella Tests (#3958)
* Update Mock Builder for Post-Capella Tests

* Add _mut Suffix to BidStuff Functions

* Fix Setting Gas Limit
2023-02-10 13:30:14 -06:00
Paul Hauner
e062a7cf76
Broadcast address changes at Capella (#3919)
* Add first efforts at broadcast

* Tidy

* Move broadcast code to client

* Progress with broadcast impl

* Rename to address change

* Fix compile errors

* Use `while` loop

* Tidy

* Flip broadcast condition

* Switch to forgetting individual indices

* Always broadcast when the node starts

* Refactor into two functions

* Add testing

* Add another test

* Tidy, add more testing

* Tidy

* Add test, rename enum

* Rename enum again

* Tidy

* Break loop early

* Add V15 schema migration

* Bump schema version

* Progress with migration

* Update beacon_node/client/src/address_change_broadcast.rs

Co-authored-by: Michael Sproul <micsproul@gmail.com>

* Fix typo in function name

---------

Co-authored-by: Michael Sproul <micsproul@gmail.com>
2023-02-07 17:13:49 +11:00
ethDreamer
90b6ae62e6
Use Local Payload if More Profitable than Builder (#3934)
* Use Local Payload if More Profitable than Builder

* Rename clone -> clone_from_ref

* Minimize Clones of GetPayloadResponse

* Cleanup & Fix Tests

* Added Tests for Payload Choice by Profit

* Fix Outdated Comments
2023-02-01 19:37:46 -06:00
Michael Sproul
16bdb2771b
Update another test broken by the shuffling change 2023-01-25 16:18:00 +11:00
Michael Sproul
e48487db01
Fix the new BLS to execution change test 2023-01-25 15:47:07 +11:00
Michael Sproul
d8abf2fc41
Import BLS to execution changes before Capella (#3892)
* Import BLS to execution changes before Capella

* Test for BLS to execution change HTTP API

* Pack BLS to execution changes in LIFO order

* Remove unused var

* Clippy
2023-01-21 10:39:59 +11:00
Michael Sproul
2af8110529
Merge remote-tracking branch 'origin/unstable' into capella
Fixing the conflicts involved patching up some of the `block_hash` verification,
the rest will be done as part of https://github.com/sigp/lighthouse/issues/3870
2023-01-12 16:22:00 +11:00
Age Manning
1d9a2022b4 Upgrade to libp2p v0.50.0 (#3764)
I've needed to do this work in order to do some episub testing. 

This version of libp2p has not yet been released, so this is left as a draft for when we wish to update.

Co-authored-by: Diva M <divma@protonmail.com>
2023-01-06 15:59:33 +00:00
Mark Mackey
c922566fbc
Fixed Some Tests 2022-12-27 15:59:34 -06:00
Michael Sproul
991e4094f8
Merge remote-tracking branch 'origin/unstable' into capella-update 2022-12-14 13:00:41 +11:00
Michael Sproul
775d222299 Enable proposer boost re-orging (#2860)
## Proposed Changes

With proposer boosting implemented (#2822) we have an opportunity to re-org out late blocks.

This PR adds three flags to the BN to control this behaviour:

* `--disable-proposer-reorgs`: turn aggressive re-orging off (it's on by default).
* `--proposer-reorg-threshold N`: attempt to orphan blocks with less than N% of the committee vote. If this parameter isn't set then N defaults to 20% when the feature is enabled.
* `--proposer-reorg-epochs-since-finalization N`: only attempt to re-org late blocks when the number of epochs since finalization is less than or equal to N. The default is 2 epochs, meaning re-orgs will only be attempted when the chain is finalizing optimally.

For safety Lighthouse will only attempt a re-org under very specific conditions:

1. The block being proposed is 1 slot after the canonical head, and the canonical head is 1 slot after its parent. i.e. at slot `n + 1` rather than building on the block from slot `n` we build on the block from slot `n - 1`.
2. The current canonical head received less than N% of the committee vote. N should be set depending on the proposer boost fraction itself, the fraction of the network that is believed to be applying it, and the size of the largest entity that could be hoarding votes.
3. The current canonical head arrived after the attestation deadline from our perspective. This condition was only added to support suppression of forkchoiceUpdated messages, but makes intuitive sense.
4. The block is being proposed in the first 2 seconds of the slot. This gives it time to propagate and receive the proposer boost.


## Additional Info

For the initial idea and background, see: https://github.com/ethereum/consensus-specs/pull/2353#issuecomment-950238004

There is also a specification for this feature here: https://github.com/ethereum/consensus-specs/pull/3034

Co-authored-by: Michael Sproul <micsproul@gmail.com>
Co-authored-by: pawan <pawandhananjay@gmail.com>
2022-12-13 09:57:26 +00:00
ethDreamer
b6486e809d
Fixed moar tests (#3774) 2022-12-05 09:08:55 +11:00
Age Manning
230168deff Health Endpoints for UI (#3668)
This PR adds some health endpoints for the beacon node and the validator client.

Specifically it adds the endpoint:
`/lighthouse/ui/health`

These are not entirely stable yet. But provide a base for modification for our UI. 

These also may have issues with various platforms and may need modification.
2022-11-15 05:21:26 +00:00
Michael Sproul
d99bfcf1a5 Blinded block and RANDAO APIs (#3571)
## Issue Addressed

https://github.com/ethereum/beacon-APIs/pull/241
https://github.com/ethereum/beacon-APIs/pull/242

## Proposed Changes

Implement two new endpoints for fetching blinded blocks and RANDAO mixes.


Co-authored-by: realbigsean <sean@sigmaprime.io>
2022-11-11 00:38:27 +00:00
Pawan Dhananjay
8728c40102 Remove fallback support from eth1 service (#3594)
## Issue Addressed

N/A

## Proposed Changes

With https://github.com/sigp/lighthouse/pull/3214 we made it such that you can either have 1 auth endpoint or multiple non auth endpoints. Now that we are post merge on all networks (testnets and mainnet), we cannot progress a chain without a dedicated auth execution layer connection so there is no point in having a non-auth eth1-endpoint for syncing deposit cache. 

This code removes all fallback related code in the eth1 service. We still keep the single non-auth endpoint since it's useful for testing.

## Additional Info

This removes all eth1 fallback related metrics that were relevant for the monitoring service, so we might need to change the api upstream.
2022-10-04 08:33:39 +00:00
Divma
b1d2510d1b Libp2p v0.48.0 upgrade (#3547)
## Issue Addressed

Upgrades libp2p to v.0.47.0. This is the compilation of
- [x] #3495 
- [x] #3497 
- [x] #3491 
- [x] #3546 
- [x] #3553 

Co-authored-by: Age Manning <Age@AgeManning.com>
2022-09-29 01:50:11 +00:00
Paul Hauner
fa6ad1a11a Deduplicate block root computation (#3590)
## Issue Addressed

NA

## Proposed Changes

This PR removes duplicated block root computation.

Computing the `SignedBeaconBlock::canonical_root` has become more expensive since the merge as we need to compute the merke root of each transaction inside an `ExecutionPayload`.

Computing the root for [a mainnet block](https://beaconcha.in/slot/4704236) is taking ~10ms on my i7-8700K CPU @ 3.70GHz (no sha extensions). Given that our median seen-to-imported time for blocks is presently 300-400ms, removing a few duplicated block roots (~30ms) could represent an easy 10% improvement. When we consider that the seen-to-imported times include operations *after* the block has been placed in the early attester cache, we could expect the 30ms to be more significant WRT our seen-to-attestable times.

## Additional Info

NA
2022-09-23 03:52:42 +00:00
Michael Sproul
f2ac0738d8 Implement skip_randao_verification and blinded block rewards API (#3540)
## Issue Addressed

https://github.com/ethereum/beacon-APIs/pull/222

## Proposed Changes

Update Lighthouse's randao verification API to match the `beacon-APIs` spec. We implemented the API before spec stabilisation, and it changed slightly in the course of review.

Rather than a flag `verify_randao` taking a boolean value, the new API uses a `skip_randao_verification` flag which takes no argument. The new spec also requires the randao reveal to be present and equal to the point-at-infinity when `skip_randao_verification` is set.

I've also updated the `POST /lighthouse/analysis/block_rewards` API to take blinded blocks as input, as the execution payload is irrelevant and we may want to assess blocks produced by builders.

## Additional Info

This is technically a breaking change, but seeing as I suspect I'm the only one using these parameters/APIs, I think we're OK to include this in a patch release.
2022-09-19 07:58:48 +00:00
realbigsean
177aef8f1e Builder profit threshold flag (#3534)
## Issue Addressed

Resolves https://github.com/sigp/lighthouse/issues/3517

## Proposed Changes

Adds a `--builder-profit-threshold <wei value>` flag to the BN. If an external payload's value field is less than this value, the local payload will be used. The value of the local payload will not be checked (it can't really be checked until the engine API is updated to support this).


Co-authored-by: realbigsean <sean@sigmaprime.io>
2022-09-05 04:50:49 +00:00
Paul Hauner
661307dce1 Separate committee subscriptions queue (#3508)
## Issue Addressed

NA

## Proposed Changes

As we've seen on Prater, there seems to be a correlation between these messages

```
WARN Not enough time for a discovery search  subnet_id: ExactSubnet { subnet_id: SubnetId(19), slot: Slot(3742336) }, service: attestation_service
```

... and nodes falling 20-30 slots behind the head for short periods. These nodes are running ~20k Prater validators.

After running some metrics, I can see that the `network_recv` channel is processing ~250k `AttestationSubscribe` messages per minute. It occurred to me that perhaps the `AttestationSubscribe` messages are "washing out" the `SendRequest` and `SendResponse` messages. In this PR I separate the `AttestationSubscribe` and `SyncCommitteeSubscribe` messages into their own queue so the `tokio::select!` in the `NetworkService` can still process the other messages in the `network_recv` channel without necessarily having to clear all the subscription messages first.

~~I've also added filter to the HTTP API to prevent duplicate subscriptions going to the network service.~~

## Additional Info

- Currently being tested on Prater
2022-08-30 05:47:31 +00:00
realbigsean
cb132c622d don't register exited or slashed validators with the builder api (#3473)
## Issue Addressed

#3465

## Proposed Changes

Filter out any validator registrations for validators that are not `active` or `pending`.  I'm adding this filtering the beacon node because all the information is readily available there. In other parts of the VC we are usually sending per-validator requests based on duties from the BN. And duties will only be provided for active validators so we don't have this type of filtering elsewhere in the VC.



Co-authored-by: realbigsean <sean@sigmaprime.io>
2022-08-24 23:34:58 +00:00
Michael Sproul
4e05f19fb5 Serve Bellatrix preset in BN API (#3425)
## Issue Addressed

Resolves #3388
Resolves #2638

## Proposed Changes

- Return the `BellatrixPreset` on `/eth/v1/config/spec` by default.
- Allow users to opt out of this by providing `--http-spec-fork=altair` (unless there's a Bellatrix fork epoch set).
- Add the Altair constants from #2638 and make serving the constants non-optional (the `http-disable-legacy-spec` flag is deprecated).
- Modify the VC to only read the `Config` and not to log extra fields. This prevents it from having to muck around parsing the `ConfigAndPreset` fields it doesn't need.

## Additional Info

This change is backwards-compatible for the VC and the BN, but is marked as a breaking change for the removal of `--http-disable-legacy-spec`.

I tried making `Config` a `superstruct` too, but getting the automatic decoding to work was a huge pain and was going to require a lot of hacks, so I gave up in favour of keeping the default-based approach we have now.
2022-08-10 07:52:59 +00:00
realbigsean
6f13727fbe Don't use the builder network if the head is optimistic (#3412)
## Issue Addressed

Resolves https://github.com/sigp/lighthouse/issues/3394

Adds a check in `is_healthy` about whether the head is optimistic when choosing whether to use the builder network. 



Co-authored-by: realbigsean <sean@sigmaprime.io>
2022-08-09 06:05:16 +00:00
Mac L
e24552d61a Restore backwards compatibility when using older BNs (#3410)
## Issue Addressed

https://github.com/status-im/nimbus-eth2/issues/3930

## Proposed Changes

We can trivially support beacon nodes which do not provide the `is_optimistic` field by wrapping the field in an `Option`.
2022-08-02 23:20:51 +00:00
realbigsean
6c2d8b2262 Builder Specs v0.2.0 (#3134)
## Issue Addressed

https://github.com/sigp/lighthouse/issues/3091

Extends https://github.com/sigp/lighthouse/pull/3062, adding pre-bellatrix block support on blinded endpoints and allowing the normal proposal flow (local payload construction) on blinded endpoints. This resulted in better fallback logic because the VC will not have to switch endpoints on failure in the BN <> Builder API, the BN can just fallback immediately and without repeating block processing that it shouldn't need to. We can also keep VC fallback from the VC<>BN API's blinded endpoint to full endpoint.

## Proposed Changes

- Pre-bellatrix blocks on blinded endpoints
- Add a new `PayloadCache` to the execution layer
- Better fallback-from-builder logic

## Todos

- [x] Remove VC transition logic
- [x] Add logic to only enable builder flow after Merge transition finalization
- [x] Tests
- [x] Fix metrics
- [x] Rustdocs


Co-authored-by: Mac L <mjladson@pm.me>
Co-authored-by: realbigsean <sean@sigmaprime.io>
2022-07-30 00:22:37 +00:00
Mac L
d316305411 Add is_optimistic to eth/v1/node/syncing response (#3374)
## Issue Addressed

As specified in the [Beacon Chain API specs](https://github.com/ethereum/beacon-APIs/blob/master/apis/node/syncing.yaml#L32-L35) we should return `is_optimistic` as part of the response to a query for the `eth/v1/node/syncing` endpoint.

## Proposed Changes

Compute the optimistic status of the head and add it to the `SyncingData` response.
2022-07-26 08:50:16 +00:00
Mac L
bb5a6d2cca Add execution_optimistic flag to HTTP responses (#3070)
## Issue Addressed

#3031 

## Proposed Changes

Updates the following API endpoints to conform with https://github.com/ethereum/beacon-APIs/pull/190 and https://github.com/ethereum/beacon-APIs/pull/196
- [x] `beacon/states/{state_id}/root` 
- [x] `beacon/states/{state_id}/fork`
- [x] `beacon/states/{state_id}/finality_checkpoints`
- [x] `beacon/states/{state_id}/validators`
- [x] `beacon/states/{state_id}/validators/{validator_id}`
- [x] `beacon/states/{state_id}/validator_balances`
- [x] `beacon/states/{state_id}/committees`
- [x] `beacon/states/{state_id}/sync_committees`
- [x] `beacon/headers`
- [x] `beacon/headers/{block_id}`
- [x] `beacon/blocks/{block_id}`
- [x] `beacon/blocks/{block_id}/root`
- [x] `beacon/blocks/{block_id}/attestations`
- [x] `debug/beacon/states/{state_id}`
- [x] `debug/beacon/heads`
- [x] `validator/duties/attester/{epoch}`
- [x] `validator/duties/proposer/{epoch}`
- [x] `validator/duties/sync/{epoch}`

Updates the following Server-Sent Events:
- [x]  `events?topics=head`
- [x]  `events?topics=block`
- [x]  `events?topics=finalized_checkpoint`
- [x]  `events?topics=chain_reorg`

## Backwards Incompatible
There is a very minor breaking change with the way the API now handles requests to `beacon/blocks/{block_id}/root` and `beacon/states/{state_id}/root` when `block_id` or `state_id` is the `Root` variant of `BlockId` and `StateId` respectively.

Previously a request to a non-existent root would simply echo the root back to the requester:
```
curl "http://localhost:5052/eth/v1/beacon/states/0xaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa/root"
{"data":{"root":"0xaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"}}
```
Now it will return a `404`:
```
curl "http://localhost:5052/eth/v1/beacon/blocks/0xaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa/root"
{"code":404,"message":"NOT_FOUND: beacon block with root 0xaaaa…aaaa","stacktraces":[]}
```

In addition to this is the block root `0x0000000000000000000000000000000000000000000000000000000000000000` previously would return the genesis block. It will now return a `404`:
```
curl "http://localhost:5052/eth/v1/beacon/blocks/0x0000000000000000000000000000000000000000000000000000000000000000"
{"code":404,"message":"NOT_FOUND: beacon block with root 0x0000…0000","stacktraces":[]}
```

## Additional Info
- `execution_optimistic` is always set, and will return `false` pre-Bellatrix. I am also open to the idea of doing something like `#[serde(skip_serializing_if = "Option::is_none")]`.
- The value of `execution_optimistic` is set to `false` where possible. Any computation that is reliant on the `head` will simply use the `ExecutionStatus` of the head (unless the head block is pre-Bellatrix).

Co-authored-by: Paul Hauner <paul@paulhauner.com>
2022-07-25 08:23:00 +00:00
Paul Hauner
be4e261e74 Use async code when interacting with EL (#3244)
## Overview

This rather extensive PR achieves two primary goals:

1. Uses the finalized/justified checkpoints of fork choice (FC), rather than that of the head state.
2. Refactors fork choice, block production and block processing to `async` functions.

Additionally, it achieves:

- Concurrent forkchoice updates to the EL and cache pruning after a new head is selected.
- Concurrent "block packing" (attestations, etc) and execution payload retrieval during block production.
- Concurrent per-block-processing and execution payload verification during block processing.
- The `Arc`-ification of `SignedBeaconBlock` during block processing (it's never mutated, so why not?):
    - I had to do this to deal with sending blocks into spawned tasks.
    - Previously we were cloning the beacon block at least 2 times during each block processing, these clones are either removed or turned into cheaper `Arc` clones.
    - We were also `Box`-ing and un-`Box`-ing beacon blocks as they moved throughout the networking crate. This is not a big deal, but it's nice to avoid shifting things between the stack and heap.
    - Avoids cloning *all the blocks* in *every chain segment* during sync.
    - It also has the potential to clean up our code where we need to pass an *owned* block around so we can send it back in the case of an error (I didn't do much of this, my PR is already big enough 😅)
- The `BeaconChain::HeadSafetyStatus` struct was removed. It was an old relic from prior merge specs.

For motivation for this change, see https://github.com/sigp/lighthouse/pull/3244#issuecomment-1160963273

## Changes to `canonical_head` and `fork_choice`

Previously, the `BeaconChain` had two separate fields:

```
canonical_head: RwLock<Snapshot>,
fork_choice: RwLock<BeaconForkChoice>
```

Now, we have grouped these values under a single struct:

```
canonical_head: CanonicalHead {
  cached_head: RwLock<Arc<Snapshot>>,
  fork_choice: RwLock<BeaconForkChoice>
} 
```

Apart from ergonomics, the only *actual* change here is wrapping the canonical head snapshot in an `Arc`. This means that we no longer need to hold the `cached_head` (`canonical_head`, in old terms) lock when we want to pull some values from it. This was done to avoid deadlock risks by preventing functions from acquiring (and holding) the `cached_head` and `fork_choice` locks simultaneously.

## Breaking Changes

### The `state` (root) field in the `finalized_checkpoint` SSE event

Consider the scenario where epoch `n` is just finalized, but `start_slot(n)` is skipped. There are two state roots we might in the `finalized_checkpoint` SSE event:

1. The state root of the finalized block, which is `get_block(finalized_checkpoint.root).state_root`.
4. The state root at slot of `start_slot(n)`, which would be the state from (1), but "skipped forward" through any skip slots.

Previously, Lighthouse would choose (2). However, we can see that when [Teku generates that event](de2b2801c8/data/beaconrestapi/src/main/java/tech/pegasys/teku/beaconrestapi/handlers/v1/events/EventSubscriptionManager.java (L171-L182)) it uses [`getStateRootFromBlockRoot`](de2b2801c8/data/provider/src/main/java/tech/pegasys/teku/api/ChainDataProvider.java (L336-L341)) which uses (1).

I have switched Lighthouse from (2) to (1). I think it's a somewhat arbitrary choice between the two, where (1) is easier to compute and is consistent with Teku.

## Notes for Reviewers

I've renamed `BeaconChain::fork_choice` to `BeaconChain::recompute_head`. Doing this helped ensure I broke all previous uses of fork choice and I also find it more descriptive. It describes an action and can't be confused with trying to get a reference to the `ForkChoice` struct.

I've changed the ordering of SSE events when a block is received. It used to be `[block, finalized, head]` and now it's `[block, head, finalized]`. It was easier this way and I don't think we were making any promises about SSE event ordering so it's not "breaking".

I've made it so fork choice will run when it's first constructed. I did this because I wanted to have a cached version of the last call to `get_head`. Ensuring `get_head` has been run *at least once* means that the cached values doesn't need to wrapped in an `Option`. This was fairly simple, it just involved passing a `slot` to the constructor so it knows *when* it's being run. When loading a fork choice from the store and a slot clock isn't handy I've just used the `slot` that was saved in the `fork_choice_store`. That seems like it would be a faithful representation of the slot when we saved it.

I added the `genesis_time: u64` to the `BeaconChain`. It's small, constant and nice to have around.

Since we're using FC for the fin/just checkpoints, we no longer get the `0x00..00` roots at genesis. You can see I had to remove a work-around in `ef-tests` here: b56be3bc2. I can't find any reason why this would be an issue, if anything I think it'll be better since the genesis-alias has caught us out a few times (0x00..00 isn't actually a real root). Edit: I did find a case where the `network` expected the 0x00..00 alias and patched it here: 3f26ac3e2.

You'll notice a lot of changes in tests. Generally, tests should be functionally equivalent. Here are the things creating the most diff-noise in tests:
- Changing tests to be `tokio::async` tests.
- Adding `.await` to fork choice, block processing and block production functions.
- Refactor of the `canonical_head` "API" provided by the `BeaconChain`. E.g., `chain.canonical_head.cached_head()` instead of `chain.canonical_head.read()`.
- Wrapping `SignedBeaconBlock` in an `Arc`.
- In the `beacon_chain/tests/block_verification`, we can't use the `lazy_static` `CHAIN_SEGMENT` variable anymore since it's generated with an async function. We just generate it in each test, not so efficient but hopefully insignificant.

I had to disable `rayon` concurrent tests in the `fork_choice` tests. This is because the use of `rayon` and `block_on` was causing a panic.

Co-authored-by: Mac L <mjladson@pm.me>
2022-07-03 05:36:50 +00:00
realbigsean
f6ec44f0dd Register validator api (#3194)
## Issue Addressed

Lays the groundwork for builder API changes by implementing the beacon-API's new `register_validator` endpoint

## Proposed Changes

- Add a routine in the VC that runs on startup (re-try until success), once per epoch or whenever `suggested_fee_recipient` is updated, signing `ValidatorRegistrationData` and sending it to the BN.
  -  TODO: `gas_limit` config options https://github.com/ethereum/builder-specs/issues/17
-  BN only sends VC registration data to builders on demand, but VC registration data *does update* the BN's prepare proposer cache and send an updated fcU to  a local EE. This is necessary for fee recipient consistency between the blinded and full block flow in the event of fallback.  Having the BN only send registration data to builders on demand gives feedback directly to the VC about relay status. Also, since the BN has no ability to sign these messages anyways (so couldn't refresh them if it wanted), and validator registration is independent of the BN head, I think this approach makes sense. 
- Adds upcoming consensus spec changes for this PR https://github.com/ethereum/consensus-specs/pull/2884
  -  I initially applied the bit mask based on a configured application domain.. but I ended up just hard coding it here instead because that's how it's spec'd in the builder repo. 
  -  Should application mask appear in the api?



Co-authored-by: realbigsean <sean@sigmaprime.io>
2022-06-30 00:49:21 +00:00
Michael Sproul
8fa032c8ae Run fork choice before block proposal (#3168)
## Issue Addressed

Upcoming spec change https://github.com/ethereum/consensus-specs/pull/2878

## Proposed Changes

1. Run fork choice at the start of every slot, and wait for this run to complete before proposing a block.
2. As an optimisation, also run fork choice 3/4 of the way through the slot (at 9s), _dequeueing attestations for the next slot_.
3. Remove the fork choice run from the state advance timer that occurred before advancing the state.

## Additional Info

### Block Proposal Accuracy

This change makes us more likely to propose on top of the correct head in the presence of re-orgs with proposer boost in play. The main scenario that this change is designed to address is described in the linked spec issue.

### Attestation Accuracy

This change _also_ makes us more likely to attest to the correct head. Currently in the case of a skipped slot at `slot` we only run fork choice 9s into `slot - 1`. This means the attestations from `slot - 1` aren't taken into consideration, and any boost applied to the block from `slot - 1` is not removed (it should be). In the language of the linked spec issue, this means we are liable to attest to C, even when the majority voting weight has already caused a re-org to B.

### Why remove the call before the state advance?

If we've run fork choice at the start of the slot then it has already dequeued all the attestations from the previous slot, which are the only ones eligible to influence the head in the current slot. Running fork choice again is unnecessary (unless we run it for the next slot and try to pre-empt a re-org, but I don't currently think this is a great idea).

### Performance

Based on Prater testing this adds about 5-25ms of runtime to block proposal times, which are 500-1000ms on average (and spike to 5s+ sometimes due to state handling issues 😢 ). I believe this is a small enough penalty to enable it by default, with the option to disable it via the new flag `--fork-choice-before-proposal-timeout 0`. Upcoming work on block packing and state representation will also reduce block production times in general, while removing the spikes.

### Implementation

Fork choice gets invoked at the start of the slot via the `per_slot_task` function called from the slot timer. It then uses a condition variable to signal to block production that fork choice has been updated. This is a bit funky, but it seems to work. One downside of the timer-based approach is that it doesn't happen automatically in most of the tests. The test added by this PR has to trigger the run manually.
2022-05-20 05:02:11 +00:00
Paul Hauner
38050fa460 Allow TaskExecutor to be used in async tests (#3178)
# Description

Since the `TaskExecutor` currently requires a `Weak<Runtime>`, it's impossible to use it in an async test where the `Runtime` is created outside our scope. Whilst we *could* create a new `Runtime` instance inside the async test, dropping that `Runtime` would cause a panic (you can't drop a `Runtime` in an async context).

To address this issue, this PR creates the `enum Handle`, which supports either:

- A `Weak<Runtime>` (for use in our production code)
- A `Handle` to a runtime (for use in testing)

In theory, there should be no change to the behaviour of our production code (beyond some slightly different descriptions in HTTP 500 errors), or even our tests. If there is no change, you might ask *"why bother?"*. There are two PRs (#3070 and #3175) that are waiting on these fixes to introduce some new tests. Since we've added the EL to the `BeaconChain` (for the merge), we are now doing more async stuff in tests.

I've also added a `RuntimeExecutor` to the `BeaconChainTestHarness`. Whilst that's not immediately useful, it will become useful in the near future with all the new async testing.
2022-05-16 08:35:59 +00:00
François Garillot
3f9e83e840 [refactor] Refactor Option/Result combinators (#3180)
Code simplifications using `Option`/`Result` combinators to make pattern-matches a tad simpler. 
Opinions on these loosely held, happy to adjust in review.

Tool-aided by [comby-rust](https://github.com/huitseeker/comby-rust).
2022-05-16 01:59:47 +00:00