Commit Graph

251 Commits

Author SHA1 Message Date
Paul Hauner
b60304b19f Use BeaconProcessor for API requests (#4462)
## Issue Addressed

NA

## Proposed Changes

Rather than spawning new tasks on the tokio executor to process each HTTP API request, send the tasks to the `BeaconProcessor`. This achieves:

1. Places a bound on how many concurrent requests are being served (i.e., how many we are actually trying to compute at one time).
1. Places a bound on how many requests can be awaiting a response at one time (i.e., starts dropping requests when we have too many queued).
1. Allows the BN prioritise HTTP requests with respect to messages coming from the P2P network (i.e., proiritise importing gossip blocks rather than serving API requests).

Presently there are two levels of priorities:

- `Priority::P0`
    - The beacon processor will prioritise these above everything other than importing new blocks.
    - Roughly all validator-sensitive endpoints.
- `Priority::P1`
    - The beacon processor will prioritise practically all other P2P messages over these, except for historical backfill things.
    - Everything that's not `Priority::P0`
    
The `--http-enable-beacon-processor false` flag can be supplied to revert back to the old behaviour of spawning new `tokio` tasks for each request:

```
        --http-enable-beacon-processor <BOOLEAN>
            The beacon processor is a scheduler which provides quality-of-service and DoS protection. When set to
            "true", HTTP API requests will queued and scheduled alongside other tasks. When set to "false", HTTP API
            responses will be executed immediately. [default: true]
```
    
## New CLI Flags

I added some other new CLI flags:

```
        --beacon-processor-aggregate-batch-size <INTEGER>
            Specifies the number of gossip aggregate attestations in a signature verification batch. Higher values may
            reduce CPU usage in a healthy network while lower values may increase CPU usage in an unhealthy or hostile
            network. [default: 64]
        --beacon-processor-attestation-batch-size <INTEGER>
            Specifies the number of gossip attestations in a signature verification batch. Higher values may reduce CPU
            usage in a healthy network whilst lower values may increase CPU usage in an unhealthy or hostile network.
            [default: 64]
        --beacon-processor-max-workers <INTEGER>
            Specifies the maximum concurrent tasks for the task scheduler. Increasing this value may increase resource
            consumption. Reducing the value may result in decreased resource usage and diminished performance. The
            default value is the number of logical CPU cores on the host.
        --beacon-processor-reprocess-queue-len <INTEGER>
            Specifies the length of the queue for messages requiring delayed processing. Higher values may prevent
            messages from being dropped while lower values may help protect the node from becoming overwhelmed.
            [default: 12288]
```


I needed to add the max-workers flag since the "simulator" flavor tests started failing with HTTP timeouts on the test assertions. I believe they were failing because the Github runners only have 2 cores and there just weren't enough workers available to process our requests in time. I added the other flags since they seem fun to fiddle with.

## Additional Info

I bumped the timeouts on the "simulator" flavor test from 4s to 8s. The prioritisation of consensus messages seems to be causing slower responses, I guess this is what we signed up for 🤷 

The `validator/register` validator has some special handling because the relays have a bad habit of timing out on these calls. It seems like a waste of a `BeaconProcessor` worker to just wait for the builder API HTTP response, so we spawn a new `tokio` task to wait for a builder response.

I've added an optimisation for the `GET beacon/states/{state_id}/validators/{validator_id}` endpoint in [efbabe3](efbabe3252). That's the endpoint the VC uses to resolve pubkeys to validator indices, and it's the endpoint that was causing us grief. Perhaps I should move that into a new PR, not sure.
2023-08-08 23:30:15 +00:00
Divma
ff9b09d964 upgrade to libp2p 0.52 (#4431)
## Issue Addressed

Upgrade libp2p to v0.52

## Proposed Changes
- **Workflows**: remove installation of `protoc`
- **Book**: remove installation of `protoc`
- **`Dockerfile`s and `cross`**: remove custom base `Dockerfile` for cross since it's no longer needed. Remove `protoc` from remaining `Dockerfiles`s
- **Upgrade `discv5` to `v0.3.1`:** we have some cool stuff in there: no longer needs `protoc` and faster ip updates on cold start
- **Upgrade `prometheus` to `0.21.0`**, now it no longer needs encoding checks
- **things that look like refactors:** bunch of api types were renamed and need to be accessed in a different (clearer) way
- **Lighthouse network**
	- connection limits is now a behaviour
	- banned peers no longer exist on the swarm level, but at the behaviour level
	- `connection_event_buffer_size` now is handled per connection with a buffer size of 4
	- `mplex` is deprecated and was removed
	- rpc handler now logs the peer to which it belongs

## Additional Info

Tried to keep as much behaviour unchanged as possible. However, there is a great deal of improvements we can do _after_ this upgrade:
- Smart connection limits: Connection limits have been checked only based on numbers, we can now use information about the incoming peer to decide if we want it
- More powerful peer management: Dial attempts from other behaviours can be rejected early
- Incoming connections can be rejected early
- Banning can be returned exclusively to the peer management: We should not get connections to banned peers anymore making use of this
- TCP Nat updates: We might be able to take advantage of confirmed external addresses to check out tcp ports/ips


Co-authored-by: Age Manning <Age@AgeManning.com>
Co-authored-by: Akihito Nakano <sora.akatsuki@gmail.com>
2023-08-02 00:59:34 +00:00
Gua00va
73764d0dd2 Deprecate exchangeTransitionConfiguration functionality (#4517)
## Issue Addressed

Solves #4442 
## Proposed Changes

EL clients log errors if we don't query this endpoint, but they are making releases that remove this error logging. After those are out we can stop calling it, after which point EL teams will remove the endpoint entirely. 
Refer https://hackmd.io/@n0ble/deprecate-exchgTC
2023-07-31 23:51:39 +00:00
Jimmy Chen
fc7f1ba6b9 Phase 0 attestation rewards via Beacon API (#4474)
## Issue Addressed

Addresses #4026.

Beacon-API spec [here](https://ethereum.github.io/beacon-APIs/?urls.primaryName=dev#/Beacon/getAttestationsRewards).

Endpoint: `POST /eth/v1/beacon/rewards/attestations/{epoch}`

This endpoint already supports post-Altair epochs. This PR adds support for phase 0 rewards calculation.

## Proposed Changes

- [x] Attestation rewards API to support phase 0 rewards calculation, re-using logic from `state_processing`. Refactored `get_attestation_deltas` slightly to support computing deltas for a subset of validators.
- [x] Add `inclusion_delay` to `ideal_rewards` (`beacon-API` spec update to follow)
- [x] Add `inactivity` penalties to both `ideal_rewards` and `total_rewards` (`beacon-API` spec update to follow)
- [x] Add tests to compute attestation rewards and compare results with beacon states 

## Additional Notes

- The extra penalty for missing attestations or being slashed during an inactivity leak is currently not included in the API response (for both phase 0 and Altair) in the spec. 
- I went with adding `inactivity` as a separate component rather than combining them with the 4 rewards, because this is how it was grouped in [the phase 0 spec](https://github.com/ethereum/consensus-specs/blob/dev/specs/phase0/beacon-chain.md#get_attestation_deltas). During inactivity leak, all rewards include the optimal reward, and inactivity penalties are calculated separately (see below code snippet from the spec), so it would be quite confusing if we merge them. This would also work better with Altair, because there's no "cancelling" of rewards and inactivity penalties are more separate.
- Altair calculation logic (to include inactivity penalties) to be updated in a follow-up PR.

```python
def get_attestation_deltas(state: BeaconState) -> Tuple[Sequence[Gwei], Sequence[Gwei]]:
    """
    Return attestation reward/penalty deltas for each validator.
    """
    source_rewards, source_penalties = get_source_deltas(state)
    target_rewards, target_penalties = get_target_deltas(state)
    head_rewards, head_penalties = get_head_deltas(state)
    inclusion_delay_rewards, _ = get_inclusion_delay_deltas(state)
    _, inactivity_penalties = get_inactivity_penalty_deltas(state)

    rewards = [
        source_rewards[i] + target_rewards[i] + head_rewards[i] + inclusion_delay_rewards[i]
        for i in range(len(state.validators))
    ]

    penalties = [
        source_penalties[i] + target_penalties[i] + head_penalties[i] + inactivity_penalties[i]
        for i in range(len(state.validators))
    ]

    return rewards, penalties
```

## Example API Response

<details>
  <summary>Click me</summary>
  
```json
{
  "ideal_rewards": [
    {
      "effective_balance": "1000000000",
      "head": "6638",
      "target": "6638",
      "source": "6638",
      "inclusion_delay": "9783",
      "inactivity": "0"
    },
    {
      "effective_balance": "2000000000",
      "head": "13276",
      "target": "13276",
      "source": "13276",
      "inclusion_delay": "19565",
      "inactivity": "0"
    },
    {
      "effective_balance": "3000000000",
      "head": "19914",
      "target": "19914",
      "source": "19914",
      "inclusion_delay": "29349",
      "inactivity": "0"
    },
    {
      "effective_balance": "4000000000",
      "head": "26553",
      "target": "26553",
      "source": "26553",
      "inclusion_delay": "39131",
      "inactivity": "0"
    },
    {
      "effective_balance": "5000000000",
      "head": "33191",
      "target": "33191",
      "source": "33191",
      "inclusion_delay": "48914",
      "inactivity": "0"
    },
    {
      "effective_balance": "6000000000",
      "head": "39829",
      "target": "39829",
      "source": "39829",
      "inclusion_delay": "58697",
      "inactivity": "0"
    },
    {
      "effective_balance": "7000000000",
      "head": "46468",
      "target": "46468",
      "source": "46468",
      "inclusion_delay": "68480",
      "inactivity": "0"
    },
    {
      "effective_balance": "8000000000",
      "head": "53106",
      "target": "53106",
      "source": "53106",
      "inclusion_delay": "78262",
      "inactivity": "0"
    },
    {
      "effective_balance": "9000000000",
      "head": "59744",
      "target": "59744",
      "source": "59744",
      "inclusion_delay": "88046",
      "inactivity": "0"
    },
    {
      "effective_balance": "10000000000",
      "head": "66383",
      "target": "66383",
      "source": "66383",
      "inclusion_delay": "97828",
      "inactivity": "0"
    },
    {
      "effective_balance": "11000000000",
      "head": "73021",
      "target": "73021",
      "source": "73021",
      "inclusion_delay": "107611",
      "inactivity": "0"
    },
    {
      "effective_balance": "12000000000",
      "head": "79659",
      "target": "79659",
      "source": "79659",
      "inclusion_delay": "117394",
      "inactivity": "0"
    },
    {
      "effective_balance": "13000000000",
      "head": "86298",
      "target": "86298",
      "source": "86298",
      "inclusion_delay": "127176",
      "inactivity": "0"
    },
    {
      "effective_balance": "14000000000",
      "head": "92936",
      "target": "92936",
      "source": "92936",
      "inclusion_delay": "136959",
      "inactivity": "0"
    },
    {
      "effective_balance": "15000000000",
      "head": "99574",
      "target": "99574",
      "source": "99574",
      "inclusion_delay": "146742",
      "inactivity": "0"
    },
    {
      "effective_balance": "16000000000",
      "head": "106212",
      "target": "106212",
      "source": "106212",
      "inclusion_delay": "156525",
      "inactivity": "0"
    },
    {
      "effective_balance": "17000000000",
      "head": "112851",
      "target": "112851",
      "source": "112851",
      "inclusion_delay": "166307",
      "inactivity": "0"
    },
    {
      "effective_balance": "18000000000",
      "head": "119489",
      "target": "119489",
      "source": "119489",
      "inclusion_delay": "176091",
      "inactivity": "0"
    },
    {
      "effective_balance": "19000000000",
      "head": "126127",
      "target": "126127",
      "source": "126127",
      "inclusion_delay": "185873",
      "inactivity": "0"
    },
    {
      "effective_balance": "20000000000",
      "head": "132766",
      "target": "132766",
      "source": "132766",
      "inclusion_delay": "195656",
      "inactivity": "0"
    },
    {
      "effective_balance": "21000000000",
      "head": "139404",
      "target": "139404",
      "source": "139404",
      "inclusion_delay": "205439",
      "inactivity": "0"
    },
    {
      "effective_balance": "22000000000",
      "head": "146042",
      "target": "146042",
      "source": "146042",
      "inclusion_delay": "215222",
      "inactivity": "0"
    },
    {
      "effective_balance": "23000000000",
      "head": "152681",
      "target": "152681",
      "source": "152681",
      "inclusion_delay": "225004",
      "inactivity": "0"
    },
    {
      "effective_balance": "24000000000",
      "head": "159319",
      "target": "159319",
      "source": "159319",
      "inclusion_delay": "234787",
      "inactivity": "0"
    },
    {
      "effective_balance": "25000000000",
      "head": "165957",
      "target": "165957",
      "source": "165957",
      "inclusion_delay": "244570",
      "inactivity": "0"
    },
    {
      "effective_balance": "26000000000",
      "head": "172596",
      "target": "172596",
      "source": "172596",
      "inclusion_delay": "254352",
      "inactivity": "0"
    },
    {
      "effective_balance": "27000000000",
      "head": "179234",
      "target": "179234",
      "source": "179234",
      "inclusion_delay": "264136",
      "inactivity": "0"
    },
    {
      "effective_balance": "28000000000",
      "head": "185872",
      "target": "185872",
      "source": "185872",
      "inclusion_delay": "273918",
      "inactivity": "0"
    },
    {
      "effective_balance": "29000000000",
      "head": "192510",
      "target": "192510",
      "source": "192510",
      "inclusion_delay": "283701",
      "inactivity": "0"
    },
    {
      "effective_balance": "30000000000",
      "head": "199149",
      "target": "199149",
      "source": "199149",
      "inclusion_delay": "293484",
      "inactivity": "0"
    },
    {
      "effective_balance": "31000000000",
      "head": "205787",
      "target": "205787",
      "source": "205787",
      "inclusion_delay": "303267",
      "inactivity": "0"
    },
    {
      "effective_balance": "32000000000",
      "head": "212426",
      "target": "212426",
      "source": "212426",
      "inclusion_delay": "313050",
      "inactivity": "0"
    }
  ],
  "total_rewards": [
    {
      "validator_index": "0",
      "head": "212426",
      "target": "212426",
      "source": "212426",
      "inclusion_delay": "313050",
      "inactivity": "0"
    },
    {
      "validator_index": "32",
      "head": "212426",
      "target": "212426",
      "source": "212426",
      "inclusion_delay": "313050",
      "inactivity": "0"
    },
    {
      "validator_index": "63",
      "head": "-357771",
      "target": "-357771",
      "source": "-357771",
      "inclusion_delay": "0",
      "inactivity": "0"
    }
  ]
}
```
</details>
2023-07-18 01:48:40 +00:00
Jimmy Chen
d4a61756ca CI fix: add retries to eth1 sim tests (#4501)
## Issue Addressed

This PR attempts to workaround the recent frequent eth1 simulator failures caused by missing eth logs from Anvil. 

> FailedToInsertDeposit(NonConsecutive { log_index: 1, expected: 0 })

This usually occurs at the beginning of the tests, and it guarantees a timeout after a few hours if this log shows up, and this is currently causing our CIs to fail quite frequently. 

Example failure here: https://github.com/sigp/lighthouse/actions/runs/5525760195/jobs/10079736914

## Proposed Changes

The quick fix applied here adds a timeout to node startup and restarts the node again.

- Add a 60 seconds timeout to beacon node startup in eth1 simulator tests. It takes ~10 seconds on my machine, but could take longer on CI runners.
- Wrap the startup code in a retry function, that allows for 3 retries before returning an error.

## Additional Info

We should probably raise an issue under the Anvil GitHub repo there so this can be further investigated.
2023-07-17 00:14:18 +00:00
Michael Sproul
5569d97a07 Remove wget dependency (#4497)
## Proposed Changes

Replace `wget` in the EF-tests makefile with `curl`.

On macOS `curl` is pre-installed, and I found myself making this change to avoid installing `wget`.

The `-L` flag is used to follow redirects which is useful if a repo gets renamed, and more similar to `wget`'s default behaviour.
2023-07-17 00:14:16 +00:00
Jimmy Chen
5cd738c882 Use unique arg names for eth1-sim (#4463)
## Issue Addressed

When trying to run `eth1-sim` locally, the simulator doesn't start for me, and panicked due to duplicate arg names for `proposer-nodes` (using same arg names as `nodes`). Not sure why this isn't failing on CI but failing on mine 🤔 

```
thread 'main' panicked at 'Argument short must be unique
thread 'main' panicked at 'Argument long must be unique
```
2023-07-17 00:14:13 +00:00
Jimmy Chen
46be05f728 Cache target attester balances for unrealized FFG progression calculation (#4362)
## Issue Addressed

#4118 

## Proposed Changes

This PR introduces a "progressive balances" cache on the `BeaconState`, which keeps track of the accumulated target attestation balance for the current & previous epochs. The cached values are utilised by fork choice to calculate unrealized justification and finalization (instead of converting epoch participation arrays to balances for each block we receive).

This optimization will be rolled out gradually to allow for more testing. A new `--progressive-balances disabled|checked|strict|fast` flag is introduced to support this:
- `checked`: enabled with checks against participation cache, and falls back to the existing epoch processing calculation if there is a total target attester balance mismatch. There is no performance gain from this as the participation cache still needs to be computed. **This is the default mode for now.**
- `strict`: enabled with checks against participation cache, returns error if there is a mismatch. **Used for testing only**.
- `fast`: enabled with no comparative checks and without computing the participation cache. This mode gives us the performance gains from the optimization. This is still experimental and not currently recommended for production usage, but will become the default mode in a future release.
- `disabled`: disable the usage of progressive cache, and use the existing method for FFG progression calculation. This mode may be useful if we find a bug and want to stop the frequent error logs.

### Tasks

- [x] Initial cache implementation in `BeaconState`
- [x] Perform checks in fork choice to compare the progressive balances cache against results from `ParticipationCache`
- [x] Add CLI flag, and disable the optimization by default
- [x] Testing on Goerli & Benchmarking
- [x]  Move caching logic from state processing to the `ProgressiveBalancesCache` (see [this comment](https://github.com/sigp/lighthouse/pull/4362#discussion_r1230877001))
- [x] Add attesting balance metrics



Co-authored-by: Jimmy Chen <jimmy@sigmaprime.io>
2023-06-30 01:13:06 +00:00
Jack McPherson
1aff082eea Add broadcast validation routes to Beacon Node HTTP API (#4316)
## Issue Addressed

 - #4293 
 - #4264 

## Proposed Changes

*Changes largely follow those suggested in the main issue*.

 - Add new routes to HTTP API
   - `post_beacon_blocks_v2`
   - `post_blinded_beacon_blocks_v2`
 - Add new routes to `BeaconNodeHttpClient`
   - `post_beacon_blocks_v2`
   - `post_blinded_beacon_blocks_v2`
 - Define new Eth2 common types
   - `BroadcastValidation`, enum representing the level of validation to apply to blocks prior to broadcast
   - `BroadcastValidationQuery`, the corresponding HTTP query string type for the above type
 - ~~Define `_checked` variants of both `publish_block` and `publish_blinded_block` that enforce a validation level at a type level~~
 - Add interactive tests to the `bn_http_api_tests` test target covering each validation level (to their own test module, `broadcast_validation_tests`)
   - `beacon/blocks`
       - `broadcast_validation=gossip`
         - Invalid (400)
         - Full Pass (200)
         - Partial Pass (202)
        - `broadcast_validation=consensus`
          - Invalid (400)
          - Only gossip (400)
          - Only consensus pass (i.e., equivocates) (200)
          - Full pass (200)
        - `broadcast_validation=consensus_and_equivocation`
          - Invalid (400)
          - Invalid due to early equivocation (400)
          - Only gossip (400)
          - Only consensus (400)
          - Pass (200)
   - `beacon/blinded_blocks`
       - `broadcast_validation=gossip`
         - Invalid (400)
         - Full Pass (200)
         - Partial Pass (202)
        - `broadcast_validation=consensus`
          - Invalid (400)
          - Only gossip (400)
          - ~~Only consensus pass (i.e., equivocates) (200)~~
          - Full pass (200)
        - `broadcast_validation=consensus_and_equivocation`
          - Invalid (400)
          - Invalid due to early equivocation (400)
          - Only gossip (400)
          - Only consensus (400)
          - Pass (200)
 - Add a new trait, `IntoGossipVerifiedBlock`, which allows type-level guarantees to be made as to gossip validity
 - Modify the structure of the `ObservedBlockProducers` cache from a `(slot, validator_index)` mapping to a `((slot, validator_index), block_root)` mapping
 - Modify `ObservedBlockProducers::proposer_has_been_observed` to return a `SeenBlock` rather than a boolean on success
 - Punish gossip peer (low) for submitting equivocating blocks
 - Rename `BlockError::SlashablePublish` to `BlockError::SlashableProposal`

## Additional Info

This PR contains changes that directly modify how blocks are verified within the client. For more context, consult [comments in-thread](https://github.com/sigp/lighthouse/pull/4316#discussion_r1234724202).


Co-authored-by: Michael Sproul <michael@sigmaprime.io>
2023-06-29 12:02:38 +00:00
Michael Sproul
affea585f4 Remove CountUnrealized (#4357)
## Issue Addressed

Closes #4332

## Proposed Changes

Remove the `CountUnrealized` type, defaulting unrealized justification to _on_. This fixes the #4332 issue by ensuring that importing the same block to fork choice always results in the same outcome.

Finalized sync speed may be slightly impacted by this change, but that is deemed an acceptable trade-off until the optimisation from #4118 is implemented.

TODO:

- [x] Also check that the block isn't a duplicate before importing
2023-06-16 06:44:31 +00:00
Michael Sproul
4af4e98c82 Update Nethermind (#4361)
## Issue Addressed

Nethermind integration tests are failing with a compilation error: https://github.com/sigp/lighthouse/actions/runs/5138586473/jobs/9248079945

This PR updates Nethermind to the latest release to hopefully fix that
2023-06-02 03:17:39 +00:00
Age Manning
aa1ed787e9 Logging via the HTTP API (#4074)
This PR adds the ability to read the Lighthouse logs from the HTTP API for both the BN and the VC. 

This is done in such a way to as minimize any kind of performance hit by adding this feature.

The current design creates a tokio broadcast channel and mixes is into a form of slog drain that combines with our main global logger drain, only if the http api is enabled. 

The drain gets the logs, checks the log level and drops them if they are below INFO. If they are INFO or higher, it sends them via a broadcast channel only if there are users subscribed to the HTTP API channel. If not, it drops the logs. 

If there are more than one subscriber, the channel clones the log records and converts them to json in their independent HTTP API tasks. 

Co-authored-by: Michael Sproul <micsproul@gmail.com>
2023-05-22 05:57:08 +00:00
Pawan Dhananjay
8a3eb4df9c Replace ganache-cli with anvil (#3555)
## Issue Addressed

N/A

## Proposed Changes

Replace ganache-cli with anvil https://github.com/foundry-rs/foundry/blob/master/anvil/README.md
We can lose all js dependencies in CI as a consequence.

## Additional info
Also changes the ethers-rs version used in the execution layer (for the transaction reconstruction) to a newer one. This was necessary to get use the ethers utils for anvil. The fixed execution engine integration tests should catch any potential issues with the payload reconstruction after #3592 


Co-authored-by: Michael Sproul <michael@sigmaprime.io>
2023-05-15 07:22:02 +00:00
Jimmy Chen
8d9c748025 Fix attestation withdrawals root mismatch (#4249)
## Issue Addressed

Addresses #4234 

## Proposed Changes

- Skip withdrawals processing in an inconsistent state replay. 
- Repurpose `StateRootStrategy`: rename to `StateProcessingStrategy` and always skip withdrawals if using `StateProcessingStrategy::Inconsistent`
- Add a test to reproduce the scenario


Co-authored-by: Jimmy Chen <jimmy@sigmaprime.io>
2023-05-09 10:48:15 +00:00
Michael Sproul
c11638c36c Split common crates out into their own repos (#3890)
## Proposed Changes

Split out several crates which now exist in separate repos under `sigp`.

- [`ssz` and `ssz_derive`](https://github.com/sigp/ethereum_ssz)
- [`tree_hash` and `tree_hash_derive`](https://github.com/sigp/tree_hash)
- [`ethereum_hashing`](https://github.com/sigp/ethereum_hashing)
- [`ethereum_serde_utils`](https://github.com/sigp/ethereum_serde_utils)
- [`ssz_types`](https://github.com/sigp/ssz_types)

For the published crates see: https://crates.io/teams/github:sigp:crates-io?sort=recent-updates.

## Additional Info

- [x] Need to work out how to handle versioning. I was hoping to do 1.0 versions of several crates, but if they depend on `ethereum-types 0.x` that is not going to work. EDIT: decided to go with 0.5.x versions.
- [x] Need to port several changes from `tree-states`, `capella`, `eip4844` branches to the external repos.
2023-04-28 01:15:40 +00:00
Age Manning
7456e1e8fa Separate BN for block proposals (#4182)
It is a well-known fact that IP addresses for beacon nodes used by specific validators can be de-anonymized. There is an assumed risk that a malicious user may attempt to DOS validators when producing blocks to prevent chain growth/liveness.

Although there are a number of ideas put forward to address this, there a few simple approaches we can take to mitigate this risk.

Currently, a Lighthouse user is able to set a number of beacon-nodes that their validator client can connect to. If one beacon node is taken offline, it can fallback to another. Different beacon nodes can use VPNs or rotate IPs in order to mask their IPs.

This PR provides an additional setup option which further mitigates attacks of this kind.

This PR introduces a CLI flag --proposer-only to the beacon node. Setting this flag will configure the beacon node to run with minimal peers and crucially will not subscribe to subnets or sync committees. Therefore nodes of this kind should not be identified as nodes connected to validators of any kind.

It also introduces a CLI flag --proposer-nodes to the validator client. Users can then provide a number of beacon nodes (which may or may not run the --proposer-only flag) that the Validator client will use for block production and propagation only. If these nodes fail, the validator client will fallback to the default list of beacon nodes.

Users are then able to set up a number of beacon nodes dedicated to block proposals (which are unlikely to be identified as validator nodes) and point their validator clients to produce blocks on these nodes and attest on other beacon nodes. An attack attempting to prevent liveness on the eth2 network would then need to preemptively find and attack the proposer nodes which is significantly more difficult than the default setup.

This is a follow on from: #3328 

Co-authored-by: Michael Sproul <michael@sigmaprime.io>
Co-authored-by: Paul Hauner <paul@paulhauner.com>
2023-04-26 01:12:36 +00:00
Jimmy Chen
434386774e Bump Rust version (MSRV) (#4204)
## Issue Addressed

There was a [`VecDeque` bug](https://github.com/rust-lang/rust/issues/108453) in some recent versions of the Rust standard library (1.67.0 & 1.67.1) that could cause Lighthouse to panic (reported by `@Sea Monkey` on discord). See full logs below.

The issue was likely introduced in Rust 1.67.0 and [fixed](https://github.com/rust-lang/rust/pull/108475) in 1.68, and we were able to reproduce the panic ourselves using [@michaelsproul's fuzz tests](https://github.com/michaelsproul/lighthouse/blob/fuzz-lru-time-cache/beacon_node/lighthouse_network/src/peer_manager/fuzz.rs#L111) on both Rust 1.67.0 and 1.67.1. 

Users that uses our Docker images or binaries are unlikely affected, as our Docker images were built with `1.66`, and latest binaries were built with latest stable (`1.68.2`). It likely impacts user that builds from source using Rust versions 1.67.x.

## Proposed Changes

Bump Rust version (MSRV) to latest stable `1.68.2`. 

## Additional Info

From `@Sea Monkey` on Lighthouse Discord:

> Crash on goerli using `unstable` `dd124b2d6804d02e4e221f29387a56775acccd08`

```
thread 'tokio-runtime-worker' panicked at 'Key must exist', /mnt/goerli/goerli/lighthouse/common/lru_cache/src/time.rs:68:28
stack backtrace:
Apr 15 09:37:36.993 WARN Peer sent invalid block in single block lookup, peer_id: 16Uiu2HAm6ZuyJpVpR6y51X4Enbp8EhRBqGycQsDMPX7e5XfPYznG, error: WouldRevertFinalizedSlot { block_slot: Slot(5420212), finalized_slot: Slot(5420224) }, root: 0x10f6…3165, service: sync
   0: rust_begin_unwind
             at /rustc/d5a82bbd26e1ad8b7401f6a718a9c57c96905483/library/std/src/panicking.rs:575:5
   1: core::panicking::panic_fmt
             at /rustc/d5a82bbd26e1ad8b7401f6a718a9c57c96905483/library/core/src/panicking.rs:64:14
   2: core::panicking::panic_display
             at /rustc/d5a82bbd26e1ad8b7401f6a718a9c57c96905483/library/core/src/panicking.rs:135:5
   3: core::panicking::panic_str
             at /rustc/d5a82bbd26e1ad8b7401f6a718a9c57c96905483/library/core/src/panicking.rs:119:5
   4: core::option::expect_failed
             at /rustc/d5a82bbd26e1ad8b7401f6a718a9c57c96905483/library/core/src/option.rs:1879:5
   5: lru_cache::time::LRUTimeCache<Key>::raw_remove
   6: lighthouse_network::peer_manager::PeerManager<TSpec>::handle_ban_operation
   7: lighthouse_network::peer_manager::PeerManager<TSpec>::handle_score_action
   8: lighthouse_network::peer_manager::PeerManager<TSpec>::report_peer
   9: network::service::NetworkService<T>::spawn_service::{{closure}}
  10: <futures_util::future::select::Select<A,B> as core::future::future::Future>::poll
  11: <futures_util::future::future::map::Map<Fut,F> as core::future::future::Future>::poll
  12: <futures_util::future::future::flatten::Flatten<Fut,<Fut as core::future::future::Future>::Output> as core::future::future::Future>::poll
  13: tokio::loom::std::unsafe_cell::UnsafeCell<T>::with_mut
  14: tokio::runtime::task::core::Core<T,S>::poll
  15: tokio::runtime::task::harness::Harness<T,S>::poll
  16: tokio::runtime::scheduler::multi_thread::worker::Context::run_task
  17: tokio::runtime::scheduler::multi_thread::worker::Context::run
  18: tokio::macros::scoped_tls::ScopedKey<T>::set
  19: tokio::runtime::scheduler::multi_thread::worker::run
  20: tokio::loom::std::unsafe_cell::UnsafeCell<T>::with_mut
  21: tokio::runtime::task::core::Core<T,S>::poll
  22: tokio::runtime::task::harness::Harness<T,S>::poll
  23: tokio::runtime::blocking::pool::Inner::run
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
Apr 15 09:37:37.069 INFO Saved DHT state                         service: network
Apr 15 09:37:37.070 INFO Network service shutdown                service: network
Apr 15 09:37:37.132 CRIT Task panic. This is a bug!              advice: Please check above for a backtrace and notify the developers, message: <none>, task_name: network
Apr 15 09:37:37.132 INFO Internal shutdown received              reason: Panic (fatal error)
Apr 15 09:37:37.133 INFO Shutting down..                         reason: Failure("Panic (fatal error)")
Apr 15 09:37:37.135 WARN Unable to free worker                   error: channel closed, msg: did not free worker, shutdown may be underway
Apr 15 09:37:39.350 INFO Saved beacon chain to disk              service: beacon
Panic (fatal error)
```
2023-04-18 02:47:37 +00:00
Michael Sproul
1d92e3f77c Use efficient payload reconstruction for HTTP API (#4102)
## Proposed Changes

Builds on #4028 to use the new payload bodies methods in the HTTP API as well.

## Caveats

The payloads by range method only works for the finalized chain, so it can't be used in the execution engine integration tests because we try to reconstruct unfinalized payloads there.
2023-04-18 02:47:35 +00:00
Jimmy Chen
4d17fb3af6 CI fix: move download web3signer binary out of build script (#4163)
## Issue Addressed

Attempt to fix #3812 

## Proposed Changes

Move web3signer binary download script out of build script to avoid downloading unless necessary. If this works, it should also reduce the build time for all jobs that runs compilation.
2023-04-06 06:36:21 +00:00
Jimmy Chen
2de3451011 Rate limiting backfill sync (#3936)
## Issue Addressed

#3212 

## Proposed Changes

- Introduce a new `rate_limiting_backfill_queue` - any new inbound backfill work events gets immediately sent to this FIFO queue **without any processing**
- Spawn a `backfill_scheduler` routine that pops a backfill event from the FIFO queue at specified intervals (currently halfway through a slot, or at 6s after slot start for 12s slots) and sends the event to `BeaconProcessor` via a `scheduled_backfill_work_tx` channel
- This channel gets polled last in the `InboundEvents`, and work event received is  wrapped in a `InboundEvent::ScheduledBackfillWork` enum variant, which gets processed immediately or queued by the `BeaconProcessor` (existing logic applies from here)

Diagram comparing backfill processing with / without rate-limiting: 
https://github.com/sigp/lighthouse/issues/3212#issuecomment-1386249922

See this comment for @paulhauner's  explanation and solution: https://github.com/sigp/lighthouse/issues/3212#issuecomment-1384674956

## Additional Info

I've compared this branch (with backfill processing rate limited to to 1 and 3 batches per slot) against the latest stable version. The CPU usage during backfill sync is reduced by ~5% - 20%, more details on this page:

https://hackmd.io/@jimmygchen/SJuVpJL3j

The above testing is done on Goerli (as I don't currently have hardware for Mainnet), I'm guessing the differences are likely to be bigger on mainnet due to block size.

### TODOs

- [x] Experiment with processing multiple batches per slot. (need to think about how to do this for different slot durations)
- [x] Add option to disable rate-limiting, enabed by default.
- [x] (No longer required now we're reusing the reprocessing queue) Complete the `backfill_scheduler` task when backfill sync is completed or not required
2023-04-03 03:02:55 +00:00
Paul Hauner
1f8c17b530 Fork choice modifications and cleanup (#3962)
## Issue Addressed

NA

## Proposed Changes

- Implements https://github.com/ethereum/consensus-specs/pull/3290/
- Bumps `ef-tests` to [v1.3.0-rc.4](https://github.com/ethereum/consensus-spec-tests/releases/tag/v1.3.0-rc.4).

The `CountRealizedFull` concept has been removed and the `--count-unrealized-full` and `--count-unrealized` BN flags now do nothing but log a `WARN` when used.

## Database Migration Debt

This PR removes the `best_justified_checkpoint` from fork choice. This field is persisted on-disk and the correct way to go about this would be to make a DB migration to remove the field. However, in this PR I've simply stubbed out the value with a junk value. I've taken this approach because if we're going to do a DB migration I'd love to remove the `Option`s around the justified and finalized checkpoints on `ProtoNode` whilst we're at it. Those options were added in #2822 which was included in Lighthouse v2.1.0. The options were only put there to handle the migration and they've been set to `Some` ever since v2.1.0. There's no reason to keep them as options anymore.

I started adding the DB migration to this branch but I started to feel like I was bloating this rather critical PR with nice-to-haves. I've kept the partially-complete migration [over in my repo](https://github.com/paulhauner/lighthouse/tree/fc-pr-18-migration) so we can pick it up after this PR is merged.
2023-03-21 07:34:41 +00:00
ethDreamer
65a5eb8292 Reconstruct Payloads using Payload Bodies Methods (#4028)
## Issue Addressed

* #3895 

Co-authored-by: ethDreamer <37123614+ethDreamer@users.noreply.github.com>
Co-authored-by: Michael Sproul <michael@sigmaprime.io>
2023-03-19 23:15:59 +00:00
Divma
e190ebb8a0 Support for Ipv6 (#4046)
## Issue Addressed
Add support for ipv6 and dual stack in lighthouse. 

## Proposed Changes
From an user perspective, now setting an ipv6 address, optionally configuring the ports should feel exactly the same as using an ipv4 address. If listening over both ipv4 and ipv6 then the user needs to:
- use the `--listen-address` two times (ipv4 and ipv6 addresses)
- `--port6` becomes then required
- `--discovery-port6` can now be used to additionally configure the ipv6 udp port

### Rough list of code changes
- Discovery:
  - Table filter and ip mode set to match the listening config. 
  - Ipv6 address, tcp port and udp port set in the ENR builder
  - Reported addresses now check which tcp port to give to libp2p
- LH Network Service:
  - Can listen over Ipv6, Ipv4, or both. This uses two sockets. Using mapped addresses is disabled from libp2p and it's the most compatible option.
- NetworkGlobals:
  - No longer stores udp port since was not used at all. Instead, stores the Ipv4 and Ipv6 TCP ports.
- NetworkConfig:
  - Update names to make it clear that previous udp and tcp ports in ENR were Ipv4
  - Add fields to configure Ipv6 udp and tcp ports in the ENR
  - Include advertised enr Ipv6 address.
  - Add type to model Listening address that's either Ipv4, Ipv6 or both. A listening address includes the ip, udp port and tcp port.
- UPnP:
  - Kept only for ipv4
- Cli flags:
  - `--listen-addresses` now can take up to two values
  - `--port` will apply to ipv4 or ipv6 if only one listening address is given. If two listening addresses are given it will apply only to Ipv4.
  - `--port6` New flag required when listening over ipv4 and ipv6 that applies exclusively to Ipv6.
  - `--discovery-port` will now apply to ipv4 and ipv6 if only one listening address is given.
  - `--discovery-port6` New flag to configure the individual udp port of ipv6 if listening over both ipv4 and ipv6.
  - `--enr-udp-port` Updated docs to specify that it only applies to ipv4. This is an old behaviour.
  - `--enr-udp6-port` Added to configure the enr udp6 field.
  - `--enr-tcp-port` Updated docs to specify that it only applies to ipv4. This is an old behaviour.
  - `--enr-tcp6-port` Added to configure the enr tcp6 field.
  - `--enr-addresses` now can take two values.
  - `--enr-match` updated behaviour.
- Common:
  - rename `unused_port` functions to specify that they are over ipv4.
  - add functions to get unused ports over ipv6.
- Testing binaries
  - Updated code to reflect network config changes and unused_port changes.

## Additional Info

TODOs:
- use two sockets in discovery. I'll get back to this and it's on https://github.com/sigp/discv5/pull/160
- lcli allow listening over two sockets in generate_bootnodes_enr
- add at least one smoke flag for ipv6 (I have tested this and works for me)
- update the book
2023-03-14 01:13:34 +00:00
Michael Sproul
90cef1db86 Appease Clippy 1.68 and refactor http_api (#4068)
## Proposed Changes

Two tiny updates to satisfy Clippy 1.68

Plus refactoring of the `http_api` into less complex types so the compiler can chew and digest them more easily.

Co-authored-by: Michael Sproul <michael@sigmaprime.io>
2023-03-13 01:40:03 +00:00
Paul Hauner
cac3a66be4 Permit a null LVH from an INVALID response to newPayload (#4037)
## Issue Addressed

NA

## Proposed Changes

As discovered in #4034, Lighthouse is not accepting `latest_valid_hash == None` in an `INVALID` response to `newPayload`. The `null`/`None` response *was* illegal at one point, however it was added in https://github.com/ethereum/execution-apis/pull/254.

This PR brings Lighthouse in line with the standard and should fix the root cause of what #4034 patched around.

## Additional Info

NA
2023-03-03 04:12:50 +00:00
Michael Sproul
ca1ce381a9 Delete Kiln and Ropsten configs (#4038)
## Proposed Changes

Remove built-in support for Ropsten and Kiln via the `--network` flag. Both testnets are long dead and deprecated.

This shaves about 30MiB off the binary size, from 135MiB to 103MiB (maxperf), or 165MiB to 135MiB (release).
2023-03-01 06:16:14 +00:00
Divma
047c7544e3 Clean capella (#4019)
## Issue Addressed

Cleans up all the remnants of 4844 in capella. This makes sure when 4844 is reviewed there is nothing we are missing because it got included here 

## Proposed Changes

drop a bomb on every 4844 thing 

## Additional Info

Merge process I did (locally) is as follows:
- squash merge to produce one commit
- in new branch off unstable with the squashed commit create a `git revert HEAD` commit
- merge that new branch onto 4844 with `--strategy ours`
- compare local 4844 to remote 4844 and make sure the diff is empty
- enjoy

Co-authored-by: Paul Hauner <paul@paulhauner.com>
2023-03-01 03:19:02 +00:00
Paul Hauner
caa6190d4a Use consensus-spec-tests v1.3.0-rc.3 (#4021)
## Issue Addressed

NA

## Proposed Changes

Updates our `ef_tests` to use: https://github.com/ethereum/consensus-specs/releases/tag/v1.3.0-rc.3

This required:

- Skipping a `merkle_proof_validity` test (see #4022)
- Account for the `eip4844` tests changing name to `deneb`
    - My IDE did some Python linting during this change. It seemed simple and nice so I left it there.

## Additional Info

NA
2023-02-28 02:20:51 +00:00
Age Manning
be29394a9d Execution Integration Tests Correction (#4034)
The execution integration tests are currently failing.

This is a quick modification to pin the execution client version to correct the tests.
2023-02-27 21:26:16 +00:00
Michael Sproul
066c27750a
Merge remote-tracking branch 'origin/staging' into capella-update 2023-02-17 12:05:36 +11:00
Michael Sproul
ebf2fec5d0 Fix exec integration tests for Geth v1.11.0 (#3982)
## Proposed Changes

* Bump Go from 1.17 to 1.20. The latest Geth release v1.11.0 requires 1.18 minimum.
* Prevent a cache miss during payload building by using the right fee recipient. This prevents Geth v1.11.0 from building a block with 0 transactions. The payload building mechanism is overhauled in the new Geth to improve the payload every 2s, and the tests were failing because we were falling back on a `getPayload` call with no lookahead due to `get_payload_id` cache miss caused by the mismatched fee recipient. Alternatively we could hack the tests to send `proposer_preparation_data`, but I think the static fee recipient is simpler for now.
* Add support for optionally enabling Lighthouse logs in the integration tests. Enable using `cargo run --release --features logging/test_logger`. This was very useful for debugging.
2023-02-16 23:34:33 +00:00
ethDreamer
7b7595347d
exchangeCapabilities & Capella Readiness Logging (#3918)
* Undo Passing Spec to Engine API

* Utilize engine_exchangeCapabilities

* Add Logging to Indicate Capella Readiness

* Add exchangeCapabilities to mock_execution_layer

* Send Nested Array for engine_exchangeCapabilities

* Use Mutex Instead of RwLock for EngineCapabilities

* Improve Locking to Avoid Deadlock

* Prettier logic for get_engine_capabilities

* Improve Comments

* Update beacon_node/beacon_chain/src/capella_readiness.rs

Co-authored-by: Michael Sproul <micsproul@gmail.com>

* Update beacon_node/beacon_chain/src/capella_readiness.rs

Co-authored-by: Michael Sproul <micsproul@gmail.com>

* Update beacon_node/beacon_chain/src/capella_readiness.rs

Co-authored-by: Michael Sproul <micsproul@gmail.com>

* Update beacon_node/beacon_chain/src/capella_readiness.rs

Co-authored-by: Michael Sproul <micsproul@gmail.com>

* Update beacon_node/beacon_chain/src/capella_readiness.rs

Co-authored-by: Michael Sproul <micsproul@gmail.com>

* Update beacon_node/client/src/notifier.rs

Co-authored-by: Michael Sproul <micsproul@gmail.com>

* Update beacon_node/execution_layer/src/engine_api/http.rs

Co-authored-by: Michael Sproul <micsproul@gmail.com>

* Addressed Michael's Comments

---------

Co-authored-by: Michael Sproul <micsproul@gmail.com>
2023-01-31 18:26:23 +01:00
Michael Sproul
bb0e99c097 Merge remote-tracking branch 'origin/unstable' into capella 2023-01-21 10:37:26 +11:00
realbigsean
208f531ae7 update antithesis dockerfile (#3883)
Resolves https://github.com/sigp/lighthouse/issues/3879


Co-authored-by: realbigsean <sean@sigmaprime.io>
2023-01-20 00:46:55 +00:00
Mac L
6ac1c5b439 Add CLI flag to specify the format of logs written to the logfile (#3839)
## Proposed Changes

Decouple the stdout and logfile formats by adding the `--logfile-format` CLI flag.
This behaves identically to the existing `--log-format` flag, but instead will only affect the logs written to the logfile.
The `--log-format` flag will no longer have any effect on the contents of the logfile.

## Additional Info
This avoids being a breaking change by causing `logfile-format` to default to the value of `--log-format` if it is not provided. 
This means that users who were previously relying on being able to use a JSON formatted logfile will be able to continue to use `--log-format JSON`. 

Users who want to use JSON on stdout and default logs in the logfile, will need to pass the following flags: `--log-format JSON --logfile-format DEFAULT`
2023-01-16 03:42:10 +00:00
realbigsean
302eaca344
intentionally skip LightClientHeader ssz static tests 2023-01-13 16:15:27 -05:00
realbigsean
35c9e2407b
bump ef-tests 2023-01-13 15:11:46 -05:00
Michael Sproul
2af8110529
Merge remote-tracking branch 'origin/unstable' into capella
Fixing the conflicts involved patching up some of the `block_hash` verification,
the rest will be done as part of https://github.com/sigp/lighthouse/issues/3870
2023-01-12 16:22:00 +11:00
ethDreamer
52c1055fdc
Remove withdrawals-processing feature (#3864)
* Use spec to Determine Supported Engine APIs

* Remove `withdrawals-processing` feature

* Fixed Tests

* Missed Some Spots

* Fixed Another Test

* Stupid Clippy
2023-01-12 15:15:08 +11:00
realbigsean
98b11bbd3f
add historical summaries (#3865)
* add historical summaries

* fix tree hash caching, disable the sanity slots test with fake crypto

* add ssz static HistoricalSummary

* only store historical summaries after capella

* Teach `UpdatePattern` about Capella

* Tidy EF tests

* Clippy

Co-authored-by: Michael Sproul <michael@sigmaprime.io>
2023-01-11 12:40:21 +11:00
Michael Sproul
4bd2b777ec Verify execution block hashes during finalized sync (#3794)
## Issue Addressed

Recent discussions with other client devs about optimistic sync have revealed a conceptual issue with the optimisation implemented in #3738. In designing that feature I failed to consider that the execution node checks the `blockHash` of the execution payload before responding with `SYNCING`, and that omitting this check entirely results in a degradation of the full node's validation. A node omitting the `blockHash` checks could be tricked by a supermajority of validators into following an invalid chain, something which is ordinarily impossible.

## Proposed Changes

I've added verification of the `payload.block_hash` in Lighthouse. In case of failure we log a warning and fall back to verifying the payload with the execution client.

I've used our existing dependency on `ethers_core` for RLP support, and a new dependency on Parity's `triehash` crate for the Merkle patricia trie. Although the `triehash` crate is currently unmaintained it seems like our best option at the moment (it is also used by Reth, and requires vastly less boilerplate than Parity's generic `trie-root` library).

Block hash verification is pretty quick, about 500us per block on my machine (mainnet).

The optimistic finalized sync feature can be disabled using `--disable-optimistic-finalized-sync` which forces full verification with the EL.

## Additional Info

This PR also introduces a new dependency on our [`metastruct`](https://github.com/sigp/metastruct) library, which was perfectly suited to the RLP serialization method. There will likely be changes as `metastruct` grows, but I think this is a good way to start dogfooding it.

I took inspiration from some Parity and Reth code while writing this, and have preserved the relevant license headers on the files containing code that was copied and modified.
2023-01-09 03:11:59 +00:00
ethDreamer
cb94f639b0
Isolate withdrawals-processing Feature (#3854) 2023-01-09 11:05:28 +11:00
Mark Mackey
8711db2f3b Fix EF Tests 2023-01-04 15:14:43 -06:00
Mark Mackey
c188cde034
merge upstream/unstable 2022-12-28 14:43:25 -06:00
Michael Sproul
59a7a4703c Various CI fixes (#3813)
## Issue Addressed

Closes #3812
Closes #3750
Closes #3705
2022-12-20 01:34:52 +00:00
Mark Mackey
b75ca74222 Removed withdrawals feature flag 2022-12-19 15:38:46 -06:00
Divma
ffbf70e2d9 Clippy lints for rust 1.66 (#3810)
## Issue Addressed
Fixes the new clippy lints for rust 1.66

## Proposed Changes

Most of the changes come from:
- [unnecessary_cast](https://rust-lang.github.io/rust-clippy/master/index.html#unnecessary_cast)
- [iter_kv_map](https://rust-lang.github.io/rust-clippy/master/index.html#iter_kv_map)
- [needless_borrow](https://rust-lang.github.io/rust-clippy/master/index.html#needless_borrow)

## Additional Info

na
2022-12-16 04:04:00 +00:00
Michael Sproul
558367ab8c
Bounded withdrawals and spec v1.3.0-alpha.2 (#3802) 2022-12-16 09:20:45 +11:00
ethDreamer
07d6ef749a
Fixed Payload Reconstruction Bug (#3796) 2022-12-14 11:49:30 +11:00
ethDreamer
b1c33361ea
Fixed Clippy Complaints & Some Failing Tests (#3791)
* Fixed Clippy Complaints & Some Failing Tests
* Update Dockerfile to Rust-1.65
* EF test file renamed
* Touch up comments based on feedback
2022-12-13 10:50:24 -06:00