Commit Graph

514 Commits

Author SHA1 Message Date
Pawan Dhananjay
b276af98b7
Rework block processing (#4092)
* introduce availability pending block

* add intoavailableblock trait

* small fixes

* add 'gossip blob cache' and start to clean up processing and transition types

* shard memory blob cache

* Initial commit

* Fix after rebase

* Add gossip verification conditions

* cache cleanup

* general chaos

* extended chaos

* cargo fmt

* more progress

* more progress

* tons of changes, just tryna compile

* everything, everywhere, all at once

* Reprocess an ExecutedBlock on unavailable blobs

* Add sus gossip verification for blobs

* Merge stuff

* Remove reprocessing cache stuff

* lint

* Add a wrapper to allow construction of only valid `AvailableBlock`s

* rename blob arc list to blob list

* merge cleanuo

* Revert "merge cleanuo"

This reverts commit 5e98326878c77528d0c4668c5a4db4a4b0fbaeaa.

* Revert "Revert "merge cleanuo""

This reverts commit 3a4009443a5812b3028abe855079307436dc5419.

* fix rpc methods

* move beacon block and blob to eth2/types

* rename gossip blob cache to data availability checker

* lots of changes

* fix some compilation issues

* fix compilation issues

* fix compilation issues

* fix compilation issues

* fix compilation issues

* fix compilation issues

* cargo fmt

* use a common data structure for block import types

* fix availability check on proposal import

* refactor the blob cache and split the block wrapper into two types

* add type conversion for signed block and block wrapper

* fix beacon chain tests and do some renaming, add some comments

* Partial processing (#4)

* move beacon block and blob to eth2/types

* rename gossip blob cache to data availability checker

* lots of changes

* fix some compilation issues

* fix compilation issues

* fix compilation issues

* fix compilation issues

* fix compilation issues

* fix compilation issues

* cargo fmt

* use a common data structure for block import types

* fix availability check on proposal import

* refactor the blob cache and split the block wrapper into two types

* add type conversion for signed block and block wrapper

* fix beacon chain tests and do some renaming, add some comments

* cargo update (#6)

---------

Co-authored-by: realbigsean <sean@sigmaprime.io>
Co-authored-by: realbigsean <seananderson33@gmail.com>
2023-03-24 17:30:41 -04:00
Diva M
25a2d8f078
Merge branch 'eip4844' into deneb-free-blobs 2023-03-24 14:38:29 -05:00
Diva M
1b9cfcc11b
Merge branch 'unstable' into eip4844 2023-03-24 13:32:50 -05:00
Diva M
7fad926b65
Merge commit '65a5eb829264cb279ed66814c961991ae3a0a04b' into eip4844 2023-03-24 13:24:21 -05:00
Paul Hauner
b2525d6ebd
Release Candidate v4.0.1-rc.0 (#4123) 2023-03-23 21:16:14 +11:00
Paul Hauner
0616e01202 Release v4.0.0 (#4112)
## Issue Addressed

NA

## Proposed Changes

Bump versions to `v4.0.0`

## Additional Info

NA
2023-03-22 04:06:02 +00:00
Age Manning
76a2007b64 Improve Lighthouse Connectivity Via ENR TCP Update (#4057)
Currently Lighthouse will remain uncontactable if users port forward a port that is not the same as the one they are listening on. 

For example, if Lighthouse runs with port 9000 TCP/UDP locally but a router is configured to pass 9010 externally to the lighthouse node on 9000, other nodes on the network will not be able to reach the lighthouse node. 

This occurs because Lighthouse does not update its ENR TCP port on external socket discovery. The intention was always that users should use `--enr-tcp-port` to customise this, but this is non-intuitive. 

The difficulty arises because we have no discovery mechanism to find our external TCP port. If we discovery a new external UDP port, we must guess what our external TCP port might be. This PR assumes the external TCP port is the same as the external UDP port (which may not be the case) and thus updates the TCP port along with the UDP port if the `--enr-tcp-port` flag is not set. 

Along with this PR, will be added documentation to the Lighthouse book so users can correctly understand and configure their ENR to maximize Lighthouse's connectivity. 

This relies on https://github.com/sigp/discv5/pull/166 and we should wait for a new release in discv5 before adding this PR.
2023-03-21 05:14:57 +00:00
ethDreamer
65a5eb8292 Reconstruct Payloads using Payload Bodies Methods (#4028)
## Issue Addressed

* #3895 

Co-authored-by: ethDreamer <37123614+ethDreamer@users.noreply.github.com>
Co-authored-by: Michael Sproul <michael@sigmaprime.io>
2023-03-19 23:15:59 +00:00
Diva M
78414333a2
Merge branch 'eip4844' into deneb-free-blobs 2023-03-17 16:39:17 -05:00
Diva M
607242c127
Merge branch 'unstable' into eip4844 2023-03-17 16:26:51 -05:00
Age Manning
3d99ce25f8 Correct a race condition when dialing peers (#4056)
There is a race condition which occurs when multiple discovery queries return at almost the exact same time and they independently contain a useful peer we would like to connect to.

The condition can occur that we can add the same peer to the dial queue, before we get a chance to process the queue. 
This ends up displaying an error to the user: 
```
ERRO Dialing an already dialing peer
```
Although this error is harmless it's not ideal. 

There are two solutions to resolving this:
1. As we decide to dial the peer, we change the state in the peer-db to dialing (before we add it to the queue) which would prevent other requests from adding to the queue. 
2. We prevent duplicates in the dial queue

This PR has opted for 2. because 1. will complicate the code in that we are changing states in non-intuitive places. Although this technically adds a very slight performance cost, its probably a cleaner solution as we can keep the state-changing logic in one place.
2023-03-16 05:44:54 +00:00
Pawan Dhananjay
76f49bdb44
Update kzg interface (#4077)
* Update kzg interface

* Update utils

* Update dependency

* Address review comments
2023-03-14 12:13:15 +05:30
Diva M
203a9e5f5e
Merge branch 'unstable' into eip4844 2023-03-10 11:19:56 -05:00
Paul Hauner
319cc61afe Release v3.5.1 (#4049)
## Issue Addressed

NA

## Proposed Changes

Bumps versions to v3.5.1.

## Additional Info

- [x] Requires further testing
2023-03-07 07:57:39 +00:00
Michael Sproul
4dc0c4c5b7 Update dependencies incl tempfile (#4048)
## Proposed Changes

Fix the cargo audit failure caused by [RUSTSEC-2023-0018](https://rustsec.org/advisories/RUSTSEC-2023-0018) which we were exposed to via `tempfile`.

## Additional Info

I've held back the libp2p crate for now because it seemed to introduce another duplicate dependency on libp2p-core, for a total of 3 copies. Maybe that's fine, but we can sort it out later.
2023-03-05 23:43:32 +00:00
Diva M
d93753cc88
Merge branch 'unstable' into off-4844 2023-03-02 15:38:00 -05:00
Divma
047c7544e3 Clean capella (#4019)
## Issue Addressed

Cleans up all the remnants of 4844 in capella. This makes sure when 4844 is reviewed there is nothing we are missing because it got included here 

## Proposed Changes

drop a bomb on every 4844 thing 

## Additional Info

Merge process I did (locally) is as follows:
- squash merge to produce one commit
- in new branch off unstable with the squashed commit create a `git revert HEAD` commit
- merge that new branch onto 4844 with `--strategy ours`
- compare local 4844 to remote 4844 and make sure the diff is empty
- enjoy

Co-authored-by: Paul Hauner <paul@paulhauner.com>
2023-03-01 03:19:02 +00:00
Paul Hauner
0fb58a680d v3.5.0 (#3996)
## Issue Addressed

NA

## Proposed Changes

- Bump versions

## Sepolia Capella Upgrade

This release will enable the Capella fork on Sepolia. We are planning to publish this release on the 23rd of Feb 2023.

Users who can build from source and wish to do pre-release testing can use this branch.

## Additional Info

- [ ] Requires further testing
2023-02-22 06:00:49 +00:00
Michael Sproul
fa8b920dd8
Merge branch 'capella' into unstable 2023-02-22 10:25:45 +11:00
Mac L
3642efe76a Cache validator balances and allow them to be served over the HTTP API (#3863)
## Issue Addressed

#3804

## Proposed Changes

- Add `total_balance` to the validator monitor and adjust the number of historical epochs which are cached. 
- Allow certain values in the cache to be served out via the HTTP API without requiring a state read.

## Usage
```
curl -X POST "http://localhost:5052/lighthouse/ui/validator_info" -d '{"indices": [0]}' -H "Content-Type: application/json" | jq
```

```
{
  "data": {
    "validators": {
      "0": {
        "info": [
          {
            "epoch": 172981,
            "total_balance": 36566388519
          },
         ...
          {
            "epoch": 172990,
            "total_balance": 36566496513
          }
        ]
      },
      "1": {
        "info": [
          {
            "epoch": 172981,
            "total_balance": 36355797968
          },
          ...
          {
            "epoch": 172990,
            "total_balance": 36355905962
          }
        ]
      }
    }
  }
}
```

## Additional Info

This requires no historical states to operate which mean it will still function on the freshly checkpoint synced node, however because of this, the values will populate each epoch (up to a maximum of 10 entries).

Another benefit of this method, is that we can easily cache any other values which would normally require a state read and serve them via the same endpoint. However, we would need be cautious about not overly increasing block processing time by caching values from complex computations.

This also caches some of the validator metrics directly, rather than pulling them from the Prometheus metrics when the API is called. This means when the validator count exceeds the individual monitor threshold, the cached values will still be available.

Co-authored-by: Paul Hauner <paul@paulhauner.com>
2023-02-21 20:54:55 +00:00
Michael Sproul
066c27750a
Merge remote-tracking branch 'origin/staging' into capella-update 2023-02-17 12:05:36 +11:00
Michael Sproul
ebf2fec5d0 Fix exec integration tests for Geth v1.11.0 (#3982)
## Proposed Changes

* Bump Go from 1.17 to 1.20. The latest Geth release v1.11.0 requires 1.18 minimum.
* Prevent a cache miss during payload building by using the right fee recipient. This prevents Geth v1.11.0 from building a block with 0 transactions. The payload building mechanism is overhauled in the new Geth to improve the payload every 2s, and the tests were failing because we were falling back on a `getPayload` call with no lookahead due to `get_payload_id` cache miss caused by the mismatched fee recipient. Alternatively we could hack the tests to send `proposer_preparation_data`, but I think the static fee recipient is simpler for now.
* Add support for optionally enabling Lighthouse logs in the integration tests. Enable using `cargo run --release --features logging/test_logger`. This was very useful for debugging.
2023-02-16 23:34:33 +00:00
realbigsean
4d0b0f681d
merge self limiter 2023-02-15 14:25:58 -05:00
Michael Sproul
2fcfdf1a01 Fix docker and deps (#3978)
## Proposed Changes

- Fix this cargo-audit failure for `sqlite3-sys`: https://github.com/sigp/lighthouse/actions/runs/4179008889/jobs/7238473962
- Prevent the Docker builds from running out of RAM on CI by removing `gnosis` and LMDB support from the `-dev` images (see: https://github.com/sigp/lighthouse/pull/3959#issuecomment-1430531155, successful run on my fork: https://github.com/michaelsproul/lighthouse/actions/runs/4179162480/jobs/7239537947).
2023-02-15 11:51:46 +00:00
Age Manning
8dd9249177 Enforce a timeout on peer disconnect (#3757)
On heavily crowded networks, we are seeing many attempted connections to our node every second. 

Often these connections come from peers that have just been disconnected. This can be for a number of reasons including: 
- We have deemed them to be not as useful as other peers
- They have performed poorly
- They have dropped the connection with us
- The connection was spontaneously lost
- They were randomly removed because we have too many peers

In all of these cases, if we have reached or exceeded our target peer limit, there is no desire to accept new connections immediately after the disconnect from these peers. In fact, it often costs us resources to handle the established connections and defeats some of the logic of dropping them in the first place. 

This PR adds a timeout, that prevents recently disconnected peers from reconnecting to us.

Technically we implement a ban at the swarm layer to prevent immediate re connections for at least 10 minutes. I decided to keep this light, and use a time-based LRUCache which only gets updated during the peer manager heartbeat to prevent added stress of polling a delay map for what could be a large number of peers.

This cache is bounded in time. An extra space bound could be added should people consider this a risk.

Co-authored-by: Diva M <divma@protonmail.com>
2023-02-14 03:25:42 +00:00
Michael Sproul
18c8cab4da
Merge remote-tracking branch 'origin/unstable' into capella-merge 2023-02-14 12:07:27 +11:00
Paul Hauner
84843d67d7 Reduce some EE and builder related ERRO logs to WARN (#3966)
## Issue Addressed

NA

## Proposed Changes

Our `ERRO` stream has been rather noisy since the merge due to some unexpected behaviours of builders and EEs. Now that we've been running post-merge for a while, I think we can drop some of these `ERRO` to `WARN` so we're not "crying wolf".

The modified logs are:

#### `ERRO Execution engine call failed`

I'm seeing this quite frequently on Geth nodes. They seem to timeout when they're busy and it rarely indicates a serious issue. We also have logging across block import, fork choice updating and payload production that raise `ERRO` or `CRIT` when the EE times out, so I think we're not at risk of silencing actual issues.

#### `ERRO "Builder failed to reveal payload"`

In #3775 we reduced this log from `CRIT` to `ERRO` since it's common for builders to fail to reveal the block to the producer directly whilst still broadcasting it to the networ. I think it's worth dropping this to `WARN` since it's rarely interesting.

I elected to stay with `WARN` since I really do wish builders would fulfill their API promises by returning the block to us. Perhaps I'm just being pedantic here, I could be convinced otherwise.

#### `ERRO "Relay error when registering validator(s)"`

It seems like builders and/or mev-boost struggle to handle heavy loads of validator registrations. I haven't observed issues with validators not actually being registered, but I see timeouts on these endpoints many times a day. It doesn't seem like this `ERRO` is worth it.

#### `ERRO Error fetching block for peer     ExecutionLayerErrorPayloadReconstruction`

This means we failed to respond to a peer on the P2P network with a block they requested because of an error in the `execution_layer`. It's very common to see timeouts or incomplete responses on this endpoint whilst the EE is busy and I don't think it's important enough for an `ERRO`. As long as the peer count stays high, I don't think the user needs to be actively concerned about how we're responding to peers.

## Additional Info

NA
2023-02-12 23:14:08 +00:00
ethDreamer
e743d75c9b
Update Mock Builder for Post-Capella Tests (#3958)
* Update Mock Builder for Post-Capella Tests

* Add _mut Suffix to BidStuff Functions

* Fix Setting Gas Limit
2023-02-10 13:30:14 -06:00
Michael Sproul
1cae98856c Update dependencies (#3946)
## Issue Addressed

Resolves the cargo-audit failure caused by https://rustsec.org/advisories/RUSTSEC-2023-0010.

I also removed the ignore for `RUSTSEC-2020-0159` as we are no longer using a vulnerable version of `chrono`. We still need the other ignore for `time 0.1` because we depend on it via `sloggers -> chrono -> time 0.1`.
2023-02-08 02:18:54 +00:00
Diva M
493784366f
self rate limiting 2023-02-07 13:00:35 -05:00
realbigsean
26a296246d
Merge branch 'capella' of https://github.com/sigp/lighthouse into eip4844
# Conflicts:
#	beacon_node/beacon_chain/src/beacon_chain.rs
#	beacon_node/beacon_chain/src/block_verification.rs
#	beacon_node/beacon_chain/src/test_utils.rs
#	beacon_node/execution_layer/src/engine_api.rs
#	beacon_node/execution_layer/src/engine_api/http.rs
#	beacon_node/execution_layer/src/lib.rs
#	beacon_node/execution_layer/src/test_utils/handle_rpc.rs
#	beacon_node/http_api/src/lib.rs
#	beacon_node/http_api/tests/fork_tests.rs
#	beacon_node/network/src/beacon_processor/mod.rs
#	beacon_node/network/src/beacon_processor/work_reprocessing_queue.rs
#	beacon_node/network/src/beacon_processor/worker/sync_methods.rs
#	beacon_node/operation_pool/src/bls_to_execution_changes.rs
#	beacon_node/operation_pool/src/lib.rs
#	beacon_node/operation_pool/src/persistence.rs
#	consensus/serde_utils/src/u256_hex_be_opt.rs
#	testing/antithesis/Dockerfile.libvoidstar
2023-02-07 12:12:56 -05:00
Paul Hauner
e062a7cf76
Broadcast address changes at Capella (#3919)
* Add first efforts at broadcast

* Tidy

* Move broadcast code to client

* Progress with broadcast impl

* Rename to address change

* Fix compile errors

* Use `while` loop

* Tidy

* Flip broadcast condition

* Switch to forgetting individual indices

* Always broadcast when the node starts

* Refactor into two functions

* Add testing

* Add another test

* Tidy, add more testing

* Tidy

* Add test, rename enum

* Rename enum again

* Tidy

* Break loop early

* Add V15 schema migration

* Bump schema version

* Progress with migration

* Update beacon_node/client/src/address_change_broadcast.rs

Co-authored-by: Michael Sproul <micsproul@gmail.com>

* Fix typo in function name

---------

Co-authored-by: Michael Sproul <micsproul@gmail.com>
2023-02-07 17:13:49 +11:00
Michael Sproul
c56706efae Unpin fixed-hash (#3917)
## Proposed Changes
Remove the `[patch]` for `fixed-hash`.

We pinned it years ago in #2710 to fix `arbitrary` support. Nowadays the 0.7 version of `fixed-hash` is only used by the `web3` crate and doesn't need `arbitrary`.

~~Blocked on #3916 but could be merged in the same Bors batch.~~
2023-02-06 04:18:03 +00:00
Michael Sproul
494a270279 Fix the new BLS to execution change test 2023-01-25 14:23:35 +01:00
Michael Sproul
c2f64f8216 Switch allocator to jemalloc (#3697)
## Proposed Changes

Another `tree-states` motivated PR, this adds `jemalloc` as the default allocator, with an option to use the system allocator by compiling with `FEATURES="" make`.

- [x] Metrics
- [x] Test on Windows
- [x] Test on macOS
- [x] Test with `musl`
- [x] Metrics dashboard on `lighthouse-metrics` (https://github.com/sigp/lighthouse-metrics/pull/37)


Co-authored-by: Michael Sproul <micsproul@gmail.com>
2023-01-25 14:21:54 +01:00
Michael Sproul
e48487db01
Fix the new BLS to execution change test 2023-01-25 15:47:07 +11:00
Michael Sproul
bb0e99c097 Merge remote-tracking branch 'origin/unstable' into capella 2023-01-21 10:37:26 +11:00
Michael Sproul
4deab888c9 Switch allocator to jemalloc (#3697)
## Proposed Changes

Another `tree-states` motivated PR, this adds `jemalloc` as the default allocator, with an option to use the system allocator by compiling with `FEATURES="" make`.

- [x] Metrics
- [x] Test on Windows
- [x] Test on macOS
- [x] Test with `musl`
- [x] Metrics dashboard on `lighthouse-metrics` (https://github.com/sigp/lighthouse-metrics/pull/37)


Co-authored-by: Michael Sproul <micsproul@gmail.com>
2023-01-20 04:19:29 +00:00
realbigsean
acbbdd8b9e
merge eip4844 2023-01-19 08:46:38 -05:00
Pawan Dhananjay
fd08a2cb0a
Fix trusted setup in lcli::new_testnet 2023-01-19 18:16:04 +05:30
Pawan Dhananjay
f04486dc71
Update kzg library to use bytes only interface 2023-01-17 12:12:17 +05:30
realbigsean
d96d793bfb
fix compilation issues 2023-01-12 14:17:14 -05:00
Michael Sproul
2af8110529
Merge remote-tracking branch 'origin/unstable' into capella
Fixing the conflicts involved patching up some of the `block_hash` verification,
the rest will be done as part of https://github.com/sigp/lighthouse/issues/3870
2023-01-12 16:22:00 +11:00
Michael Sproul
56e6b3557a
Fix Arbitrary implementations (#3867)
* Fix Arbitrary implementations

* Remove remaining vestiges of arbitrary-fuzz

* Remove FIXME

* Clippy
2023-01-12 15:17:03 +11:00
Paul Hauner
38514c07f2 Release v3.4.0 (#3862)
## Issue Addressed

NA

## Proposed Changes

Bump versions

## Additional Info

- [x] ~~Blocked on #3728, #3801~~
- [x] ~~Blocked on #3866~~
- [x] Requires additional testing
2023-01-11 03:27:08 +00:00
Michael Sproul
0c74cd4696 Update dependencies incl Tokio (#3866)
## Proposed Changes

Update all dependencies to new semver-compatible releases with `cargo update`. Importantly this patches a Tokio vuln: https://rustsec.org/advisories/RUSTSEC-2023-0001. I don't think we were affected by the vuln because it only applies to named pipes on Windows, but it's still good hygiene to patch.
2023-01-09 23:29:23 +00:00
Pawan Dhananjay
ba410c3012
Embed trusted setup in network config (#3851)
* Load trusted setup in network config

* Fix trusted setup serialize and deserialize

* Load trusted setup from hardcoded preset instead of a file

* Truncate after deserialising trusted setup

* Fix beacon node script

* Remove hardcoded setup file

* Add length checks
2023-01-09 12:34:16 +05:30
Michael Sproul
4bd2b777ec Verify execution block hashes during finalized sync (#3794)
## Issue Addressed

Recent discussions with other client devs about optimistic sync have revealed a conceptual issue with the optimisation implemented in #3738. In designing that feature I failed to consider that the execution node checks the `blockHash` of the execution payload before responding with `SYNCING`, and that omitting this check entirely results in a degradation of the full node's validation. A node omitting the `blockHash` checks could be tricked by a supermajority of validators into following an invalid chain, something which is ordinarily impossible.

## Proposed Changes

I've added verification of the `payload.block_hash` in Lighthouse. In case of failure we log a warning and fall back to verifying the payload with the execution client.

I've used our existing dependency on `ethers_core` for RLP support, and a new dependency on Parity's `triehash` crate for the Merkle patricia trie. Although the `triehash` crate is currently unmaintained it seems like our best option at the moment (it is also used by Reth, and requires vastly less boilerplate than Parity's generic `trie-root` library).

Block hash verification is pretty quick, about 500us per block on my machine (mainnet).

The optimistic finalized sync feature can be disabled using `--disable-optimistic-finalized-sync` which forces full verification with the EL.

## Additional Info

This PR also introduces a new dependency on our [`metastruct`](https://github.com/sigp/metastruct) library, which was perfectly suited to the RLP serialization method. There will likely be changes as `metastruct` grows, but I think this is a good way to start dogfooding it.

I took inspiration from some Parity and Reth code while writing this, and have preserved the relevant license headers on the files containing code that was copied and modified.
2023-01-09 03:11:59 +00:00
Age Manning
1d9a2022b4 Upgrade to libp2p v0.50.0 (#3764)
I've needed to do this work in order to do some episub testing. 

This version of libp2p has not yet been released, so this is left as a draft for when we wish to update.

Co-authored-by: Diva M <divma@protonmail.com>
2023-01-06 15:59:33 +00:00
Emilia Hane
e9e9fc865b
Cargo clean 2023-01-05 16:28:59 +01:00