Commit Graph

774 Commits

Author SHA1 Message Date
Michael Sproul
4c2d4af6cd Make more noise when the EL is broken (#3986)
## Issue Addressed

Closes #3814, replaces #3818.

## Proposed Changes

* Add a WARN log for the case where we are attempting to sync chain segments but can't process them because they're building on an invalid parent. The most common case where we see this is when the execution node database is corrupt, causing sync to stall mysteriously (because we're currently logging the failure only at debug level).
* Additionally I've bumped up the logging for invalid execution payloads to `WARN`. This may result in some duplicate logs as we log errors from the `beacon_chain` and then again from the beacon processor. Invalid payloads and corrupt DBs _should_ be rare enough that this doesn't produce overwhelming log volume.
2023-03-17 00:44:02 +00:00
Divma
3c18e1a3a4
thread blocks and blobs to sync (#4100)
* thread blocks and blobs to sync

* satisfy dead code analysis
2023-03-16 19:20:39 -05:00
Age Manning
3d99ce25f8 Correct a race condition when dialing peers (#4056)
There is a race condition which occurs when multiple discovery queries return at almost the exact same time and they independently contain a useful peer we would like to connect to.

As a result, we can add the same peer to the dial queue more than once before we get a chance to process the queue.
This ends up displaying an error to the user: 
```
ERRO Dialing an already dialing peer
```
Although this error is harmless it's not ideal. 

There are two solutions to resolving this:
1. As we decide to dial the peer, we change the state in the peer-db to dialing (before we add it to the queue) which would prevent other requests from adding to the queue. 
2. We prevent duplicates in the dial queue

This PR has opted for 2. because 1. will complicate the code in that we are changing states in non-intuitive places. Although this technically adds a very slight performance cost, it's probably a cleaner solution as we can keep the state-changing logic in one place.
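
A minimal sketch of option 2, assuming a plain integer stands in for the peer identifier. The `DialQueue` type and its methods are hypothetical names for illustration and are not taken from the Lighthouse code.

```rust
use std::collections::{HashSet, VecDeque};

/// Hypothetical stand-in for a libp2p peer identifier.
type PeerId = u64;

/// A FIFO dial queue that refuses to hold the same peer twice.
#[derive(Default)]
pub struct DialQueue {
    /// Peers in the order they should be dialed.
    order: VecDeque<PeerId>,
    /// Mirror of `order` used for O(1) duplicate checks.
    queued: HashSet<PeerId>,
}

impl DialQueue {
    pub fn new() -> Self {
        Self::default()
    }

    /// Queues `peer` for dialing. Returns `false` if it was already queued.
    pub fn push(&mut self, peer: PeerId) -> bool {
        // `HashSet::insert` returns false when the value is already present.
        if !self.queued.insert(peer) {
            return false;
        }
        self.order.push_back(peer);
        true
    }

    /// Takes the next peer to dial. The peer may be queued again afterwards.
    pub fn pop(&mut self) -> Option<PeerId> {
        let peer = self.order.pop_front()?;
        self.queued.remove(&peer);
        Some(peer)
    }

    /// Number of peers currently waiting to be dialed.
    pub fn len(&self) -> usize {
        self.order.len()
    }
}
```

The extra `HashSet` is the "very slight performance cost" of this approach: two discovery queries that return the same peer result in a single queue entry, and the peer-db state is not touched.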
2023-03-16 05:44:54 +00:00
realbigsean
b303d2fb7e
lints 2023-03-15 15:32:22 -04:00
Diva M
4a39e43f96
Merge branch 'eip4844' into deneb-free-blobs 2023-03-15 12:26:30 -05:00
Divma
2c9477de43
Fix block and blob coupling in the network context (#4086)
* update docs

* introduce a temp enum to model an adjusted `BlockWrapper` and fix blob coupling

* fix compilation issue

* fix blob coupling in the network context

* review comments
2023-03-15 11:04:45 -05:00
Jimmy Chen
2ef3ebbef3
Update SignedBlobSidecar container (#4078) 2023-03-15 11:03:56 -05:00
Daniel Ramirez Chiquillo
1ec3041673 Remove Router/Processor Code (#4002)
## Issue Addressed

#3938 

## Proposed Changes

- `network::Processor` is deleted and all its logic is moved to `network::Router`.
- The `network::Router` module is moved to a single file.
- The following functions are deleted: `on_disconnect` `send_status` `on_status_response` `on_blocks_by_root_request` `on_lightclient_bootstrap` `on_blocks_by_range_request` `on_block_gossip` `on_unaggregated_attestation_gossip` `on_aggregated_attestation_gossip` `on_voluntary_exit_gossip` `on_proposer_slashing_gossip` `on_attester_slashing_gossip` `on_sync_committee_signature_gossip` `on_sync_committee_contribution_gossip` `on_light_client_finality_update_gossip` `on_light_client_optimistic_update_gossip`. These deletions are possible because the updated `Router` allows the underlying methods to be called directly.
2023-03-15 01:27:47 +00:00
Diva M
7f2e9b80bb
Merge branch 'unstable' into eip4844 2023-03-14 12:00:32 -05:00
Divma
e190ebb8a0 Support for Ipv6 (#4046)
## Issue Addressed
Add support for ipv6 and dual stack in lighthouse. 

## Proposed Changes
From a user perspective, setting an ipv6 address and optionally configuring the ports should now feel exactly the same as using an ipv4 address. If listening over both ipv4 and ipv6, the user needs to:
- use the `--listen-address` flag twice (once with an ipv4 address, once with an ipv6 address)
- set `--port6`, which then becomes required
- `--discovery-port6` can now be used to additionally configure the ipv6 udp port

### Rough list of code changes
- Discovery:
  - Table filter and ip mode set to match the listening config. 
  - Ipv6 address, tcp port and udp port set in the ENR builder
  - Reported addresses now check which tcp port to give to libp2p
- LH Network Service:
  - Can listen over Ipv6, Ipv4, or both. This uses two sockets. Using mapped addresses is disabled from libp2p and it's the most compatible option.
- NetworkGlobals:
  - No longer stores the udp port since it was not used at all. Instead, stores the Ipv4 and Ipv6 TCP ports.
- NetworkConfig:
  - Update names to make it clear that previous udp and tcp ports in ENR were Ipv4
  - Add fields to configure Ipv6 udp and tcp ports in the ENR
  - Include advertised enr Ipv6 address.
  - Add type to model Listening address that's either Ipv4, Ipv6 or both. A listening address includes the ip, udp port and tcp port.
- UPnP:
  - Kept only for ipv4
- Cli flags:
  - `--listen-addresses` now can take up to two values
  - `--port` will apply to ipv4 or ipv6 if only one listening address is given. If two listening addresses are given it will apply only to Ipv4.
  - `--port6` New flag required when listening over ipv4 and ipv6 that applies exclusively to Ipv6.
  - `--discovery-port` will now apply to ipv4 and ipv6 if only one listening address is given.
  - `--discovery-port6` New flag to configure the individual udp port of ipv6 if listening over both ipv4 and ipv6.
  - `--enr-udp-port` Updated docs to specify that it only applies to ipv4. This is an old behaviour.
  - `--enr-udp6-port` Added to configure the enr udp6 field.
  - `--enr-tcp-port` Updated docs to specify that it only applies to ipv4. This is an old behaviour.
  - `--enr-tcp6-port` Added to configure the enr tcp6 field.
  - `--enr-addresses` now can take two values.
  - `--enr-match` updated behaviour.
- Common:
  - rename `unused_port` functions to specify that they are over ipv4.
  - add functions to get unused ports over ipv6.
- Testing binaries
  - Updated code to reflect network config changes and unused_port changes.
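
The port rules listed under "Cli flags" can be sketched as follows. This is an illustration under stated assumptions, not the Lighthouse implementation: `ListenAddress` and `resolve_listen_address` are hypothetical names, only TCP ports are modelled, and the error for "no address at all" is an assumption made to keep the function total.

```rust
use std::net::{Ipv4Addr, Ipv6Addr};

/// Hypothetical model of a listening address: Ipv4, Ipv6, or both.
#[derive(Debug, PartialEq)]
pub enum ListenAddress {
    V4 { addr: Ipv4Addr, tcp_port: u16 },
    V6 { addr: Ipv6Addr, tcp_port: u16 },
    DualStack {
        v4: Ipv4Addr,
        tcp4_port: u16,
        v6: Ipv6Addr,
        tcp6_port: u16,
    },
}

/// Applies the `--port` / `--port6` rules described above.
pub fn resolve_listen_address(
    v4: Option<Ipv4Addr>,
    v6: Option<Ipv6Addr>,
    port: u16,
    port6: Option<u16>,
) -> Result<ListenAddress, String> {
    match (v4, v6) {
        // With a single listening address, `--port` applies to it,
        // whichever ip version it is.
        (Some(addr), None) => Ok(ListenAddress::V4 { addr, tcp_port: port }),
        (None, Some(addr)) => Ok(ListenAddress::V6 { addr, tcp_port: port }),
        // With two addresses, `--port` applies only to ipv4 and
        // `--port6` becomes required for ipv6.
        (Some(v4), Some(v6)) => {
            let tcp6_port = port6.ok_or_else(|| {
                "--port6 is required when listening over both ipv4 and ipv6".to_string()
            })?;
            Ok(ListenAddress::DualStack { v4, tcp4_port: port, v6, tcp6_port })
        }
        // Assumption for this sketch only: no address is an error.
        (None, None) => Err("at least one listening address is required".to_string()),
    }
}
```

Encoding the three cases as an enum means a dual-stack configuration cannot exist without an ipv6 port, which mirrors the requirement on `--port6`.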

## Additional Info

TODOs:
- use two sockets in discovery. I'll get back to this and it's on https://github.com/sigp/discv5/pull/160
- lcli allow listening over two sockets in generate_bootnodes_enr
- add at least one smoke flag for ipv6 (I have tested this and it works for me)
- update the book
2023-03-14 01:13:34 +00:00
Diva M
ae3e5f73d6
fmt 2023-03-10 11:24:22 -05:00
Divma
140bdd370d
update code paths in the network crate (#4065)
* wip

* fix router

* arc the byroot responses we send

* add placeholder for blob verification

* respond to blobs by range and blobs by root request in the most horrible and gross way ever

* everything in sync is now unimplemented

* fix compilation issues

* http_api change is very small, just add it

* remove ctrl-c ctrl-v's docs
2023-03-10 16:52:31 +05:30
Divma
545532a883
fix rpc types to free the blobs (#4059)
* rename to follow name in spec

* use roots and indexes

* wip

* fix req/resp types

* move blob identifier to consensus types
2023-03-07 16:28:45 -05:00
Diva M
bf40acd9df
adjust constant to spec values and names 2023-03-06 17:32:40 -05:00
Diva M
f16e82ab2c
Merge branch 'unstable' into eip4844 2023-03-03 14:14:18 -05:00
Diva M
d93753cc88
Merge branch 'unstable' into off-4844 2023-03-02 15:38:00 -05:00
Pawan Dhananjay
5b18fd92cb Cleaner logic for gossip subscriptions for new forks (#4030)
## Issue Addressed

Cleaner resolution for #4006 

## Proposed Changes

We are currently subscribing to core topics of new forks way before the actual fork since we had just a single `CORE_TOPICS` array. This PR separates the core topics for every fork and subscribes to only required topics based on the current fork.
Also adds logic for subscribing to the core topics of a new fork only 2 slots before the fork happens.

2 slots is to give enough time for the gossip meshes to form. 

Currently doesn't add logic to remove topics from older forks in new forks. For example, in the coupled 4844 world, we had to remove the `BeaconBlock` topic in favour of `BeaconBlocksAndBlobsSidecar` at the 4844 fork. It should be easy enough to add though. Not adding it because I'm assuming that #4019 will get merged before this PR and we won't require any deletion logic. Happy to add it regardless though.
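
A rough sketch of the subscription rule described in this commit, assuming each fork is represented by its activation slot and a list of topic names. `ForkTopics`, `should_subscribe` and `topics_to_subscribe` are hypothetical names; only the 2-slot lead time comes from the text above.

```rust
/// Number of slots before a fork at which its core topics are subscribed.
/// The value 2 is taken from the commit message; the name is hypothetical.
pub const SUBSCRIBE_SLOTS_BEFORE_FORK: u64 = 2;

/// Hypothetical per-fork list of core gossip topics.
pub struct ForkTopics {
    pub activation_slot: u64,
    pub topics: &'static [&'static str],
}

/// True once `current_slot` is within the lead time of the fork, or past it.
pub fn should_subscribe(current_slot: u64, fork_slot: u64) -> bool {
    current_slot.saturating_add(SUBSCRIBE_SLOTS_BEFORE_FORK) >= fork_slot
}

/// Collects the topics of every fork that is active or about to activate.
/// Topics of older forks are kept, matching the note that no removal
/// logic is included.
pub fn topics_to_subscribe(schedule: &[ForkTopics], current_slot: u64) -> Vec<&'static str> {
    schedule
        .iter()
        .filter(|fork| should_subscribe(current_slot, fork.activation_slot))
        .flat_map(|fork| fork.topics.iter().copied())
        .collect()
}
```

With a fork at slot 100, its topics are absent at slot 97 and present from slot 98 onwards, which leaves two slots for the gossip meshes to form.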
2023-03-01 09:22:48 +00:00
Divma
047c7544e3 Clean capella (#4019)
## Issue Addressed

Cleans up all the remnants of 4844 in capella. This makes sure when 4844 is reviewed there is nothing we are missing because it got included here 

## Proposed Changes

drop a bomb on every 4844 thing 

## Additional Info

Merge process I did (locally) is as follows:
- squash merge to produce one commit
- in new branch off unstable with the squashed commit create a `git revert HEAD` commit
- merge that new branch onto 4844 with `--strategy ours`
- compare local 4844 to remote 4844 and make sure the diff is empty
- enjoy

Co-authored-by: Paul Hauner <paul@paulhauner.com>
2023-03-01 03:19:02 +00:00
Paul Hauner
9c81be8ac4
Fix metric (#4020) 2023-02-22 09:46:45 +11:00
Michael Sproul
066c27750a
Merge remote-tracking branch 'origin/staging' into capella-update 2023-02-17 12:05:36 +11:00
Divma
ffeb8b6e05 blacklist tests in windows (#3961)
## Issue Addressed
Windows tests for subscriptions and unsubscriptions fail in CI sporadically. We usually ignore these failures, so this PR aims to help reduce the failure noise. The associated issue is https://github.com/sigp/lighthouse/issues/3960
2023-02-16 23:34:30 +00:00
realbigsean
b805fa6279
merge with upstream 2023-02-15 14:20:12 -05:00
Emilia Hane
2672cf40bb
Better fix for debug tests 2023-02-15 11:47:56 +01:00
Emilia Hane
13efd47238
fixup! Disable use of system time in tests 2023-02-15 09:20:30 +01:00
Emilia Hane
9e4abc79fb
Comment out tests that use system time 2023-02-14 14:12:50 +01:00
Emilia Hane
73c7ad73b8
Disable use of system time in tests 2023-02-14 13:33:38 +01:00
Michael Sproul
18c8cab4da
Merge remote-tracking branch 'origin/unstable' into capella-merge 2023-02-14 12:07:27 +11:00
realbigsean
d2ecbd942e
fix a couple new lints 2023-02-13 17:13:47 -05:00
realbigsean
cd8757de1c
Revert "make batch size check compile time panic"
This reverts commit 68f2484efc.
2023-02-13 16:51:55 -05:00
realbigsean
68f2484efc
make batch size check compile time panic 2023-02-13 16:51:46 -05:00
realbigsean
4c3561dcaf
make batch size check compile time panic 2023-02-13 16:50:33 -05:00
realbigsean
fc2d07b4e3
allow unused 2023-02-13 16:36:38 -05:00
realbigsean
28702c9d5d
merge upstream, add back get_blobs logic 2023-02-13 16:29:21 -05:00
Paul Hauner
84843d67d7 Reduce some EE and builder related ERRO logs to WARN (#3966)
## Issue Addressed

NA

## Proposed Changes

Our `ERRO` stream has been rather noisy since the merge due to some unexpected behaviours of builders and EEs. Now that we've been running post-merge for a while, I think we can drop some of these `ERRO` to `WARN` so we're not "crying wolf".

The modified logs are:

#### `ERRO Execution engine call failed`

I'm seeing this quite frequently on Geth nodes. They seem to time out when they're busy and it rarely indicates a serious issue. We also have logging across block import, fork choice updating and payload production that raise `ERRO` or `CRIT` when the EE times out, so I think we're not at risk of silencing actual issues.

#### `ERRO "Builder failed to reveal payload"`

In #3775 we reduced this log from `CRIT` to `ERRO` since it's common for builders to fail to reveal the block to the producer directly whilst still broadcasting it to the network. I think it's worth dropping this to `WARN` since it's rarely interesting.

I elected to stay with `WARN` since I really do wish builders would fulfill their API promises by returning the block to us. Perhaps I'm just being pedantic here, I could be convinced otherwise.

#### `ERRO "Relay error when registering validator(s)"`

It seems like builders and/or mev-boost struggle to handle heavy loads of validator registrations. I haven't observed issues with validators not actually being registered, but I see timeouts on these endpoints many times a day. It doesn't seem like this `ERRO` is worth it.

#### `ERRO Error fetching block for peer     ExecutionLayerErrorPayloadReconstruction`

This means we failed to respond to a peer on the P2P network with a block they requested because of an error in the `execution_layer`. It's very common to see timeouts or incomplete responses on this endpoint whilst the EE is busy and I don't think it's important enough for an `ERRO`. As long as the peer count stays high, I don't think the user needs to be actively concerned about how we're responding to peers.

## Additional Info

NA
2023-02-12 23:14:08 +00:00
Emilia Hane
4d3ff347a3
Fixes after rebasing eip4844 2023-02-10 15:34:58 +01:00
Emilia Hane
5437dcae9c
Fix conflicts rebasing eip4844 2023-02-10 15:34:58 +01:00
Emilia Hane
7545ae9e9b
fixup! Fix block lookup debug tests 2023-02-10 15:34:46 +01:00
Emilia Hane
6beca6defc
Fix range sync tests 2023-02-10 09:41:24 +01:00
Emilia Hane
e9e198a2b6
Fix conflicts rebasing eip4844 2023-02-10 09:41:23 +01:00
Emilia Hane
d292a3a6a8
Fix conflicts rebasing eip4844 2023-02-10 09:41:23 +01:00
Emilia Hane
09370e70d9
Fix rebase conflicts 2023-02-10 09:41:19 +01:00
Emilia Hane
8365d76277
fixup! Debug tests 2023-02-10 09:39:22 +01:00
Emilia Hane
16cb9cfca2
fixup! Debug tests 2023-02-10 09:39:22 +01:00
Emilia Hane
7220f35ff6
Debug tests 2023-02-10 09:39:21 +01:00
Emilia Hane
995b2715f2
Fix network block_lookups test 2023-02-10 09:39:21 +01:00
Emilia Hane
3676ce78b5
Fix rebase conflicts 2023-02-10 09:39:21 +01:00
Emilia Hane
56c84178f2
Fix conflicts rebasing eip4844 2023-02-08 11:44:44 +01:00
realbigsean
a42d07592c
fix compilation issues after merge 2023-02-07 12:33:29 -05:00
realbigsean
26a296246d
Merge branch 'capella' of https://github.com/sigp/lighthouse into eip4844
# Conflicts:
#	beacon_node/beacon_chain/src/beacon_chain.rs
#	beacon_node/beacon_chain/src/block_verification.rs
#	beacon_node/beacon_chain/src/test_utils.rs
#	beacon_node/execution_layer/src/engine_api.rs
#	beacon_node/execution_layer/src/engine_api/http.rs
#	beacon_node/execution_layer/src/lib.rs
#	beacon_node/execution_layer/src/test_utils/handle_rpc.rs
#	beacon_node/http_api/src/lib.rs
#	beacon_node/http_api/tests/fork_tests.rs
#	beacon_node/network/src/beacon_processor/mod.rs
#	beacon_node/network/src/beacon_processor/work_reprocessing_queue.rs
#	beacon_node/network/src/beacon_processor/worker/sync_methods.rs
#	beacon_node/operation_pool/src/bls_to_execution_changes.rs
#	beacon_node/operation_pool/src/lib.rs
#	beacon_node/operation_pool/src/persistence.rs
#	consensus/serde_utils/src/u256_hex_be_opt.rs
#	testing/antithesis/Dockerfile.libvoidstar
2023-02-07 12:12:56 -05:00
Paul Hauner
e062a7cf76
Broadcast address changes at Capella (#3919)
* Add first efforts at broadcast

* Tidy

* Move broadcast code to client

* Progress with broadcast impl

* Rename to address change

* Fix compile errors

* Use `while` loop

* Tidy

* Flip broadcast condition

* Switch to forgetting individual indices

* Always broadcast when the node starts

* Refactor into two functions

* Add testing

* Add another test

* Tidy, add more testing

* Tidy

* Add test, rename enum

* Rename enum again

* Tidy

* Break loop early

* Add V15 schema migration

* Bump schema version

* Progress with migration

* Update beacon_node/client/src/address_change_broadcast.rs

Co-authored-by: Michael Sproul <micsproul@gmail.com>

* Fix typo in function name

---------

Co-authored-by: Michael Sproul <micsproul@gmail.com>
2023-02-07 17:13:49 +11:00
realbigsean
37e7c1d5c7 keep verification of payloads pre 4844 2023-01-27 17:59:40 +01:00
realbigsean
7c8d97c06e remove unused import 2023-01-25 14:26:01 +01:00
GeemoCandama
f857811e5f light client optimistic update reprocessing (#3799)
Currently there is a race between receiving blocks and receiving light client optimistic updates (in unstable), which results in processing errors. This is a continuation of PR #3693 and seeks to progress on issue #3651

Add the parent_root to ReprocessQueueMessage::BlockImported so we can remove blocks from queue when a block arrives that has the same parent root. We use the parent root as opposed to the block_root because the LightClientOptimisticUpdate does not contain the block_root.

If light_client_optimistic_update.attested_header.canonical_root() != head_block.message().parent_root() then we queue the update. Otherwise we process immediately.
michaelsproul came up with this idea.
The code was heavily based off of the attestation reprocessing.
I have not properly tested this to see if it works as intended.
2023-01-25 14:23:33 +01:00
Michael Sproul
a4cfe50ade Import BLS to execution changes before Capella (#3892)
* Import BLS to execution changes before Capella

* Test for BLS to execution change HTTP API

* Pack BLS to execution changes in LIFO order

* Remove unused var

* Clippy
2023-01-25 14:21:54 +01:00
Age Manning
528f7181bc Improve block delay metrics (#3894)
We recently ran a large-block experiment on the testnet and plan to do a further experiment on mainnet.

Although the metrics recovered from lighthouse nodes were quite useful, I think we could do with greater resolution in the block delay metrics and get some specific values for each block (currently these can be lost to large exponential histogram buckets). 

This PR increases the resolution of the block delay histogram buckets, but also introduces a new metric which records the last block delay. Depending on the polling resolution of the metric server, we can lose some block delay information, however it will always give us a specific value and we will not lose exact data based on poor resolution histogram buckets.
2023-01-25 14:21:53 +01:00
realbigsean
5e8d79891b merge conflict resolution 2023-01-25 11:10:44 +01:00
Michael Sproul
c76a1971cc
Merge remote-tracking branch 'origin/unstable' into capella 2023-01-25 14:20:16 +11:00
GeemoCandama
a7351c00c0 light client optimistic update reprocessing (#3799)
## Issue Addressed
Currently there is a race between receiving blocks and receiving light client optimistic updates (in unstable), which results in processing errors. This is a continuation of PR #3693 and seeks to progress on issue #3651

## Proposed Changes

Add the parent_root to ReprocessQueueMessage::BlockImported so we can remove blocks from queue when a block arrives that has the same parent root. We use the parent root as opposed to the block_root because the LightClientOptimisticUpdate does not contain the block_root.

If light_client_optimistic_update.attested_header.canonical_root() != head_block.message().parent_root() then we queue the update. Otherwise we process immediately.
## Additional Info
michaelsproul came up with this idea.
The code was heavily based off of the attestation reprocessing.
I have not properly tested this to see if it works as intended.
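
The queueing rule can be sketched roughly as below. This is an illustration only: `OptimisticUpdate` and `ReprocessQueue` are simplified, hypothetical types that keep just the root comparison described above, and roots are modelled as raw 32-byte arrays.

```rust
use std::collections::HashMap;

/// A 32-byte block root (simplified for this sketch).
type Root = [u8; 32];

/// Hypothetical, stripped-down light client optimistic update.
#[derive(Debug, Clone, PartialEq)]
pub struct OptimisticUpdate {
    /// Stands in for `attested_header.canonical_root()`.
    pub attested_header_root: Root,
}

/// Updates waiting for a block whose parent root matches their attested header.
#[derive(Default)]
pub struct ReprocessQueue {
    awaiting_parent: HashMap<Root, Vec<OptimisticUpdate>>,
}

impl ReprocessQueue {
    /// Returns the update if it can be processed immediately, otherwise
    /// queues it under its attested header root and returns `None`.
    pub fn on_update(
        &mut self,
        update: OptimisticUpdate,
        head_parent_root: Root,
    ) -> Option<OptimisticUpdate> {
        if update.attested_header_root == head_parent_root {
            return Some(update);
        }
        self.awaiting_parent
            .entry(update.attested_header_root)
            .or_default()
            .push(update);
        None
    }

    /// Called when a block is imported. Releases every queued update whose
    /// attested header root equals the imported block's parent root.
    pub fn on_block_imported(&mut self, parent_root: Root) -> Vec<OptimisticUpdate> {
        self.awaiting_parent.remove(&parent_root).unwrap_or_default()
    }
}
```

Keying the map by parent root rather than block root follows the reasoning in the commit message: the update does not carry the block root, so the parent root is the only value both sides share.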
2023-01-24 22:17:50 +00:00
realbigsean
d3240c1ffb fix common issue across blocks by range and blobs by range 2023-01-24 15:42:28 +01:00
realbigsean
18d4faf611 review updates 2023-01-24 15:30:29 +01:00
realbigsean
2225e6ac89 pass in data availability boundary to the get_blobs method 2023-01-24 14:35:07 +01:00
realbigsean
b658cc7aaf simplify checking attester cache for block and blobs. use ResourceUnavailable according to the spec 2023-01-24 10:50:47 +01:00
Emilia Hane
e14550425d
Fix mismatched response bug 2023-01-23 13:23:04 +01:00
Emilia Hane
81a754577d
fixup! Improve error handling 2023-01-21 15:47:33 +01:00
Emilia Hane
f32f08eec0
Fix typo 2023-01-21 14:47:14 +01:00
Emilia Hane
5fc648217d
fixup! Improve error handling 2023-01-21 14:46:24 +01:00
realbigsean
cbd09dc281 finish refactor 2023-01-21 04:48:25 -05:00
Michael Sproul
d8abf2fc41
Import BLS to execution changes before Capella (#3892)
* Import BLS to execution changes before Capella

* Test for BLS to execution change HTTP API

* Pack BLS to execution changes in LIFO order

* Remove unused var

* Clippy
2023-01-21 10:39:59 +11:00
Michael Sproul
bb0e99c097 Merge remote-tracking branch 'origin/unstable' into capella 2023-01-21 10:37:26 +11:00
Emilia Hane
f7eb89ddd9
Improve error handling 2023-01-20 21:16:47 +01:00
realbigsean
c6479444c2
don't send errors when we *correctly* don't have blobs 2023-01-20 21:16:47 +01:00
realbigsean
e1ce4e5b78
make explicit BlobsUnavailable error and handle it directly 2023-01-20 21:16:47 +01:00
realbigsean
f7f64eb007
fix/consolidate some error handling 2023-01-20 21:16:47 +01:00
Emilia Hane
89cb58d17b
Fix typo
Co-authored-by: realbigsean <seananderson33@GMAIL.com>
2023-01-20 21:16:47 +01:00
Emilia Hane
9cc25162e2
Send error message if eip4844 fork disabled
Co-authored-by: realbigsean <seananderson33@GMAIL.com>
2023-01-20 21:16:46 +01:00
Emilia Hane
654e59cbba
Fix rename fn bug
Co-authored-by: realbigsean <seananderson33@GMAIL.com>
2023-01-20 21:16:46 +01:00
Emilia Hane
b4ec4c1ccf
Less strict handling of faulty rpc req params and syntax improvement 2023-01-20 21:16:46 +01:00
Emilia Hane
9445ac70d8
Check data availability boundary in rpc request 2023-01-20 21:16:46 +01:00
realbigsean
3cb8fb7973
block wrapper refactor initial commit 2023-01-20 11:50:16 -05:00
Age Manning
f8a3b3b95a Improve block delay metrics (#3894)
We recently ran a large-block experiment on the testnet and plan to do a further experiment on mainnet.

Although the metrics recovered from lighthouse nodes were quite useful, I think we could do with greater resolution in the block delay metrics and get some specific values for each block (currently these can be lost to large exponential histogram buckets). 

This PR increases the resolution of the block delay histogram buckets, but also introduces a new metric which records the last block delay. Depending on the polling resolution of the metric server, we can lose some block delay information, however it will always give us a specific value and we will not lose exact data based on poor resolution histogram buckets.
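
To illustrate the difference between the two metrics, here is a small self-contained sketch that does not use any metrics library. `BlockDelayMetrics` is a hypothetical type, the bucket bounds are passed in by the caller, and the counts are per-bucket rather than cumulative as a Prometheus histogram would report them.

```rust
use std::time::Duration;

/// Hypothetical pairing of a delay histogram with a "last delay" value.
pub struct BlockDelayMetrics {
    /// Upper bounds of the buckets in seconds, sorted ascending.
    bounds: Vec<f64>,
    /// One count per bound plus a final overflow bucket.
    counts: Vec<u64>,
    /// Exact delay of the most recently observed block.
    last_delay: Option<Duration>,
}

impl BlockDelayMetrics {
    pub fn new(bounds: Vec<f64>) -> Self {
        let counts = vec![0; bounds.len() + 1];
        Self { bounds, counts, last_delay: None }
    }

    /// Records a block delay in both the histogram and the last-value slot.
    pub fn observe(&mut self, delay: Duration) {
        let secs = delay.as_secs_f64();
        // First bucket whose upper bound covers the delay, else overflow.
        let idx = self
            .bounds
            .iter()
            .position(|bound| secs <= *bound)
            .unwrap_or(self.bounds.len());
        self.counts[idx] += 1;
        self.last_delay = Some(delay);
    }

    pub fn bucket_counts(&self) -> &[u64] {
        &self.counts
    }

    pub fn last_delay(&self) -> Option<Duration> {
        self.last_delay
    }
}
```

The histogram only says which bucket a delay fell into, so narrower buckets improve resolution, while the last-value metric keeps one exact figure at the cost of missing blocks that arrive between two scrapes.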
2023-01-20 00:46:56 +00:00
realbigsean
ddcd10b194
merge latest capella changes 2023-01-16 09:17:18 -05:00
realbigsean
1319683736
Update gossip_methods.rs 2023-01-13 14:59:03 -05:00
Mark Mackey
05c1291d8a Don't Penalize Early bls_to_execution_change 2023-01-13 12:53:25 -06:00
realbigsean
06f71e8cce
merge capella 2023-01-12 12:51:09 -05:00
Michael Sproul
2af8110529
Merge remote-tracking branch 'origin/unstable' into capella
Fixing the conflicts involved patching up some of the `block_hash` verification,
the rest will be done as part of https://github.com/sigp/lighthouse/issues/3870
2023-01-12 16:22:00 +11:00
realbigsean
438126f19a
merge upstream, fix compile errors 2023-01-11 13:52:58 -05:00
Paul Hauner
830efdb5c2 Improve validator monitor experience for high validator counts (#3728)
## Issue Addressed

NA

## Proposed Changes

Myself and others (#3678) have observed that when running with lots of validators (e.g., 1000s) the cardinality is too much for Prometheus. I've seen Prometheus instances just grind to a halt when we turn the validator monitor on for our testnet validators (we have 10,000s of Goerli validators). Additionally, the debug log volume can get very high with one log per validator, per attestation.

To address this, the `bn --validator-monitor-individual-tracking-threshold <INTEGER>` flag has been added to *disable* per-validator (i.e., non-aggregated) metrics/logging once the validator monitor exceeds the threshold of validators. The default value is `64`, which is a finger-to-the-wind value. I don't actually know the value at which Prometheus starts to become overwhelmed, but I've seen it work with ~64 validators and I've seen it *not* work with 1000s of validators. A default of `64` seems like it will result in a breaking change to users who are running millions of dollars worth of validators whilst resulting in a no-op for low-validator-count users. I'm open to changing this number, though.

Additionally, this PR starts collecting aggregated Prometheus metrics (e.g., total count of head hits across all validators), so that high-validator-count validators still have some interesting metrics. We already had logging for aggregated values, so nothing has been added there.

I've opted to make this a breaking change since it can be rather damaging to your Prometheus instance to accidentally enable the validator monitor with large numbers of validators. I've crashed a Prometheus instance myself and had a report from another user who's done the same thing.

## Additional Info

NA

## Breaking Changes Note

A new label has been added to the validator monitor Prometheus metrics: `total`. This label tracks the aggregated metrics of all validators in the validator monitor (as opposed to each validator being tracked individually using its pubkey as the label).

Additionally, a new flag has been added to the Beacon Node: `--validator-monitor-individual-tracking-threshold`. The default value is `64`, which means that when the validator monitor is tracking more than 64 validators then it will stop tracking per-validator metrics and only track the `all_validators` metric. It will also stop logging per-validator logs and only emit aggregated logs (the exception being that exit and slashing logs are always emitted).

These changes were introduced in #3728 to address issues with untenable Prometheus cardinality and log volume when using the validator monitor with high validator counts (e.g., 1000s of validators). Users with fewer than 65 validators will see no change in behavior (apart from the added `all_validators` metric). Users with more than 64 validators who wish to maintain the previous behavior can set something like `--validator-monitor-individual-tracking-threshold 999999`.
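
The threshold rule can be written as a one-line predicate. The sketch below is an assumption-laden illustration: `ValidatorMonitor` here is a hypothetical two-field struct and not the real type, and only the default of `64` and the "more than the threshold" wording come from the note above.

```rust
/// Default for `--validator-monitor-individual-tracking-threshold`.
pub const DEFAULT_INDIVIDUAL_TRACKING_THRESHOLD: usize = 64;

/// Hypothetical, minimal view of the validator monitor.
pub struct ValidatorMonitor {
    validator_count: usize,
    individual_tracking_threshold: usize,
}

impl ValidatorMonitor {
    pub fn new(validator_count: usize, individual_tracking_threshold: usize) -> Self {
        Self { validator_count, individual_tracking_threshold }
    }

    /// Per-validator metrics and logs are emitted only while the number of
    /// monitored validators does not exceed the threshold.
    pub fn individual_tracking(&self) -> bool {
        self.validator_count <= self.individual_tracking_threshold
    }
}
```

With the default, 64 validators are still tracked individually and 65 are not; raising the threshold to `999999` restores per-validator tracking for large sets.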
2023-01-09 08:18:55 +00:00
Michael Sproul
4bd2b777ec Verify execution block hashes during finalized sync (#3794)
## Issue Addressed

Recent discussions with other client devs about optimistic sync have revealed a conceptual issue with the optimisation implemented in #3738. In designing that feature I failed to consider that the execution node checks the `blockHash` of the execution payload before responding with `SYNCING`, and that omitting this check entirely results in a degradation of the full node's validation. A node omitting the `blockHash` checks could be tricked by a supermajority of validators into following an invalid chain, something which is ordinarily impossible.

## Proposed Changes

I've added verification of the `payload.block_hash` in Lighthouse. In case of failure we log a warning and fall back to verifying the payload with the execution client.

I've used our existing dependency on `ethers_core` for RLP support, and a new dependency on Parity's `triehash` crate for the Merkle patricia trie. Although the `triehash` crate is currently unmaintained it seems like our best option at the moment (it is also used by Reth, and requires vastly less boilerplate than Parity's generic `trie-root` library).

Block hash verification is pretty quick, about 500us per block on my machine (mainnet).

The optimistic finalized sync feature can be disabled using `--disable-optimistic-finalized-sync` which forces full verification with the EL.

## Additional Info

This PR also introduces a new dependency on our [`metastruct`](https://github.com/sigp/metastruct) library, which was perfectly suited to the RLP serialization method. There will likely be changes as `metastruct` grows, but I think this is a good way to start dogfooding it.

I took inspiration from some Parity and Reth code while writing this, and have preserved the relevant license headers on the files containing code that was copied and modified.
2023-01-09 03:11:59 +00:00
Emilia Hane
c44738c77b
Undo response modification in commit 597363d2f 2023-01-06 12:42:21 +01:00
Emilia Hane
74bca46fc2
Fix bug of early termination of batch send 2023-01-06 11:45:13 +01:00
Emilia Hane
597363d2f9
Don't send empty blobs sidecar for blobs by range request 2023-01-05 16:28:59 +01:00
realbigsean
d8f7277beb
cleanup 2022-12-30 11:00:14 -05:00
sean
40c6daa34b add pawan's suggestion 2022-12-28 18:27:21 +00:00
realbigsean
8a70d80a2f
Revert "Revert "renames, remove , wrap BlockWrapper enum to make deconstruction private""
This reverts commit 1931a442dc.
2022-12-28 10:31:18 -05:00
realbigsean
1931a442dc
Revert "renames, remove , wrap BlockWrapper enum to make deconstruction private"
This reverts commit 5b3b34a9d7.
2022-12-28 10:30:36 -05:00
realbigsean
5b3b34a9d7
renames, remove , wrap BlockWrapper enum to make deconstruction private 2022-12-28 10:28:45 -05:00
realbigsean
502b5e5bf0
unused error lint 2022-12-28 09:32:29 -05:00
Diva M
6bf439befd
Merge branch 'eip4844' into empty-blobs 2022-12-23 17:38:59 -05:00
Divma
240854750c
cleanup: remove unused imports, unused fields (#3834) 2022-12-23 17:16:10 -05:00
realbigsean
5e11edc612
fix blob validation for empty blobs 2022-12-23 12:47:38 -05:00
Diva M
24087f104d
add the batch type to the Batch's KV 2022-12-23 10:49:46 -05:00
Diva M
901764b8f0
backfill batches need to be of just one epoch 2022-12-23 10:32:59 -05:00
realbigsean
f45d117e73
merge with capella 2022-12-23 10:21:18 -05:00
realbigsean
4d50fa36bc
Merge pull request #3829 from divagant-martian/handle-no-blob-range-response
Handle peers sending no blob when the blob is empty in range responses
2022-12-23 10:15:30 -05:00
Diva M
66f9aa922d
clean up and improvements 2022-12-23 09:52:10 -05:00
Diva M
3643f5cc19
spelling 2022-12-22 17:47:36 -05:00
Diva M
48ff56d9cb
spelling 2022-12-22 17:38:55 -05:00
Diva M
e24f6c93d9
fix ctrl c'd comment 2022-12-22 17:38:16 -05:00
Diva M
fbc147e273
remove unused entry struct 2022-12-22 17:34:01 -05:00
Diva M
cd6655dba9
handle no blobs from peers instead of empty blobs in range requests 2022-12-22 17:30:04 -05:00
realbigsean
61763790d5
Merge pull request #3825 from jimmygchen/small-fixes
Various small fixes to 4844 branch
2022-12-22 17:12:09 -05:00
realbigsean
33d01a7911
miscellaneous fixes on syncing, rpc and responding to peer's sync related requests (#3827)
- there was a bug in responding to range blob requests where we would incorrectly label the first slot of an epoch as a non-skipped slot if it were skipped. This bug did not exist in the code for responding to block range requests because the logic error was mitigated by defensive coding elsewhere
- there was a bug where a block received during range sync without a corresponding blob (and vice versa) was incorrectly interpreted as a stream termination
- RPC size limit fixes.
- Our blob cache was deadlocking so I removed use of it for now.
- Because of our change in finalized sync batch size from 2 to 1 and our transition to using exact epoch boundaries for batches (rather than one slot past the epoch boundary), we need to sync finalized sync to 2 epochs + 1 slot past our peer's finalized slot in order to finalize the chain locally.
- use fork context bytes in rpc methods on both the server and client side
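
As a worked example of the finalized sync bullet above, the target slot is the peer's finalized slot plus two epochs plus one slot. The function name is hypothetical, and `SLOTS_PER_EPOCH = 32` is the mainnet value, assumed here for the arithmetic.

```rust
/// Mainnet slots per epoch (assumed for this example).
pub const SLOTS_PER_EPOCH: u64 = 32;

/// Slot that finalized sync must reach so the chain can be finalized
/// locally: 2 epochs + 1 slot past the peer's finalized slot.
pub fn finalized_sync_target_slot(peer_finalized_slot: u64) -> u64 {
    peer_finalized_slot.saturating_add(2 * SLOTS_PER_EPOCH + 1)
}
```

For a peer whose finalized slot is 3200, the target is 3200 + 64 + 1 = 3265.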
2022-12-21 15:50:51 -05:00
Jimmy Chen
f7bb458c5e Fix incorrect logging 2022-12-22 02:01:11 +11:00
Jimmy Chen
ccfd092845 Fix blob request logging and incorrect enum type 2022-12-22 00:22:37 +11:00
realbigsean
5de4f5b8d0
handle parent blob request edge cases correctly. fix data availability boundary check 2022-12-19 11:39:09 -05:00
Mark Mackey
3e90fb8cae Merge branch 'unstable' into capella 2022-12-15 12:20:03 -06:00
realbigsean
1644978cdb
fix compilation 2022-12-15 10:26:10 -05:00
realbigsean
d893706e0e
merge with capella 2022-12-15 09:33:18 -05:00
Divma
63c74b37f4 send error answering bbrange requests when an error occurs (#3800)
## Issue Addressed

While testing withdrawals with @ethDreamer we noticed lighthouse is sending empty batches when an error occurs. As an LH peer receiving this, we would consider this a low tolerance action because the peer is claiming the batch is right and is empty.

## Proposed Changes
If any kind of error occurs, send an error response instead.

## Additional Info
Right now we don't handle partial batches that end with an error. If an error is received, the whole batch is discarded. Because of this it makes little sense to send partial batches that end with an error, so it's better to do the proposed solution instead of sending empty batches.
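
A minimal sketch of the idea, not the actual change: the type `RangeResponse` and the function `answer_range_request` are hypothetical names made up for illustration, and blocks are stood in for by plain slot numbers. The point is that a failed lookup maps to an explicit error, so it cannot be confused with a valid empty batch.

```rust
/// Hypothetical response type for a blocks-by-range request.
#[derive(Debug, PartialEq)]
enum RangeResponse {
    /// A batch the server vouches for; it may legitimately be empty.
    Blocks(Vec<u64>),
    /// The server failed to answer; the requester discards the whole batch.
    Error(String),
}

/// Hypothetical handler: turn the outcome of a block lookup into a response.
/// An error is reported as an error instead of being flattened to an empty batch.
fn answer_range_request(lookup: Result<Vec<u64>, String>) -> RangeResponse {
    match lookup {
        Ok(blocks) => RangeResponse::Blocks(blocks),
        Err(reason) => RangeResponse::Error(reason),
    }
}
```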
2022-12-15 00:16:38 +00:00
Michael Sproul
991e4094f8
Merge remote-tracking branch 'origin/unstable' into capella-update 2022-12-14 13:00:41 +11:00
GeemoCandama
1b28ef8a8d Adding light_client gossip topics (#3693)
## Issue Addressed
Implementing the light_client_gossip topics but I'm not there yet.

Partially addresses #3651

## Proposed Changes
Add light client gossip topics.
I'm going to implement the light_client_finality_update and light_client_optimistic_update gossip topics. Currently I've attempted the former and I'm seeking feedback.

## Additional Info
I've only implemented the light_client_finality_update topic because I wanted to make sure I was on the correct path. Also, checking that the gossiped LightClientFinalityUpdate is the same as the locally constructed one is not implemented yet, because caching the updates will make this much easier. Could someone give me some feedback on this please?

Co-authored-by: GeemoCandama <104614073+GeemoCandama@users.noreply.github.com>
2022-12-13 06:24:51 +00:00
realbigsean
5a42f6b067
range block or block+blob requests 2022-12-07 15:35:46 -05:00
realbigsean
a0d4aecf30
requests block + blob always post eip4844 2022-12-07 15:30:08 -05:00
realbigsean
6d4fb41b84
fix blob slot validation 2022-12-07 13:49:24 -05:00
realbigsean
6c8b1b323b
merge upstream 2022-12-07 12:27:21 -05:00
ethDreamer
1a39976715
Fixed Compiler Warnings & Failing Tests (#3771) 2022-12-03 10:42:12 +11:00
realbigsean
8102a01085
merge with upstream 2022-12-01 11:13:07 -05:00
Mark Mackey
8a04c3428e Merged with unstable 2022-11-30 17:29:10 -06:00
Diva M
979a95d62f
handle unknown parents for block-blob pairs
wip
2022-11-30 17:21:54 -05:00
realbigsean
2157d91b43
process single block and blob 2022-11-30 11:51:18 -05:00
realbigsean
fc9d0a512d
handle blobs by range requests 2022-11-30 10:02:29 -05:00
realbigsean
422d145902
chain segment processing for blobs 2022-11-30 09:40:15 -05:00
GeemoCandama
3534c85e30 Optimize finalized chain sync by skipping newPayload messages (#3738)
## Issue Addressed

#3704 

## Proposed Changes
Adds is_syncing_finalized: bool parameter for block verification functions. Sets the payload_verification_status to Optimistic if is_syncing_finalized is true. Uses SyncState in NetworkGlobals in BeaconProcessor to retrieve the syncing status.

## Additional Info
I could implement FinalizedSignatureVerifiedBlock if you think it would be nicer.
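
A minimal sketch of the behaviour described above, under stated assumptions: the enum here is a simplified stand-in with only two variants, and `payload_status_for_block` and its `verify_with_el` closure are hypothetical names, not the signatures used in the PR.

```rust
/// Simplified stand-in for the payload verification status.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum PayloadVerificationStatus {
    Verified,
    Optimistic,
}

/// Hypothetical helper: while syncing the finalized chain, skip the
/// execution-layer call and treat the payload as optimistic; otherwise
/// defer to the supplied verification closure.
fn payload_status_for_block<F>(is_syncing_finalized: bool, verify_with_el: F) -> PayloadVerificationStatus
where
    F: FnOnce() -> PayloadVerificationStatus,
{
    if is_syncing_finalized {
        // The closure is never invoked on this path, so no newPayload message is sent.
        PayloadVerificationStatus::Optimistic
    } else {
        verify_with_el()
    }
}
```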
2022-11-29 08:19:27 +00:00
Diva M
e548073602
Merge branch 'blob-syncing' into eip4844-devnet-v3 2022-11-28 15:10:50 -05:00
Diva M
805df307f6
wip 2022-11-28 14:13:12 -05:00
realbigsean
3c9e1abcb7
merge upstream 2022-11-26 10:01:57 -05:00
antondlr
e9bf7f7cc1 remove commas from comma-separated kv pairs (#3737)
## Issue Addressed

Logs are a comma-separated kv list, but the values sometimes contain commas, which breaks parsing.
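
To illustrate the parsing problem, here is a small sketch. It is not the code from this PR: `format_kv` is a hypothetical formatter, and stripping commas from the values is just one assumed way of keeping the separator unambiguous.

```rust
/// Hypothetical formatter: join key/value pairs with ", " and drop any comma
/// inside a value, so that splitting on commas yields exactly one field per pair.
fn format_kv(pairs: &[(&str, &str)]) -> String {
    pairs
        .iter()
        .map(|(key, value)| format!("{}: {}", key, value.replace(',', "")))
        .collect::<Vec<_>>()
        .join(", ")
}
```

Without the `replace` call, a value such as `1,024` would be split into two fields by a parser that separates on commas.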
2022-11-25 07:57:10 +00:00
Giulio rebuffo
d5a2de759b Added LightClientBootstrap V1 (#3711)
## Issue Addressed

Partially addresses #3651

## Proposed Changes

Adds server-side support for light_client_bootstrap_v1 topic

## Additional Info

This PR creates a bootstrap each time without using a cache. I do not know how necessary a cache is in this case, as this topic is not supposed to be called frequently, and IMHO we can just prevent abuse by using the limiter. Let me know what you think, whether there is any caveat to this, or whether a cache is necessary only for the sake of good practice.


Co-authored-by: Pawan Dhananjay <pawandhananjay@gmail.com>
2022-11-25 05:19:00 +00:00
Michael Sproul
788b337951
Op pool and gossip for BLS to execution changes (#3726) 2022-11-25 07:09:26 +11:00
realbigsean
1222404450
Merge branch 'blob-syncing' of https://github.com/realbigsean/lighthouse into blob-sync-kzg 2022-11-24 07:46:04 -05:00
Divma
bf5005244e
Blob syncing (#24)
* add a rt is_blob_batch

* use the mixed type everywhere

* glue

* more glue

* minor fixes

* fix range tests

* filling in the gaps

* more filling in the gaps
2022-11-24 07:45:38 -05:00
realbigsean
beddcfaac2
get spec tests working and fix json serialization 2022-11-23 18:30:45 -05:00
Diva M
7ed2d35424
get it to compile 2022-11-21 14:53:33 -05:00
realbigsean
e7ee79185b
add blobs cache and fix some block production 2022-11-21 14:09:06 -05:00
realbigsean
dc87156641
block and blob handling progress 2022-11-19 16:53:34 -05:00
realbigsean
45897ad4e1
remove blob wrapper 2022-11-19 15:18:42 -05:00
Diva M
78c72158c8
toy skeleton of sync changes 2022-11-16 13:53:38 -05:00
realbigsean
7162e5e23b
add a bunch of blob coupling boilerplate, add a blobs by root request 2022-11-15 16:43:56 -05:00
realbigsean
fe04d945cc
make signed block + sidecar consensus spec 2022-11-10 14:22:30 -05:00
Divma
84c7d8cc70 Blocklookup data inconsistencies (#3677)
## Issue Addressed
Closes #3649 

## Proposed Changes

Add a regression test for the data inconsistency, catching the problem in 31e88c5533 [here](https://github.com/sigp/lighthouse/actions/runs/3379894044/jobs/5612044797#step:6:2043).
When a chain is sent for processing, move it to a separate collection and now the test works, yay!

## Additional Info

na
2022-11-07 06:48:34 +00:00