lighthouse/beacon_node/network/Cargo.toml
Paul Hauner 84843d67d7 Reduce some EE and builder related ERRO logs to WARN (#3966)
## Issue Addressed

NA

## Proposed Changes

Our `ERRO` stream has been rather noisy since the merge due to some unexpected behaviours of builders and EEs. Now that we've been running post-merge for a while, I think we can drop some of these `ERRO` to `WARN` so we're not "crying wolf".

The modified logs are:

#### `ERRO Execution engine call failed`

I'm seeing this quite frequently on Geth nodes. They seem to timeout when they're busy and it rarely indicates a serious issue. We also have logging across block import, fork choice updating and payload production that raise `ERRO` or `CRIT` when the EE times out, so I think we're not at risk of silencing actual issues.

#### `ERRO "Builder failed to reveal payload"`

In #3775 we reduced this log from `CRIT` to `ERRO` since it's common for builders to fail to reveal the block to the producer directly whilst still broadcasting it to the networ. I think it's worth dropping this to `WARN` since it's rarely interesting.

I elected to stay with `WARN` since I really do wish builders would fulfill their API promises by returning the block to us. Perhaps I'm just being pedantic here, I could be convinced otherwise.

#### `ERRO "Relay error when registering validator(s)"`

It seems like builders and/or mev-boost struggle to handle heavy loads of validator registrations. I haven't observed issues with validators not actually being registered, but I see timeouts on these endpoints many times a day. It doesn't seem like this `ERRO` is worth it.

#### `ERRO Error fetching block for peer     ExecutionLayerErrorPayloadReconstruction`

This means we failed to respond to a peer on the P2P network with a block they requested because of an error in the `execution_layer`. It's very common to see timeouts or incomplete responses on this endpoint whilst the EE is busy and I don't think it's important enough for an `ERRO`. As long as the peer count stays high, I don't think the user needs to be actively concerned about how we're responding to peers.

## Additional Info

NA
2023-02-12 23:14:08 +00:00

53 lines
1.5 KiB
TOML

[package]
name = "network"
version = "0.2.0"
authors = ["Sigma Prime <contact@sigmaprime.io>"]
edition = "2021"
[dev-dependencies]
sloggers = { version = "2.1.1", features = ["json"] }
genesis = { path = "../genesis" }
matches = "0.1.8"
exit-future = "0.2.0"
slog-term = "2.6.0"
slog-async = "2.5.0"
environment = { path = "../../lighthouse/environment" }
[dependencies]
beacon_chain = { path = "../beacon_chain" }
store = { path = "../store" }
lighthouse_network = { path = "../lighthouse_network" }
types = { path = "../../consensus/types" }
slot_clock = { path = "../../common/slot_clock" }
slog = { version = "2.5.2", features = ["max_level_trace"] }
hex = "0.4.2"
eth2_ssz = "0.4.1"
eth2_ssz_types = "0.2.2"
futures = "0.3.7"
error-chain = "0.12.4"
tokio = { version = "1.14.0", features = ["full"] }
tokio-stream = "0.1.3"
smallvec = "1.6.1"
rand = "0.8.5"
fnv = "1.0.7"
rlp = "0.5.0"
lazy_static = "1.4.0"
lighthouse_metrics = { path = "../../common/lighthouse_metrics" }
logging = { path = "../../common/logging" }
task_executor = { path = "../../common/task_executor" }
igd = "0.11.1"
itertools = "0.10.0"
num_cpus = "1.13.0"
lru_cache = { path = "../../common/lru_cache" }
if-addrs = "0.6.4"
strum = "0.24.0"
tokio-util = { version = "0.6.3", features = ["time"] }
derivative = "2.2.0"
delay_map = "0.1.1"
ethereum-types = { version = "0.14.1", optional = true }
execution_layer = { path = "../execution_layer" }
[features]
deterministic_long_lived_attnets = [ "ethereum-types" ]
# default = ["deterministic_long_lived_attnets"]