## Proposed Changes
Builds on #4028 to use the new payload bodies methods in the HTTP API as well.
## Caveats
The payloads by range method only works for the finalized chain, so it can't be used in the execution engine integration tests because we try to reconstruct unfinalized payloads there.
## Issue Addressed
NA
## Proposed Changes
Apply two changes to code introduced in #4179:
1. Remove the `ERRO` log for when we error on `proposer_has_been_observed()`. We were seeing a lot of this in our logs for finalized blocks and it's a bit noisy.
1. Use `false` rather than `true` for `proposal_already_known` when there is an error. If a block raises an error in `proposer_has_been_observed()` then the block must be invalid, so we should process (and reject) it now rather than queuing it.
For reference, here is one of the offending `ERRO` logs:
```
ERRO Failed to check observed proposers block_root: 0x5845…878e, source: rpc, error: FinalizedBlock { slot: Slot(5410983), finalized_slot: Slot(5411232) }
```
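The second change can be sketched as follows. This is a minimal illustration, not the actual Lighthouse code: the `ObservedProposersError` type, the slice-based cache and both function signatures are assumptions made for the example. The only behaviour taken from the description above is that an error maps to `false`.

```rust
/// Hypothetical stand-in for the error raised by the observed-proposers cache.
#[derive(Debug, PartialEq)]
pub enum ObservedProposersError {
    FinalizedBlock { slot: u64, finalized_slot: u64 },
}

/// Hypothetical stand-in for `proposer_has_been_observed()`.
/// `observed` holds `(slot, proposer_index)` pairs that were already seen.
pub fn proposer_has_been_observed(
    slot: u64,
    proposer_index: u64,
    finalized_slot: u64,
    observed: &[(u64, u64)],
) -> Result<bool, ObservedProposersError> {
    if slot <= finalized_slot {
        // Blocks at or before finalization cannot be checked against the cache.
        return Err(ObservedProposersError::FinalizedBlock { slot, finalized_slot });
    }
    Ok(observed.contains(&(slot, proposer_index)))
}

/// On error, report the proposal as *not* already known, so the block is
/// processed (and rejected) immediately instead of being queued.
pub fn proposal_already_known(
    slot: u64,
    proposer_index: u64,
    finalized_slot: u64,
    observed: &[(u64, u64)],
) -> bool {
    proposer_has_been_observed(slot, proposer_index, finalized_slot, observed).unwrap_or(false)
}
```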
## Additional Info
NA
## Issue Addressed
NA
## Proposed Changes
Similar to #4181 but without the version bump and with a more nuanced fix.
Patches the high CPU usage seen after the Capella fork which was caused by processing exits when there are skip slots.
## Additional Info
~~This is an imperfect solution that will cause us to drop some exits at the fork boundary. This is tracked at #4184.~~
## Proposed Changes
We already make some attempts to avoid processing RPC blocks when a block from the same proposer is already being processed through gossip. This PR strengthens that guarantee by using the existing cache for `observed_block_producers` to inform whether an RPC block's processing should be delayed.
## Proposed Changes
This change attempts to prevent failed re-orgs by:
1. Lowering the re-org cutoff from 2s to 1s. This is informed by a failed re-org attempted by @yorickdowne's node. The failed block was requested in the 1.5-2s window due to a Vouch failure, and failed to propagate to the majority of the network before the attestation deadline at 4s.
2. Allowing users to adjust their re-org cutoff depending on observed network conditions and their risk profile. The static 2 second cutoff was too rigid.
3. Adding a `--proposer-reorg-disallowed-offsets` flag which can be used to prohibit re-orgs at certain slots. This is intended to help work around an issue whereby re-orging blocks at slot 1 currently take ~1.6s to propagate on gossip rather than ~500ms. This is suspected to be due to a cache miss in current versions of Prysm, which should be fixed in their next release.
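The three conditions above combine roughly as in the sketch below. `ReorgConfig`, `permits_reorg` and the field names are hypothetical and only illustrate the decision. It assumes a disallowed offset `N` means "no re-org at any slot where `slot % slots_per_epoch == N`", and it keeps the `shuffling_stable` check for offset 0 that is discussed under Additional Info.

```rust
use std::time::Duration;

/// Hypothetical container for the user-adjustable re-org settings.
pub struct ReorgConfig {
    /// Latest point in the slot at which a re-org may still be attempted.
    pub cutoff: Duration,
    /// Offsets within the epoch at which re-orgs are never attempted.
    pub disallowed_offsets: Vec<u64>,
}

impl ReorgConfig {
    /// Returns `true` if a re-orging proposal may be attempted at `slot`.
    /// `slots_per_epoch` must be non-zero.
    pub fn permits_reorg(&self, slot: u64, slots_per_epoch: u64, time_into_slot: Duration) -> bool {
        let offset = slot % slots_per_epoch;
        // Slot 0 of an epoch is excluded regardless of configuration.
        let shuffling_stable = offset != 0;
        shuffling_stable
            && !self.disallowed_offsets.contains(&offset)
            && time_into_slot < self.cutoff
    }
}
```

With a 1s cutoff and offset 1 disallowed, a request 1.5s into the slot is refused, which is the situation described in point 1.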
## Additional Info
I'm of two minds about removing the `shuffling_stable` check which checks for blocks at slot 0 in the epoch. If we removed it users would be able to configure Lighthouse to try reorging at slot 0, which likely wouldn't work very well due to interactions with the proposer index cache. I think we could leave it for now and revisit it later.
## Issue Addressed
#3212
## Proposed Changes
- Introduce a new `rate_limiting_backfill_queue` - any new inbound backfill work events get immediately sent to this FIFO queue **without any processing**
- Spawn a `backfill_scheduler` routine that pops a backfill event from the FIFO queue at specified intervals (currently halfway through a slot, or at 6s after slot start for 12s slots) and sends the event to `BeaconProcessor` via a `scheduled_backfill_work_tx` channel
- This channel gets polled last in the `InboundEvents`, and any work event received is wrapped in an `InboundEvent::ScheduledBackfillWork` enum variant, which gets processed immediately or queued by the `BeaconProcessor` (existing logic applies from here)
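A simplified, synchronous sketch of the queue-and-scheduler idea is shown below. It is not the implementation in this PR: `RateLimitedBackfillQueue` and `delay_until_next_release` are hypothetical names, and the real scheduler is an async task feeding a channel rather than a plain function call.

```rust
use std::collections::VecDeque;
use std::time::Duration;

/// Hypothetical FIFO holding backfill work that has not been processed yet.
pub struct RateLimitedBackfillQueue<T> {
    queue: VecDeque<T>,
}

impl<T> RateLimitedBackfillQueue<T> {
    pub fn new() -> Self {
        Self { queue: VecDeque::new() }
    }

    /// Inbound backfill work is enqueued without any processing.
    pub fn push(&mut self, work: T) {
        self.queue.push_back(work);
    }

    /// Called by the scheduler at each release point: hands out the oldest event.
    pub fn pop_scheduled(&mut self) -> Option<T> {
        self.queue.pop_front()
    }
}

/// Time to wait until the next release point, assumed to be halfway through a slot.
pub fn delay_until_next_release(time_into_slot: Duration, slot_duration: Duration) -> Duration {
    let release_point = slot_duration / 2;
    if time_into_slot <= release_point {
        release_point - time_into_slot
    } else {
        // Past this slot's release point: wait for the one in the next slot.
        (slot_duration + release_point).saturating_sub(time_into_slot)
    }
}
```

For a 12s slot, an event arriving 2s into the slot waits 4s, and one arriving 8s into the slot waits 10s.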
Diagram comparing backfill processing with / without rate-limiting:
https://github.com/sigp/lighthouse/issues/3212#issuecomment-1386249922
See this comment for @paulhauner's explanation and solution: https://github.com/sigp/lighthouse/issues/3212#issuecomment-1384674956
## Additional Info
I've compared this branch (with backfill processing rate limited to 1 and 3 batches per slot) against the latest stable version. The CPU usage during backfill sync is reduced by ~5% - 20%. More details are on this page:
https://hackmd.io/@jimmygchen/SJuVpJL3j
The above testing was done on Goerli (as I don't currently have hardware for Mainnet). I'm guessing the differences are likely to be bigger on Mainnet due to block size.
### TODOs
- [x] Experiment with processing multiple batches per slot. (need to think about how to do this for different slot durations)
- [x] Add option to disable rate-limiting, enabled by default.
- [x] (No longer required now we're reusing the reprocessing queue) Complete the `backfill_scheduler` task when backfill sync is completed or not required
## Issue Addressed
#3708
## Proposed Changes
- Add `is_finalized_block` method to `BeaconChain` in `beacon_node/beacon_chain/src/beacon_chain.rs`.
- Add `is_finalized_state` method to `BeaconChain` in `beacon_node/beacon_chain/src/beacon_chain.rs`.
- Add `fork_and_execution_optimistic_and_finalized` in `beacon_node/http_api/src/state_id.rs`.
- Add `ExecutionOptimisticFinalizedForkVersionedResponse` type in `consensus/types/src/fork_versioned_response.rs`.
- Add `execution_optimistic_finalized_fork_versioned_response` function in `beacon_node/http_api/src/version.rs`.
- Add `ExecutionOptimisticFinalizedResponse` type in `common/eth2/src/types.rs`.
- Add `add_execution_optimistic_finalized` method in `common/eth2/src/types.rs`.
- Update API response methods to include finalized.
- Remove `execution_optimistic_fork_versioned_response`
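As a rough picture of the new response shape, the sketch below wraps a payload with both flags. The field layout, the use of `Option<bool>` and the `GenericResponse` wrapper are assumptions for illustration, and serialization is omitted. Only the names `ExecutionOptimisticFinalizedResponse` and `add_execution_optimistic_finalized` come from the list above.

```rust
/// Assumed plain wrapper that a handler starts from (hypothetical here).
#[derive(Debug, Clone, PartialEq)]
pub struct GenericResponse<T> {
    pub data: T,
}

/// Response carrying both the optimistic and the finalized status of the data.
#[derive(Debug, Clone, PartialEq)]
pub struct ExecutionOptimisticFinalizedResponse<T> {
    pub execution_optimistic: Option<bool>,
    pub finalized: Option<bool>,
    pub data: T,
}

impl<T> GenericResponse<T> {
    /// Attach both flags to an existing response.
    pub fn add_execution_optimistic_finalized(
        self,
        execution_optimistic: bool,
        finalized: bool,
    ) -> ExecutionOptimisticFinalizedResponse<T> {
        ExecutionOptimisticFinalizedResponse {
            execution_optimistic: Some(execution_optimistic),
            finalized: Some(finalized),
            data: self.data,
        }
    }
}
```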
Co-authored-by: Michael Sproul <michael@sigmaprime.io>
* rename 4844 to deneb
* rename 4844 to deneb
* move excess data gas field
* get EF tests working
* fix ef tests lint
* fix the blob identifier ef test
* fix accessed files ef test script
* get beacon chain tests passing
* introduce availability pending block
* add intoavailableblock trait
* small fixes
* add 'gossip blob cache' and start to clean up processing and transition types
* shard memory blob cache
* Initial commit
* Fix after rebase
* Add gossip verification conditions
* cache cleanup
* general chaos
* extended chaos
* cargo fmt
* more progress
* more progress
* tons of changes, just tryna compile
* everything, everywhere, all at once
* Reprocess an ExecutedBlock on unavailable blobs
* Add sus gossip verification for blobs
* Merge stuff
* Remove reprocessing cache stuff
* lint
* Add a wrapper to allow construction of only valid `AvailableBlock`s
* rename blob arc list to blob list
* merge cleanuo
* Revert "merge cleanuo"
This reverts commit 5e98326878c77528d0c4668c5a4db4a4b0fbaeaa.
* Revert "Revert "merge cleanuo""
This reverts commit 3a4009443a5812b3028abe855079307436dc5419.
* fix rpc methods
* move beacon block and blob to eth2/types
* rename gossip blob cache to data availability checker
* lots of changes
* fix some compilation issues
* fix compilation issues
* fix compilation issues
* fix compilation issues
* fix compilation issues
* fix compilation issues
* cargo fmt
* use a common data structure for block import types
* fix availability check on proposal import
* refactor the blob cache and split the block wrapper into two types
* add type conversion for signed block and block wrapper
* fix beacon chain tests and do some renaming, add some comments
* Partial processing (#4)
* move beacon block and blob to eth2/types
* rename gossip blob cache to data availability checker
* lots of changes
* fix some compilation issues
* fix compilation issues
* fix compilation issues
* fix compilation issues
* fix compilation issues
* fix compilation issues
* cargo fmt
* use a common data structure for block import types
* fix availability check on proposal import
* refactor the blob cache and split the block wrapper into two types
* add type conversion for signed block and block wrapper
* fix beacon chain tests and do some renaming, add some comments
* cargo update (#6)
---------
Co-authored-by: realbigsean <sean@sigmaprime.io>
Co-authored-by: realbigsean <seananderson33@gmail.com>
* Update get blobs endpoint to return BlobSidecarList
* Update code comment
* Update blob retrieval to return BlobSidecarList without Arc
* Remove usage of BlobSidecarList type alias to avoid code conflicts
* Add clippy allow exception
## Issue Addressed
NA
## Proposed Changes
- Implements https://github.com/ethereum/consensus-specs/pull/3290/
- Bumps `ef-tests` to [v1.3.0-rc.4](https://github.com/ethereum/consensus-spec-tests/releases/tag/v1.3.0-rc.4).
The `CountRealizedFull` concept has been removed and the `--count-unrealized-full` and `--count-unrealized` BN flags now do nothing but log a `WARN` when used.
## Database Migration Debt
This PR removes the `best_justified_checkpoint` from fork choice. This field is persisted on-disk and the correct way to go about this would be to make a DB migration to remove the field. However, in this PR I've simply stubbed out the value with a junk value. I've taken this approach because if we're going to do a DB migration I'd love to remove the `Option`s around the justified and finalized checkpoints on `ProtoNode` whilst we're at it. Those options were added in #2822 which was included in Lighthouse v2.1.0. The options were only put there to handle the migration and they've been set to `Some` ever since v2.1.0. There's no reason to keep them as options anymore.
I started adding the DB migration to this branch but I started to feel like I was bloating this rather critical PR with nice-to-haves. I've kept the partially-complete migration [over in my repo](https://github.com/paulhauner/lighthouse/tree/fc-pr-18-migration) so we can pick it up after this PR is merged.
This PR enables the user to adjust the shuffling cache size.
This is useful for some HTTP API requests which require re-computing old shufflings. This PR currently optimizes the `beacon/states/{state_id}/committees` HTTP API by first checking the cache before re-building the shuffling.
If the shuffling cache size is set to a non-default value, then the HTTP API request will also fill the cache as it constructs new shufflings.
If the CLI flag is not present or the value is set to the default of 16, the default behaviour is observed.
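The lookup order can be sketched as below. This is an illustrative model only: `ShufflingCache` here is a simple insertion-ordered bounded map rather than the real cache, a shuffling is reduced to a `Vec<u64>` keyed by epoch, and `committees_for_epoch` is a hypothetical helper. It assumes "default behaviour" means the request does not populate the cache.

```rust
use std::collections::{HashMap, VecDeque};

/// Default cache size, as stated in the description.
pub const DEFAULT_CACHE_SIZE: usize = 16;

/// Hypothetical bounded cache that evicts the oldest inserted epoch first.
pub struct ShufflingCache {
    capacity: usize,
    order: VecDeque<u64>,
    entries: HashMap<u64, Vec<u64>>,
}

impl ShufflingCache {
    pub fn new(capacity: usize) -> Self {
        Self { capacity, order: VecDeque::new(), entries: HashMap::new() }
    }

    pub fn get(&self, epoch: u64) -> Option<&Vec<u64>> {
        self.entries.get(&epoch)
    }

    pub fn insert(&mut self, epoch: u64, shuffling: Vec<u64>) {
        if self.capacity == 0 {
            return;
        }
        if self.entries.insert(epoch, shuffling).is_none() {
            self.order.push_back(epoch);
            if self.order.len() > self.capacity {
                if let Some(oldest) = self.order.pop_front() {
                    self.entries.remove(&oldest);
                }
            }
        }
    }
}

/// Check the cache first; rebuild on a miss, and only fill the cache when the
/// size was changed from the default.
pub fn committees_for_epoch(
    cache: &mut ShufflingCache,
    epoch: u64,
    build: impl FnOnce(u64) -> Vec<u64>,
) -> Vec<u64> {
    if let Some(hit) = cache.get(epoch) {
        return hit.clone();
    }
    let shuffling = build(epoch);
    if cache.capacity != DEFAULT_CACHE_SIZE {
        cache.insert(epoch, shuffling.clone());
    }
    shuffling
}
```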
Co-authored-by: Michael Sproul <michael@sigmaprime.io>
## Issue Addressed
NA
## Proposed Changes
When producing a block from a builder, there are two points where we could consider the block "broadcast":
1. When the blinded block is published to the builder.
2. When the un-blinded block is published to the P2P network (this is always *after* the previous step).
Our logging for late block broadcasts was using (2) for builder-blocks, which was creating a lot of false-positive logs. This is because the builder publishes the block on the P2P network themselves before returning it to us and we perform (2). For clarity, the logs were false-positives because we claim that the block was published late by us when it was actually published earlier by the builder.
This PR changes our logging behavior so we do our logging at (1) instead. It also updates our metrics for block broadcast to distinguish between local and builder blocks. I believe the metrics change will be natively compatible with existing Grafana dashboards.
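The choice of timestamp can be illustrated with the sketch below. All names (`BlockSource`, `BroadcastTimes`, `broadcast_delay`, `is_late`) are hypothetical, delays are measured from the start of the slot, and the lateness threshold is a parameter rather than a value taken from this PR.

```rust
use std::time::Duration;

/// Where the block came from. The metrics are also split along this line.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum BlockSource {
    Local,
    Builder,
}

/// Delays since slot start for the two candidate "broadcast" points.
pub struct BroadcastTimes {
    /// (1) When the blinded block was published to the builder, if applicable.
    pub blinded_published_to_builder: Option<Duration>,
    /// (2) When the un-blinded block was published to the P2P network.
    pub published_to_p2p: Duration,
}

/// Pick the delay used for late-broadcast logging: (1) for builder blocks,
/// (2) for locally built blocks.
pub fn broadcast_delay(source: BlockSource, times: &BroadcastTimes) -> Duration {
    match source {
        BlockSource::Builder => times
            .blinded_published_to_builder
            .unwrap_or(times.published_to_p2p),
        BlockSource::Local => times.published_to_p2p,
    }
}

/// A block counts as late when its delay exceeds the given threshold.
pub fn is_late(delay: Duration, threshold: Duration) -> bool {
    delay > threshold
}
```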
## Additional Info
One could argue that the builder *should* return the block to us faster, however that's not the case. I think it's more important that we don't desensitize users with false-positives.
## Issue Addressed
Closes #3814, replaces #3818.
## Proposed Changes
* Add a WARN log for the case where we are attempting to sync chain segments but can't process them because they're building on an invalid parent. The most common case where we see this is when the execution node database is corrupt, causing sync to stall mysteriously (because we're currently logging the failure only at debug level).
* Additionally I've bumped up the logging for invalid execution payloads to `WARN`. This may result in some duplicate logs as we log errors from the `beacon_chain` and then again from the beacon processor. Invalid payloads and corrupt DBs _should_ be rare enough that this doesn't produce overwhelming log volume.
## Issue Addressed
In #4027 I forgot to add the `parent_block_number` to the payload attributes SSE.
## Proposed Changes
Compute the parent block number while computing the pre-payload attributes. Pass it on to the SSE stream.
## Additional Info
Not essential for v3.5.1 as I suspect most builders don't need the `parent_block_number`. I would like to use it for my dummy no-op builder, however.
## Issue Addressed
Add support for ipv6 and dual stack in lighthouse.
## Proposed Changes
From a user perspective, setting an ipv6 address and optionally configuring the ports should now feel exactly the same as using an ipv4 address. If listening over both ipv4 and ipv6 then the user needs to:
- use the `--listen-address` flag twice (once for the ipv4 address and once for the ipv6 address)
- `--port6` then becomes required
- `--discovery-port6` can now be used to additionally configure the ipv6 udp port
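The flag rules above can be modelled roughly as follows. This is a sketch under assumptions: `ListenAddress` and `resolve_listen_address` are hypothetical names, only TCP ports are modelled, and the error strings are invented for the example.

```rust
use std::net::{IpAddr, Ipv4Addr, Ipv6Addr};

/// Hypothetical model of a listening configuration: ipv4 only, ipv6 only, or both.
#[derive(Debug, PartialEq)]
pub enum ListenAddress {
    V4 { addr: Ipv4Addr, tcp_port: u16 },
    V6 { addr: Ipv6Addr, tcp_port: u16 },
    DualStack { v4: (Ipv4Addr, u16), v6: (Ipv6Addr, u16) },
}

/// Resolve the values of `--listen-address`, `--port` and `--port6`.
pub fn resolve_listen_address(
    addrs: &[IpAddr],
    port: u16,
    port6: Option<u16>,
) -> Result<ListenAddress, String> {
    match addrs {
        // With a single address, `--port` applies to whichever family was given.
        [IpAddr::V4(addr)] => Ok(ListenAddress::V4 { addr: *addr, tcp_port: port }),
        [IpAddr::V6(addr)] => Ok(ListenAddress::V6 { addr: *addr, tcp_port: port }),
        // With one address per family, `--port` is ipv4-only and `--port6` is required.
        [IpAddr::V4(v4), IpAddr::V6(v6)] | [IpAddr::V6(v6), IpAddr::V4(v4)] => {
            let port6 = port6.ok_or_else(|| "--port6 is required for dual stack".to_string())?;
            Ok(ListenAddress::DualStack { v4: (*v4, port), v6: (*v6, port6) })
        }
        _ => Err("expected one address, or one ipv4 and one ipv6 address".to_string()),
    }
}
```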
### Rough list of code changes
- Discovery:
  - Table filter and ip mode set to match the listening config.
  - Ipv6 address, tcp port and udp port set in the ENR builder
  - Reported addresses now check which tcp port to give to libp2p
- LH Network Service:
  - Can listen over Ipv6, Ipv4, or both. This uses two sockets. Using mapped addresses is disabled from libp2p and it's the most compatible option.
- NetworkGlobals:
  - No longer stores the udp port since it was not used at all. Instead, stores the Ipv4 and Ipv6 TCP ports.
- NetworkConfig:
  - Update names to make it clear that previous udp and tcp ports in ENR were Ipv4
  - Add fields to configure Ipv6 udp and tcp ports in the ENR
  - Include advertised enr Ipv6 address.
  - Add type to model Listening address that's either Ipv4, Ipv6 or both. A listening address includes the ip, udp port and tcp port.
- UPnP:
  - Kept only for ipv4
- Cli flags:
  - `--listen-address` now can take up to two values
  - `--port` will apply to ipv4 or ipv6 if only one listening address is given. If two listening addresses are given it will apply only to Ipv4.
  - `--port6` New flag required when listening over ipv4 and ipv6 that applies exclusively to Ipv6.
  - `--discovery-port` will now apply to ipv4 and ipv6 if only one listening address is given.
  - `--discovery-port6` New flag to configure the individual udp port of ipv6 if listening over both ipv4 and ipv6.
  - `--enr-udp-port` Updated docs to specify that it only applies to ipv4. This is an old behaviour.
  - `--enr-udp6-port` Added to configure the enr udp6 field.
  - `--enr-tcp-port` Updated docs to specify that it only applies to ipv4. This is an old behaviour.
  - `--enr-tcp6-port` Added to configure the enr tcp6 field.
  - `--enr-addresses` now can take two values.
  - `--enr-match` updated behaviour.
- Common:
  - rename `unused_port` functions to specify that they are over ipv4.
  - add functions to get unused ports over ipv6.
- Testing binaries
  - Updated code to reflect network config changes and unused_port changes.
## Additional Info
TODOs:
- use two sockets in discovery. I'll get back to this and it's on https://github.com/sigp/discv5/pull/160
- lcli allow listening over two sockets in generate_bootnodes_enr
- add at least one smoke flag for ipv6 (I have tested this and works for me)
- update the book
## Proposed Changes
Two tiny updates to satisfy Clippy 1.68
Plus refactoring of the `http_api` into less complex types so the compiler can chew and digest them more easily.
Co-authored-by: Michael Sproul <michael@sigmaprime.io>