This PR adds the ability to read the Lighthouse logs from the HTTP API for both the BN and the VC.
This is done in such a way to as minimize any kind of performance hit by adding this feature.
The current design creates a tokio broadcast channel and mixes is into a form of slog drain that combines with our main global logger drain, only if the http api is enabled.
The drain gets the logs, checks the log level and drops them if they are below INFO. If they are INFO or higher, it sends them via a broadcast channel only if there are users subscribed to the HTTP API channel. If not, it drops the logs.
If there are more than one subscriber, the channel clones the log records and converts them to json in their independent HTTP API tasks.
Co-authored-by: Michael Sproul <micsproul@gmail.com>
## Issue Addressed
Closes https://github.com/sigp/lighthouse/issues/4291, part of #3613.
## Proposed Changes
- Implement the `el_offline` field on `/eth/v1/node/syncing`. We set `el_offline=true` if:
- The EL's internal status is `Offline` or `AuthFailed`, _or_
- The most recent call to `newPayload` resulted in an error (more on this in a moment).
- Use the `el_offline` field in the VC to mark nodes with offline ELs as _unsynced_. These nodes will still be used, but only after synced nodes.
- Overhaul the usage of `RequireSynced` so that `::No` is used almost everywhere. The `--allow-unsynced` flag was broken and had the opposite effect to intended, so it has been deprecated.
- Add tests for the EL being offline on the upcheck call, and being offline due to the newPayload check.
## Why track `newPayload` errors?
Tracking the EL's online/offline status is too coarse-grained to be useful in practice, because:
- If the EL is timing out to some calls, it's unlikely to timeout on the `upcheck` call, which is _just_ `eth_syncing`. Every failed call is followed by an upcheck [here](693886b941/beacon_node/execution_layer/src/engines.rs (L372-L380)), which would have the effect of masking the failure and keeping the status _online_.
- The `newPayload` call is the most likely to time out. It's the call in which ELs tend to do most of their work (often 1-2 seconds), with `forkchoiceUpdated` usually returning much faster (<50ms).
- If `newPayload` is failing consistently (e.g. timing out) then this is a good indication that either the node's EL is in trouble, or the network as a whole is. In the first case validator clients _should_ prefer other BNs if they have one available. In the second case, all of their BNs will likely report `el_offline` and they'll just have to proceed with trying to use them.
## Additional Changes
- Add utility method `ForkName::latest` which is quite convenient for test writing, but probably other things too.
- Delete some stale comments from when we used to support multiple execution nodes.
## Issue Addressed
#4233
## Proposed Changes
Remove the `best_justified_checkpoint` from the `PersistedForkChoiceStore` type as it is now unused.
Additionally, remove the `Option`'s wrapping the `justified_checkpoint` and `finalized_checkpoint` fields on `ProtoNode` which were only present to facilitate a previous migration.
Include the necessary code to facilitate the migration to a new DB schema.
## Issue Addressed
#4146
## Proposed Changes
Removes the `ExecutionOptimisticForkVersionedResponse` type and the associated Beacon API endpoint which is now deprecated. Also removes the test associated with the endpoint.
> This is currently a WIP and all features are subject to alteration or removal at any time.
## Overview
The successor to #2873.
Contains the backbone of `beacon.watch` including syncing code, the initial API, and several core database tables.
See `watch/README.md` for more information, requirements and usage.
## Issue Addressed
Addresses #4117
## Proposed Changes
See https://github.com/ethereum/keymanager-APIs/pull/58 for proposed API specification.
## TODO
- [x] ~~Add submission to BN~~
- removed, see discussion in [keymanager API](https://github.com/ethereum/keymanager-APIs/pull/58)
- [x] ~~Add flag to allow voluntary exit via the API~~
- no longer needed now the VC doesn't submit exit directly
- [x] ~~Additional verification / checks, e.g. if validator on same network as BN~~
- to be done on client side
- [x] ~~Potentially wait for the message to propagate and return some exit information in the response~~
- not required
- [x] Update http tests
- [x] ~~Update lighthouse book~~
- not required if this endpoint makes it to the standard keymanager API
Co-authored-by: Paul Hauner <paul@paulhauner.com>
Co-authored-by: Jimmy Chen <jimmy@sigmaprime.io>
## Issue Addressed
#3212
## Proposed Changes
- Introduce a new `rate_limiting_backfill_queue` - any new inbound backfill work events gets immediately sent to this FIFO queue **without any processing**
- Spawn a `backfill_scheduler` routine that pops a backfill event from the FIFO queue at specified intervals (currently halfway through a slot, or at 6s after slot start for 12s slots) and sends the event to `BeaconProcessor` via a `scheduled_backfill_work_tx` channel
- This channel gets polled last in the `InboundEvents`, and work event received is wrapped in a `InboundEvent::ScheduledBackfillWork` enum variant, which gets processed immediately or queued by the `BeaconProcessor` (existing logic applies from here)
Diagram comparing backfill processing with / without rate-limiting:
https://github.com/sigp/lighthouse/issues/3212#issuecomment-1386249922
See this comment for @paulhauner's explanation and solution: https://github.com/sigp/lighthouse/issues/3212#issuecomment-1384674956
## Additional Info
I've compared this branch (with backfill processing rate limited to to 1 and 3 batches per slot) against the latest stable version. The CPU usage during backfill sync is reduced by ~5% - 20%, more details on this page:
https://hackmd.io/@jimmygchen/SJuVpJL3j
The above testing is done on Goerli (as I don't currently have hardware for Mainnet), I'm guessing the differences are likely to be bigger on mainnet due to block size.
### TODOs
- [x] Experiment with processing multiple batches per slot. (need to think about how to do this for different slot durations)
- [x] Add option to disable rate-limiting, enabed by default.
- [x] (No longer required now we're reusing the reprocessing queue) Complete the `backfill_scheduler` task when backfill sync is completed or not required
## Issue Addressed
#4127. PR to test port conflicts in CI tests .
## Proposed Changes
See issue for more details, potential solution could be adding a cache bound by time to the `unused_port` function.
## Issue Addressed
#3708
## Proposed Changes
- Add `is_finalized_block` method to `BeaconChain` in `beacon_node/beacon_chain/src/beacon_chain.rs`.
- Add `is_finalized_state` method to `BeaconChain` in `beacon_node/beacon_chain/src/beacon_chain.rs`.
- Add `fork_and_execution_optimistic_and_finalized` in `beacon_node/http_api/src/state_id.rs`.
- Add `ExecutionOptimisticFinalizedForkVersionedResponse` type in `consensus/types/src/fork_versioned_response.rs`.
- Add `execution_optimistic_finalized_fork_versioned_response`function in `beacon_node/http_api/src/version.rs`.
- Add `ExecutionOptimisticFinalizedResponse` type in `common/eth2/src/types.rs`.
- Add `add_execution_optimistic_finalized` method in `common/eth2/src/types.rs`.
- Update API response methods to include finalized.
- Remove `execution_optimistic_fork_versioned_response`
Co-authored-by: Michael Sproul <michael@sigmaprime.io>
## Issue Addressed
Which issue # does this PR address?
https://github.com/sigp/lighthouse/issues/3669
## Proposed Changes
Please list or describe the changes introduced by this PR.
- A new API to fetch fork choice data, as specified [here](https://github.com/ethereum/beacon-APIs/pull/232)
- A new integration test to test the new API
## Additional Info
Please provide any additional information. For example, future considerations
or information useful for reviewers.
- `extra_data` field specified in the beacon-API spec is not implemented, please let me know if I should instead.
Co-authored-by: Michael Sproul <micsproul@gmail.com>
## Issue Addressed
NA
## Proposed Changes
- Bump versions.
- Bump openssl version to resolve various `cargo audit` notices.
## Additional Info
- Requires further testing
* rename 4844 to deneb
* rename 4844 to deneb
* move excess data gas field
* get EF tests working
* fix ef tests lint
* fix the blob identifier ef test
* fix accessed files ef test script
* get beacon chain tests passing
* introduce availability pending block
* add intoavailableblock trait
* small fixes
* add 'gossip blob cache' and start to clean up processing and transition types
* shard memory blob cache
* Initial commit
* Fix after rebase
* Add gossip verification conditions
* cache cleanup
* general chaos
* extended chaos
* cargo fmt
* more progress
* more progress
* tons of changes, just tryna compile
* everything, everywhere, all at once
* Reprocess an ExecutedBlock on unavailable blobs
* Add sus gossip verification for blobs
* Merge stuff
* Remove reprocessing cache stuff
* lint
* Add a wrapper to allow construction of only valid `AvailableBlock`s
* rename blob arc list to blob list
* merge cleanuo
* Revert "merge cleanuo"
This reverts commit 5e98326878c77528d0c4668c5a4db4a4b0fbaeaa.
* Revert "Revert "merge cleanuo""
This reverts commit 3a4009443a5812b3028abe855079307436dc5419.
* fix rpc methods
* move beacon block and blob to eth2/types
* rename gossip blob cache to data availability checker
* lots of changes
* fix some compilation issues
* fix compilation issues
* fix compilation issues
* fix compilation issues
* fix compilation issues
* fix compilation issues
* cargo fmt
* use a common data structure for block import types
* fix availability check on proposal import
* refactor the blob cache and split the block wrapper into two types
* add type conversion for signed block and block wrapper
* fix beacon chain tests and do some renaming, add some comments
* Partial processing (#4)
* move beacon block and blob to eth2/types
* rename gossip blob cache to data availability checker
* lots of changes
* fix some compilation issues
* fix compilation issues
* fix compilation issues
* fix compilation issues
* fix compilation issues
* fix compilation issues
* cargo fmt
* use a common data structure for block import types
* fix availability check on proposal import
* refactor the blob cache and split the block wrapper into two types
* add type conversion for signed block and block wrapper
* fix beacon chain tests and do some renaming, add some comments
* cargo update (#6)
---------
Co-authored-by: realbigsean <sean@sigmaprime.io>
Co-authored-by: realbigsean <seananderson33@gmail.com>
* Update get blobs endpoint to return BlobSidecarList
* Update code comment
* Update blob retrieval to return BlobSidecarList without Arc
* Remove usage of BlobSidecarList type alias to avoid code conflicts
* Add clippy allow exception
## Issue Addressed
NA
## Proposed Changes
Sets the mainnet Capella fork epoch as per https://github.com/ethereum/consensus-specs/pull/3300
## Additional Info
I expect the `ef_tests` to fail until we get a compatible consensus spec tests release.
Currently Lighthouse will remain uncontactable if users port forward a port that is not the same as the one they are listening on.
For example, if Lighthouse runs with port 9000 TCP/UDP locally but a router is configured to pass 9010 externally to the lighthouse node on 9000, other nodes on the network will not be able to reach the lighthouse node.
This occurs because Lighthouse does not update its ENR TCP port on external socket discovery. The intention was always that users should use `--enr-tcp-port` to customise this, but this is non-intuitive.
The difficulty arises because we have no discovery mechanism to find our external TCP port. If we discovery a new external UDP port, we must guess what our external TCP port might be. This PR assumes the external TCP port is the same as the external UDP port (which may not be the case) and thus updates the TCP port along with the UDP port if the `--enr-tcp-port` flag is not set.
Along with this PR, will be added documentation to the Lighthouse book so users can correctly understand and configure their ENR to maximize Lighthouse's connectivity.
This relies on https://github.com/sigp/discv5/pull/166 and we should wait for a new release in discv5 before adding this PR.
There is a race condition which occurs when multiple discovery queries return at almost the exact same time and they independently contain a useful peer we would like to connect to.
The condition can occur that we can add the same peer to the dial queue, before we get a chance to process the queue.
This ends up displaying an error to the user:
```
ERRO Dialing an already dialing peer
```
Although this error is harmless it's not ideal.
There are two solutions to resolving this:
1. As we decide to dial the peer, we change the state in the peer-db to dialing (before we add it to the queue) which would prevent other requests from adding to the queue.
2. We prevent duplicates in the dial queue
This PR has opted for 2. because 1. will complicate the code in that we are changing states in non-intuitive places. Although this technically adds a very slight performance cost, its probably a cleaner solution as we can keep the state-changing logic in one place.
## Issue Addressed
In #4027 I forgot to add the `parent_block_number` to the payload attributes SSE.
## Proposed Changes
Compute the parent block number while computing the pre-payload attributes. Pass it on to the SSE stream.
## Additional Info
Not essential for v3.5.1 as I suspect most builders don't need the `parent_block_root`. I would like to use it for my dummy no-op builder however.
## Issue Addressed
Add support for ipv6 and dual stack in lighthouse.
## Proposed Changes
From an user perspective, now setting an ipv6 address, optionally configuring the ports should feel exactly the same as using an ipv4 address. If listening over both ipv4 and ipv6 then the user needs to:
- use the `--listen-address` two times (ipv4 and ipv6 addresses)
- `--port6` becomes then required
- `--discovery-port6` can now be used to additionally configure the ipv6 udp port
### Rough list of code changes
- Discovery:
- Table filter and ip mode set to match the listening config.
- Ipv6 address, tcp port and udp port set in the ENR builder
- Reported addresses now check which tcp port to give to libp2p
- LH Network Service:
- Can listen over Ipv6, Ipv4, or both. This uses two sockets. Using mapped addresses is disabled from libp2p and it's the most compatible option.
- NetworkGlobals:
- No longer stores udp port since was not used at all. Instead, stores the Ipv4 and Ipv6 TCP ports.
- NetworkConfig:
- Update names to make it clear that previous udp and tcp ports in ENR were Ipv4
- Add fields to configure Ipv6 udp and tcp ports in the ENR
- Include advertised enr Ipv6 address.
- Add type to model Listening address that's either Ipv4, Ipv6 or both. A listening address includes the ip, udp port and tcp port.
- UPnP:
- Kept only for ipv4
- Cli flags:
- `--listen-addresses` now can take up to two values
- `--port` will apply to ipv4 or ipv6 if only one listening address is given. If two listening addresses are given it will apply only to Ipv4.
- `--port6` New flag required when listening over ipv4 and ipv6 that applies exclusively to Ipv6.
- `--discovery-port` will now apply to ipv4 and ipv6 if only one listening address is given.
- `--discovery-port6` New flag to configure the individual udp port of ipv6 if listening over both ipv4 and ipv6.
- `--enr-udp-port` Updated docs to specify that it only applies to ipv4. This is an old behaviour.
- `--enr-udp6-port` Added to configure the enr udp6 field.
- `--enr-tcp-port` Updated docs to specify that it only applies to ipv4. This is an old behaviour.
- `--enr-tcp6-port` Added to configure the enr tcp6 field.
- `--enr-addresses` now can take two values.
- `--enr-match` updated behaviour.
- Common:
- rename `unused_port` functions to specify that they are over ipv4.
- add functions to get unused ports over ipv6.
- Testing binaries
- Updated code to reflect network config changes and unused_port changes.
## Additional Info
TODOs:
- use two sockets in discovery. I'll get back to this and it's on https://github.com/sigp/discv5/pull/160
- lcli allow listening over two sockets in generate_bootnodes_enr
- add at least one smoke flag for ipv6 (I have tested this and works for me)
- update the book
## Proposed Changes
Two tiny updates to satisfy Clippy 1.68
Plus refactoring of the `http_api` into less complex types so the compiler can chew and digest them more easily.
Co-authored-by: Michael Sproul <michael@sigmaprime.io>