## Overview
There are forked chains which get referenced by blocks and attestations on a network. Typically if these chains are very long, we stop looking up the chain and downvote the peer. In extreme circumstances, many peers are on many chains, the chains can be very deep and become time consuming performing lookups.
This PR adds a cache to known failed chain lookups. This prevents us from starting a parent-lookup (or stopping one half way through) if we have attempted the chain lookup in the past.
The changes are somewhat simple but should solve two issues:
- When quickly changing between chains once and a second time back again, batchIds would collide and cause havoc.
- If we got an out of range response from a peer, sync would remain in syncing but without advancing
Changes:
- remove the batch id. Identify each batch (inside a chain) by its starting epoch. Target epochs for downloading and processing now advance by EPOCHS_PER_BATCH
- for the same reason, move the "to_be_downloaded_id" to be an epoch
- remove a sneaky line that dropped an out of range batch without downloading it
- bonus: put the chain_id in the log given to the chain. This is why explicitly logging the chain_id is removed
## Issue Addressed
There is currently an issue with yamux when connecting to prysm peers. The source of the issue is currently unknown.
This PR removes yamux support to force mplex negotation. We can add back yamux support once we have isolated and corrected the issue.
## Issue Addressed
NA
## Proposed Changes
This PR commits the `Cargo.lock` file so it does not indicate a dirty git tree in the version tag. This code should be used for the `v0.2.3` release.
Also, adds a `Makefile` command to produce tarballs for upload on release.
## Additional Info
NA
## Issue Addressed
N/A
## Proposed Changes
Introduces the `GossipProcessor`, a multi-threaded (multi-tasked?), non-blocking processor for some messages from the network which require verification and import into the `BeaconChain`.
Initial testing indicates that this massively improves system stability by (a) moving block tasks from the normal executor (b) spreading out attestation load.
## Additional Info
TBC
## Issue Addressed
NA
## Proposed Changes
Adds support for using the [`cross`](https://github.com/rust-embedded/cross) project to produce cross-compiled binaries using Docker images.
Provides quite clean and simple cross-compiles cause all the complexity is hidden in Dockerfiles. It does require you to be in the `docker` group though.
## Details
- Adds shortcut commands to `Makefile`
- Ensures `reqwest` and `discv5` use vendored openssl libs (i.e., static not shared).
- Switches to a [commit](284f705964) of blst that has a renamed C function to avoid a collision with openssl (upstream issue: https://github.com/supranational/blst/issues/21).
- Updates `ring` to the latest satisfiable version, since an earlier version was causing issues with `cross`.
- Off-topic, but adds extra message about Windows support as suggested by Discord user.
## Additional Info
- ~~Blocked on #1495~~
- There are no tests in CI for this yet for a few reasons:
- I'm hesitant to add more long-running tasks.
- Short-term bitrot should be avoided since we'll use it each release.
- In the long term I think it would be good to automate binary creation on a release.
- I observed the binaries increase in size from 50mb to 52mb after these changes.
## Issue Addressed
#1483
## Proposed Changes
Upgrades the log to a critical if a listener fails. We are able to listen on many interfaces so a single instance is not critical. We should however gracefully shutdown the client if we have no listeners, although the client can still function solely on outgoing connections.
For now a critical is raised and I leave #1494 for more sophisticated handling of this.
This also updates discv5 to handle errors of binding to a UDP socket such that lighthouse is now able to handle them.
## Issue Addressed
Some nodes not following head, high CPU usage and HTTP API delays
## Proposed Changes
Patches gossipsub. Gossipsub was using an `lru_time_cache` to check for duplicates. This contained an `O(N)` lookup for every gossipsub message to update the time cache. This was causing high cpu usage and blocking network threads.
This PR introduces a custom cache without `O(N)` inserts.
This also adds built in safety mechanisms to prevent gossipsub from excessively retrying connections upon failure. A maximum limit is set after which we disconnect from the node from too many failed substream connections.
## Issue Addressed
NA
## Proposed Changes
- Moves the git-based versioning we were doing into the `lighthouse_version` crate in `common`.
- Removes the `beacon_node/version` crate, replacing it with `lighthouse_version`.
- Bumps the version to `v0.2.0`.
## Additional Info
There are now two types of version string:
1. `const VERSION: &str = Lighthouse/v0.2.0-1419501f2+`
1. `version_with_platform() = Lighthouse/v0.2.0-1419501f2+/x86_64-linux`
(1) is handy cause it's a `const` and shorter. (2) has platform info so it's more useful. Note that the plus-sign (`+`) indicates the the git commit is dirty (it used to be `(modified)` but I had to shorten it to fit into graffiti).
These version strings are now included on:
- `lighthouse --version`
- `lcli --version`
- `curl localhost:5052/node/version`
- p2p messages when we communicate our version
You can update the version by changing this constant (version is not related to a `Cargo.toml`):
b9ad7102d5/common/lighthouse_version/src/lib.rs (L4-L15)
## Issue Addressed
Sync was breaking occasionally. The root cause appears to be identify crashing as events we being sent to the protocol after nodes were banned. Have not been able to reproduce sync issues since this update.
## Proposed Changes
Only send messages to sub-behaviour protocols if the peer manager thinks the peer is connected. All other messages are dropped.
## Proposed Changes
In the continuing war against unportable binaries I figured we should have an option to enable building the Lighthouse binary itself with Milagro. This PR adds a `milagro` feature that can be used with `cargo install --path lighthouse --features milagro --force --locked`. The BLS library in-use will also show up under `lighthouse --version` like this:
```
Lighthouse 0.1.2-7d8acc20a(modified)
BLS Library: milagro
```
Future work: add other cool stuff like the compiler version and CPU target to `--version`.
## Issue Addressed
Closes#1395
## Proposed Changes
* Add a feature to `lighthouse` and `lcli` called `portable` which enables the `portable` feature on our fork of BLST. This feature turns off the `-march=native` C compiler flag that produces binaries highly targeted to the host CPU's instruction set.
* Tweak the `Makefile` so that when the `PORTABLE` environment variable is set to `true`, it compiles with this feature.
* Temporarily enable `PORTABLE=true` in the Docker build so that the image on Docker Hub is portable. Eventually I think we should enable `PORTABLE=true` _only on Docker Hub_, so that users building locally can take advantage of the tasty compiler magic. This seems to be possible by setting a Docker Hub environment variable: https://docs.docker.com/docker-hub/builds/#environment-variables-for-builds
## Additional Info
Tested by compiling on a very new CPU (Intel Core i7-8550U) and copying the binary to a very old CPU (Intel Core i3 530). Before the portability fix, this produced the SIGILL crash described in #1395, and after the fix, it worked smoothly.
I'm in the process of testing the Docker build and running some benches to confirm that the performance penalty isn't too severe.
## Issue Addressed
NA
## Proposed Changes
Allows for multiple "hardcoded" testnets.
## Additional Info
This PR is incomplete.
## TODO
- [x] Add flag to CLI, integrate with rest of Lighthouse.
Co-authored-by: Pawan Dhananjay <pawandhananjay@gmail.com>
Co-authored-by: Michael Sproul <michael@sigmaprime.io>
## Issue Addressed
N/A
## Proposed Changes
This provides a number of corrections and improvements to gossipsub. Specifically
- Enables options for greater privacy around the message author
- Provides greater flexibility on message validation
- Prevents unvalidated messages from being gossiped
- Shifts the duplicate cache to a time-based cache inside gossipsub
- Updates the message-id to handle bytes
- Bug fixes related to mesh maintenance and topic subscription. This should improve our attestation inclusion rate.
## Issue Addressed
NA
## Proposed Changes
- Refactor the `bls` crate to support multiple BLS "backends" (e.g., milagro, blst, etc).
- Removes some duplicate, unused code in `common/rest_types/src/validator.rs`.
- Removes the old "upgrade legacy keypairs" functionality (these were unencrypted keys that haven't been supported for a few testnets, no one should be using them anymore).
## Additional Info
Most of the files changed are just inconsequential changes to function names.
## TODO
- [x] Optimization levels
- [x] Infinity point: https://github.com/supranational/blst/issues/11
- [x] Ensure milagro *and* blst are tested via CI
- [x] What to do with unsafe code?
- [x] Test infinity point in signature sets
## Issue Addressed
#1112
The logic is slightly different but still valid wrt to error handling.
- Inbound state is either Busy with a future that return the subtream (and info about the processing)
- The state machine works as follows:
- `Idle` with pending responses => `Busy`
- `Busy` => finished ? if so and there are new pending responses then `Busy`, if not then `Idle`
=> not finished remains `Busy`
- Add an `InboundInfo` for readability
- Other stuff:
- Close inbound substreams when all expected responses are sent
- Remove the error variants from `RPCCodedResponse` and use the codes instead
- Fix various spelling mistakes because I got sloppy last time
Sorry for the delay
Co-authored-by: Age Manning <Age@AgeManning.com>
## Issue Addressed
NA
## Proposed Changes
- Introduces the `valdiator_definitions.yml` file which serves as an explicit list of validators that should be run by the validator client.
- Removes `--strict` flag, split into `--strict-lockfiles` and `--disable-auto-discover`
- Adds a "Validator Management" page to the book.
- Adds the `common/account_utils` crate which contains some logic that was starting to duplicate across the codebase.
The new docs for this feature are the best description of it (apart from the code, I guess): 9cb87e93ce/book/src/validator-management.md
## API Changes
This change should be transparent for *most* existing users. If the `valdiator_definitions.yml` doesn't exist then it will be automatically generated using a method that will detect all the validators in their `validators_dir`.
Users will have issues if they are:
1. Using `--strict`.
1. Have keystores in their `~/.lighthouse/validators` directory that weren't being detected by the current keystore discovery method.
For users with (1), the VC will refuse to start because the `--strict` flag has been removed. They will be forced to review `--help` and choose an equivalent flag.
For users with (2), this seems fairly unlikely and since we're only in testnets there's no *real* value on the line here. I'm happy to take the risk, it would be a different case for mainnet.
## Additional Info
This PR adds functionality we will need for #1347.
## TODO
- [x] Reconsider flags
- [x] Move doc into a more reasonable chapter.
- [x] Check for compile warnings.
## Issue Addressed
https://github.com/sigp/lighthouse/issues/1177
## Proposed Changes
Add a command line option (`--http-allow-origin`) and a config item for configuring the `Access-Control-Allow-Origin` response header. This should unblock making XMLHttpRequests.
Downgrades libp2p and the gossipsub updates.
This looks to resolve the CPU usage issue we have been seeing.
The root cause is likely inside the latest gossipsub updates, which will be addressed in a later PR
* Add latest commit info to git version
* Testing docker build
* Use fallback; modify format
* Revert "Testing docker build"
This reverts commit 197140d6973e22a6b891ebc0f1798d60c87b182a.
* Modify fallback to have crate version
* Process exits and slashings off the network
* Fix rest_api tests
* Add op verification tests
* Add tests for pruning of slashings in the op pool
* Address Paul's review comments