Commit Graph

167 Commits

Author SHA1 Message Date
Age Manning
13cb642f39
Update boot-node and discovery (#1682)
* Improve boot_node and upgrade discovery

* Clippy lints
2020-09-29 18:28:29 +10:00
Paul Hauner
f1180a8947 Prepare for v0.2.12 (#1672)
## Issue Addressed

NA

## Proposed Changes

- Bump versions
- Run cargo update

## Additional Info

NA
2020-09-26 06:35:45 +00:00
Age Manning
28b6d921c6 Remove banned peers from DHT and track IPs (#1656)
## Issue Addressed

#629 

## Proposed Changes

This removes banned peers from the DHT and informs discovery to block the node_id and the known source IP's associated with this node. It has the capabilities of un banning this peer after a period of time. 

This also corrects the logic about banning specific IP addresses. We now use seen_ip addresses from libp2p rather than those sent to us via identify (which also include local addresses).
2020-09-25 01:52:39 +00:00
Michael Sproul
62c8548ed0 Revert "Update BLST, add force-adx support (#1595)" (#1649)
This reverts commit 4fca306397.

Something in the BLST update is causing SIGILLs on aarch64 non-portable builds. While we debug the issue, I think it's best if we just revert the update.
2020-09-23 00:25:56 +00:00
Michael Sproul
4fca306397 Update BLST, add force-adx support (#1595)
## Issue Addressed

Closes #1504
Closes https://github.com/sigp/lighthouse/issues/1505

## Proposed Changes

* Update `blst` to the latest version, which is more portable and includes finer-grained compilation controls (see below).
* Detect the case where a binary has been explicitly compiled with ADX support but it's missing at runtime, and report a nicer error than `SIGILL`.

## Known Issues

* None. The previous issue with `make build-aarch64` (https://github.com/supranational/blst/issues/27), has been resolved.

## Additional Info

I think we should tweak our release process and our Docker builds so that we provide two options:

Binaries:

* `lighthouse`: compiled with `modern`/`force-adx`, for CPUs 2013 and newer
* `lighthouse-portable`: compiled with `portable` for older CPUs

Docker images:

* `sigp/lighthouse:latest`: multi-arch image with `modern` x86_64 and vanilla aarch64 binary
* `sigp/lighthouse:latest-portable`: multi-arch image with `portable` builds for x86_64 and aarch64

And relevant Docker images for the releases (as per https://github.com/sigp/lighthouse/pull/1574#issuecomment-687766141), tagged `v0.x.y` and `v0.x.y-portable`
2020-09-22 05:40:02 +00:00
Paul Hauner
d85d5a435e Bump to v0.2.11 (#1645)
## Issue Addressed

NA

## Proposed Changes

- Bump version to v0.2.11
- Run `cargo update`.


## Additional Info

NA
2020-09-22 04:45:15 +00:00
Michael Sproul
5d17eb899f Update LevelDB to v0.8.6, removing patch (#1636)
Removes our dependency on a fork of LevelDB now that https://github.com/skade/leveldb-sys/pull/17 is merged
2020-09-21 11:53:53 +00:00
Paul Hauner
371e1c1d5d Bump version to v0.2.10 (#1630)
## Issue Addressed

NA

## Proposed Changes

Bump crate version so we can cut a new release with the fix from #1629.

## Additional Info

NA
2020-09-18 06:41:29 +00:00
Daniel Schonfeld
810de2f8b7 Static testnet configs (#1603)
## Issue Addressed

#1431 

## Proposed Changes

Added an archived zip file with required files manually

## Additional Info

1) Used zip, instead of tar.gz to add a single dependency instead of two.
2) I left the download from github code for now, waiting to hear if you'd like it cleaned up or left to be used for some tooling needs.
2020-09-11 01:43:13 +00:00
Age Manning
d79366c503 Prevent printing binary in RPC errors (#1604)
## Issue Addressed

#1566 

## Proposed Changes

Prevents printing binary characters in the RPC error response from peers.
2020-09-10 04:43:22 +00:00
Paul Hauner
0821e6b39f Bump version to v0.2.9 (#1598)
## Issue Addressed

NA

## Proposed Changes

- Bump version tags
- Run `cargo update`

## Additional Info

NA
2020-09-09 02:28:35 +00:00
Michael Sproul
19be7abfd2 Don't quote slot and epoch, for now (#1597)
Fixes a breaking change to our API that was unnecessary and can wait until #1569 is merged
2020-09-08 02:12:36 +00:00
Daniel Schonfeld
2a9a815f29 conforming to the p2p specs, requiring error_messages to be bound (#1593)
## Issue Addressed

#1421 

## Proposed Changes

Bounding the error_message that can be returned for RPC domain errors


Co-authored-by: Age Manning <Age@AgeManning.com>
2020-09-07 06:47:05 +00:00
Age Manning
a6376b4585 Update discv5 to v10 (#1592)
## Issue Addressed

Code improvements, dependency improvements and better async handling.
2020-09-07 05:53:20 +00:00
Michael Sproul
74fa87aa98 Add serde_utils module with quoted u64 support (#1588)
## Proposed Changes

This is an extraction of the quoted int code from #1569, that I've come to rely on for #1544.

It allows us to parse integers from serde strings in YAML, JSON, etc. The main differences from the code in Paul's original PR are:

* Added a submodule that makes quoting mandatory (`require_quotes`).
* Decoding is generic over the type `T` being decoded. You can use `#[serde(with = "serde_utils::quoted_u64::require_quotes")]` on `Epoch` and `Slot` fields (this is what I do in my slashing protection PR).

I've turned on quoting for `Epoch` and `Slot` in this PR, but will leave the other `types` changes to you Paul.

I opted to put everything in the `conseus/serde_utils` module so that BLS can use it without a circular dependency. In future when we want to publish `types` I think we could publish `serde_utils` as `lighthouse_serde_utils` or something. Open to other ideas on this front too.
2020-09-07 01:03:53 +00:00
Sean
638daa87fe Avoid Printing Binary String to Logs (#1576)
Converts the graffiti binary data to string before printing to logs.

## Issue Addressed

#1566 

## Proposed Changes
Rather than converting graffiti to a vector the binary data less the last character is passed to String::from_utf_lossy(). This then allows us to call the to_string() function directly to give us the string

## Additional Info

Rust skills are fairly weak
2020-09-05 05:46:25 +00:00
Age Manning
fb9d828e5e Extended Gossipsub metrics (#1577)
## Issue Addressed

N/A

## Proposed Changes

Adds extended metrics to get a better idea of what is happening at the gossipsub layer of lighthouse. This provides information about mesh statistics per topics, subscriptions and peer scores. 

## Additional Info
2020-09-01 06:59:14 +00:00
Pawan Dhananjay
adea7992f8 Eth1 network exit on wrong network id (#1563)
## Issue Addressed

Fixes #1509 

## Proposed Changes

Exit the beacon node if the eth1 endpoint points to an invalid eth1 network. Check the network id before every eth1 cache update and display an error log if the network id has changed to an invalid one.
2020-08-31 02:36:17 +00:00
blacktemplar
c18d37c202 Use Gossipsub 1.1 (#1516)
## Issue Addressed

#1172

## Proposed Changes

* updates the libp2p dependency
* small adaptions based on changes in libp2p
* report not just valid messages but also invalid and distinguish between `IGNORE`d messages and `REJECT`ed messages


Co-authored-by: Age Manning <Age@AgeManning.com>
2020-08-30 13:06:50 +00:00
Paul Hauner
967700c1ff Bump version to v0.2.8 (#1572)
## Issue Addressed

NA

## Proposed Changes

- Bump versions
- Run `cargo update`

## Additional Info

NA
2020-08-27 07:04:12 +00:00
Adam Szkoda
d9f4819fe0 Alternative (to BeaconChainHarness) BeaconChain testing API (#1380)
The PR:

* Adds the ability to generate a crucial test scenario that isn't possible with `BeaconChainHarness` (i.e. two blocks occupying the same slot; previously forks necessitated skipping slots):

![image](https://user-images.githubusercontent.com/165678/88195404-4bce3580-cc40-11ea-8c08-b48d2e1d5959.png)

* New testing API: Instead of repeatedly calling add_block(), you generate a sorted `Vec<Slot>` and leave it up to the framework to generate blocks at those slots.
* Jumping backwards to an earlier epoch is a hard error, so that tests necessarily generate blocks in a epoch-by-epoch manner.
* Configures the test logger so that output is printed on the console in case a test fails.  The logger also plays well with `--nocapture`, contrary to the existing testing framework
* Rewrites existing fork pruning tests to use the new API
* Adds a tests that triggers finalization at a non epoch boundary slot
* Renamed `BeaconChainYoke` to `BeaconChainTestingRig` because the former has been too confusing
* Fixed multiple tests (e.g. `block_production_different_shuffling_long`, `delete_blocks_and_states`, `shuffling_compatible_simple_fork`) that relied on a weird (and accidental) feature of the old `BeaconChainHarness` that attestations aren't produced for epochs earlier than the current one, thus masking potential bugs in test cases.

Co-authored-by: Michael Sproul <michael@sigmaprime.io>
2020-08-26 09:24:55 +00:00
Michael Sproul
4763f03dcc Fix bug in database pruning (#1564)
## Issue Addressed

Closes #1488

## Proposed Changes

* Prevent the pruning algorithm from over-eagerly deleting states at skipped slots when they are shared with the canonical chain.
* Add `debug` logging to the pruning algorithm so we have so better chance of debugging future issues from logs.
* Modify the handling of the "finalized state" in the beacon chain, so that it's always the state at the first slot of the finalized epoch (previously it was the state at the finalized block). This gives database pruning a clearer and cleaner view of things, and will marginally impact the pruning of the op pool, observed proposers, etc (in ways that are safe as far as I can tell).
* Remove duplicated `RevertedFinalizedEpoch` check from `after_finalization`
* Delete useless and unused `max_finality_distance`
* Add tests that exercise pruning with shared states at skip slots
* Delete unnecessary `block_strategy` argument from `add_blocks` and friends in the test harness (will likely conflict with #1380 slightly, sorry @adaszko -- but we can fix that)
* Bonus: add a `BeaconChain::with_head` method. I didn't end up needing it, but it turned out quite nice, so I figured we could keep it?

## Additional Info

Any users who have experienced pruning errors on Medalla will need to resync after upgrading to a release including this change. This should end unbounded `chain_db` growth! 🎉
2020-08-26 00:01:06 +00:00
Paul Hauner
3569506acd Remove rayon from rest_api (#1562)
## Issue Addressed

NA

## Proposed Changes

Addresses a deadlock condition described here: https://hackmd.io/ijQlqOdqSGaWmIo6zMVV-A?view

## Additional Info

NA
2020-08-24 07:28:54 +00:00
Paul Hauner
c895dc8971 Shift HTTP server heavy-lifting to blocking executor (#1518)
## Issue Addressed

NA

## Proposed Changes

Shift practically all HTTP endpoint handlers to the blocking executor (some very light tasks are left on the core executor).

## Additional Info

This PR covers the `rest_api` which will soon be refactored to suit the standard API. As such, I've cut a few corners and left some existing issues open in this patch. What I have done here should leave the API in state that is not necessary *exactly* the same, but good enough for us to run validators with. Specifically, the number of blocking workers that can be spawned is unbounded and I have not implemented a queue; this will need to be fixed when we implement the standard API.
2020-08-24 03:06:10 +00:00
blacktemplar
2bc9115a94 reuse beacon_node methods for initializing network configs in boot_node (#1520)
## Issue Addressed

#1378

## Proposed Changes

Boot node reuses code from beacon_node to initialize network config. This also enables using the network directory to store/load the enr and the private key.

## Additional Info

Note that before this PR the port cli arguments were off (the argument was named `enr-port` but used as `boot-node-enr-port`).
Therefore as port always the cli port argument was used (for both enr and listening). Now the enr-port argument can be used to overwrite the listening port as the public port others should connect to.

Last but not least note, that this restructuring reuses `ethlibp2p::NetworkConfig` that has many more options than the ones used in the boot node. For example the network config has an own `discv5_config` field that gets never used in the boot node and instead another `Discv5Config` gets created later in the boot node process.

Co-authored-by: Age Manning <Age@AgeManning.com>
2020-08-21 12:00:01 +00:00
Paul Hauner
ebb25b5569 Bump version to v0.2.6 (#1549)
## Issue Addressed

NA

## Proposed Changes

See title.

## Additional Info

NA
2020-08-19 09:31:01 +00:00
Age Manning
33b2a3d0e0 Version bump to v0.2.5 (#1540)
## Description

Version bumps lighthouse to v0.2.5
2020-08-18 11:23:08 +00:00
Age Manning
3bb30754d9 Keep track of failed head chains and prevent re-lookups (#1534)
## Overview

There are forked chains which get referenced by blocks and attestations on a network. Typically if these chains are very long, we stop looking up the chain and downvote the peer. In extreme circumstances, many peers are on many chains, the chains can be very deep and become time consuming performing lookups. 

This PR adds a cache to known failed chain lookups. This prevents us from starting a parent-lookup (or stopping one half way through) if we have attempted the chain lookup in the past.
2020-08-18 03:54:09 +00:00
divma
46dbf027af Do not reset batch ids & redownload out of range batches (#1528)
The changes are somewhat simple but should solve two issues:
- When quickly changing between chains once and a second time back again, batchIds would collide and cause havoc. 
- If we got an out of range response from a peer, sync would remain in syncing but without advancing

Changes:
- remove the batch id. Identify each batch (inside a chain) by its starting epoch. Target epochs for downloading and processing now advance by EPOCHS_PER_BATCH
- for the same reason, move the "to_be_downloaded_id" to be an epoch
- remove a sneaky line that dropped an out of range batch without downloading it
- bonus: put the chain_id in the log given to the chain. This is why explicitly logging the chain_id is removed
2020-08-18 01:29:51 +00:00
Paul Hauner
9a97a0b14f Prepare for v0.2.4 (#1533)
## Issue Addressed

NA

## Proposed Changes

NA

## Additional Info

NA
2020-08-17 12:13:42 +00:00
Paul Hauner
a58aa6ee55 Revert back to discv5 alpha 8 to maintain ARM support (#1531)
## Issue Addressed

NA

## Proposed Changes

See title.

## Additional Info

NA
2020-08-17 10:06:08 +00:00
Age Manning
3c689a6837 Remove yamux support (#1526)
## Issue Addressed

There is currently an issue with yamux when connecting to prysm peers. The source of the issue is currently unknown. 

This PR removes yamux support to force mplex negotation. We can add back yamux support once we have isolated and corrected the issue.
2020-08-17 05:05:06 +00:00
Age Manning
afdc4fea1d Correct logic for peer sync identification (#1525)
Fix a small sync bug which can mis-classify newly connected peers.
2020-08-17 03:00:10 +00:00
Paul Hauner
6aeb896480 Commit Cargo.lock changes, add build scripts (#1521)
## Issue Addressed

NA

## Proposed Changes

This PR commits the `Cargo.lock` file so it does not indicate a dirty git tree in the version tag. This code should be used for the `v0.2.3` release.

Also, adds a `Makefile` command to produce tarballs for upload on release.

## Additional Info

NA
2020-08-14 22:24:27 +00:00
Paul Hauner
b0a3731fff Introduce a queue for attestations from the network (#1511)
## Issue Addressed

N/A

## Proposed Changes

Introduces the `GossipProcessor`, a multi-threaded (multi-tasked?), non-blocking processor for some messages from the network which require verification and import into the `BeaconChain`.

Initial testing indicates that this massively improves system stability by (a) moving block tasks from the normal executor (b) spreading out attestation load.

## Additional Info

TBC
2020-08-14 04:38:45 +00:00
Paul Hauner
b063df5bf9 Cross-compile to vendored x86_84, aarch64 (Raspberry Pi 4) (#1497)
## Issue Addressed

NA

## Proposed Changes

Adds support for using the [`cross`](https://github.com/rust-embedded/cross) project to produce cross-compiled binaries using Docker images.

Provides quite clean and simple cross-compiles cause all the complexity is hidden in Dockerfiles. It does require you to be in the `docker` group though.

## Details

- Adds shortcut commands to `Makefile`
- Ensures `reqwest` and `discv5` use vendored openssl libs (i.e., static not shared).
- Switches to a [commit](284f705964) of blst that has a renamed C function to avoid a collision with openssl (upstream issue: https://github.com/supranational/blst/issues/21).
- Updates `ring` to the latest satisfiable version, since an earlier version was causing issues with `cross`.
- Off-topic, but adds extra message about Windows support as suggested by Discord user.

## Additional Info

- ~~Blocked on #1495~~
- There are no tests in CI for this yet for a few reasons:
  - I'm hesitant to add more long-running tasks.
  - Short-term bitrot should be avoided since we'll use it each release.
  - In the long term I think it would be good to automate binary creation on a release.
- I observed the binaries increase in size from 50mb to 52mb after these changes.
2020-08-11 05:16:30 +00:00
Age Manning
134676fd6f Version bump to v0.2.2 (#1496)
Version bump to v0.2.2
2020-08-10 06:49:03 +00:00
Age Manning
cbfae87aa6 Upgrade logs (#1495)
## Issue Addressed

#1483 

## Proposed Changes

Upgrades the log to a critical if a listener fails. We are able to listen on many interfaces so a single instance is not critical. We should however gracefully shutdown the client if we have no listeners, although the client can still function solely on outgoing connections.

For now a critical is raised and I leave #1494 for more sophisticated handling of this. 

This also updates discv5 to handle errors of binding to a UDP socket such that lighthouse is now able to handle them.
2020-08-10 05:19:51 +00:00
Age Manning
04e4389efe Patch gossipsub (#1490)
## Issue Addressed

Some nodes not following head, high CPU usage and HTTP API delays

## Proposed Changes

Patches gossipsub. Gossipsub was using an `lru_time_cache` to check for duplicates. This contained an `O(N)` lookup for every gossipsub message to update the time cache. This was causing high cpu usage and blocking network threads. 

This PR introduces a custom cache without `O(N)` inserts. 

This also adds built in safety mechanisms to prevent gossipsub from excessively retrying connections upon failure. A maximum limit is set after which we disconnect from the node from too many failed substream connections.
2020-08-08 08:09:04 +00:00
Age Manning
a1f9769040 Libp2p update (#1482)
Updates to latest libp2p master. 

This now has native noise support. 

This PR
- Removes secio support
- Prioritises mplex over yamux
2020-08-08 02:17:32 +00:00
Age Manning
09a615b2c0 Lighthouse crate v0.2.0 bump (#1450)
## Description

This PR marks Lighthouse v0.2.0. 

This release marks the stable version of Lighthouse, ready for the approaching Medalla testnet.
2020-08-06 03:43:05 +00:00
Paul Hauner
f26adc0a36 Lighthouse v0.2.0 (Medalla) (#1452)
## Issue Addressed

NA

## Proposed Changes

- Moves the git-based versioning we were doing into the `lighthouse_version` crate in `common`.
- Removes the `beacon_node/version` crate, replacing it with `lighthouse_version`.
- Bumps the version to `v0.2.0`.

## Additional Info

There are now two types of version string:

1. `const VERSION: &str = Lighthouse/v0.2.0-1419501f2+`
1. `version_with_platform() = Lighthouse/v0.2.0-1419501f2+/x86_64-linux`

(1) is handy cause it's a `const` and shorter. (2) has platform info so it's more useful. Note that the plus-sign (`+`) indicates the the git commit is dirty (it used to be `(modified)` but I had to shorten it to fit into graffiti).

These version strings are now included on:

- `lighthouse --version`
- `lcli --version`
- `curl localhost:5052/node/version`
- p2p messages when we communicate our version

You can update the version by changing this constant (version is not related to a `Cargo.toml`):

b9ad7102d5/common/lighthouse_version/src/lib.rs (L4-L15)
2020-08-04 07:44:53 +00:00
Paul Hauner
d4dd25883f Update sigp/blst commit (#1454)
## Issue Addressed

NA

## Proposed Changes

Merges `blst/master` into our `sigp/portable` branch.

## Additional Info

NA
2020-08-04 06:20:09 +00:00
Age Manning
f634f073a8 Correct issue with network message passing (#1439)
## Issue Addressed

Sync was breaking occasionally. The root cause appears to be identify crashing as events we being sent to the protocol after nodes were banned. Have not been able to reproduce sync issues since this update. 

## Proposed Changes

Only send messages to sub-behaviour protocols if the peer manager thinks the peer is connected. All other messages are dropped.
2020-08-03 09:35:53 +00:00
Michael Sproul
3ea01ac26b Add top-level feature to enable Milagro (#1428)
## Proposed Changes

In the continuing war against unportable binaries I figured we should have an option to enable building the Lighthouse binary itself with Milagro. This PR adds a `milagro` feature that can be used with `cargo install --path lighthouse --features milagro --force --locked`. The BLS library in-use will also show up under `lighthouse --version` like this:

```
Lighthouse 0.1.2-7d8acc20a(modified)
BLS Library: milagro
```

Future work: add other cool stuff like the compiler version and CPU target to `--version`.
2020-08-01 05:52:55 +00:00
Michael Sproul
7d8acc20a0 Add a flag to make lighthouse portable across machines (#1423)
## Issue Addressed

Closes #1395

## Proposed Changes

* Add a feature to `lighthouse` and `lcli` called `portable` which enables the `portable` feature on our fork of BLST. This feature turns off the `-march=native` C compiler flag that produces binaries highly targeted to the host CPU's instruction set.
* Tweak the `Makefile` so that when the `PORTABLE` environment variable is set to `true`, it compiles with this feature.
* Temporarily enable `PORTABLE=true` in the Docker build so that the image on Docker Hub is portable. Eventually I think we should enable `PORTABLE=true` _only on Docker Hub_, so that users building locally can take advantage of the tasty compiler magic. This seems to be possible by setting a Docker Hub environment variable: https://docs.docker.com/docker-hub/builds/#environment-variables-for-builds

## Additional Info

Tested by compiling on a very new CPU (Intel Core i7-8550U) and copying the binary to a very old CPU (Intel Core i3 530). Before the portability fix, this produced the SIGILL crash described in #1395, and after the fix, it worked smoothly.

I'm in the process of testing the Docker build and running some benches to confirm that the performance penalty isn't too severe.
2020-07-31 05:00:39 +00:00
Age Manning
a37e75f44b
Downgrade sync and rpc warn logs (#1417)
* Downgrade sycn and rpc warn logs

* Correct warning
2020-07-30 13:52:44 +10:00
Paul Hauner
36d3d37cb4 Add support for multiple testnet flags (#1396)
## Issue Addressed

NA

## Proposed Changes

Allows for multiple "hardcoded" testnets.

## Additional Info

This PR is incomplete.

## TODO

- [x] Add flag to CLI, integrate with rest of Lighthouse.


Co-authored-by: Pawan Dhananjay <pawandhananjay@gmail.com>
Co-authored-by: Michael Sproul <michael@sigmaprime.io>
2020-07-29 06:39:29 +00:00
Paul Hauner
eaa9f9744f Add EF launchpad import (#1381)
## Issue Addressed

NA

## Proposed Changes

Adds an integration for keys generated via https://github.com/ethereum/eth2.0-deposit (In reality keys are *actually* generated here: https://github.com/ethereum/eth2.0-deposit-cli).

## Additional Info

NA

## TODO

- [x] Docs
- [x] Tests

Co-authored-by: Michael Sproul <michael@sigmaprime.io>
2020-07-29 04:32:50 +00:00
Age Manning
ba0f3daf9d Gossipsub update (#1400)
## Issue Addressed

N/A

## Proposed Changes

This provides a number of corrections and improvements to gossipsub. Specifically
- Enables options for greater privacy around the message author
- Provides greater flexibility on message validation
- Prevents unvalidated messages from being gossiped
- Shifts the duplicate cache to a time-based cache inside gossipsub
- Updates the message-id to handle bytes
- Bug fixes related to mesh maintenance and topic subscription. This should improve our attestation inclusion rate.
2020-07-29 03:40:22 +00:00