Commit Graph

280 Commits

Author SHA1 Message Date
Age Manning
df40700ddd Rename eth2_libp2p to lighthouse_network (#2702)
## Description

The `eth2_libp2p` crate was originally named and designed to incorporate a simple libp2p integration into lighthouse. Since its origins the crates purpose has expanded dramatically. It now houses a lot more sophistication that is specific to lighthouse and no longer just a libp2p integration. 

As of this writing it currently houses the following high-level lighthouse-specific logic:
- Lighthouse's implementation of the eth2 RPC protocol and specific encodings/decodings
- Integration and handling of ENRs with respect to libp2p and eth2
- Lighthouse's discovery logic, its integration with discv5 and logic about searching and handling peers. 
- Lighthouse's peer manager - This is a large module handling various aspects of Lighthouse's network, such as peer scoring, handling pings and metadata, connection maintenance and recording, etc.
- Lighthouse's peer database - This is a collection of information stored for each individual peer which is specific to lighthouse. We store connection state, sync state, last seen ips and scores etc. The data stored for each peer is designed for various elements of the lighthouse code base such as syncing and the http api.
- Gossipsub scoring - This stores a collection of gossipsub 1.1 scoring mechanisms that are continuously analyssed and updated based on the ethereum 2 networks and how Lighthouse performs on these networks.
- Lighthouse specific types for managing gossipsub topics, sync status and ENR fields
- Lighthouse's network HTTP API metrics - A collection of metrics for lighthouse network monitoring
- Lighthouse's custom configuration of all networking protocols, RPC, gossipsub, discovery, identify and libp2p. 

Therefore it makes sense to rename the crate to be more akin to its current purposes, simply that it manages the majority of Lighthouse's network stack. This PR renames this crate to `lighthouse_network`

Co-authored-by: Paul Hauner <paul@paulhauner.com>
2021-10-19 00:30:39 +00:00
Paul Hauner
fff01b24dd Release v2.0.1 (#2726)
## Issue Addressed

NA

## Proposed Changes

- Update versions to `v2.0.1` in anticipation for a release early next week.
- Add `--ignore` to `cargo audit`. See #2727.

## Additional Info

NA
2021-10-18 03:08:32 +00:00
Mac L
a73d698e30 Add TLS capability to the beacon node HTTP API (#2668)
Currently, the beacon node has no ability to serve the HTTP API over TLS.
Adding this functionality would be helpful for certain use cases, such as when you need a validator client to connect to a backup beacon node which is outside your local network, and the use of an SSH tunnel or reverse proxy would be inappropriate.

## Proposed Changes

- Add three new CLI flags to the beacon node
  - `--http-enable-tls`: enables TLS
  - `--http-tls-cert`: to specify the path to the certificate file
  - `--http-tls-key`: to specify the path to the key file
- Update the HTTP API to optionally use `warp`'s [`TlsServer`](https://docs.rs/warp/0.3.1/warp/struct.TlsServer.html) depending on the presence of the `--http-enable-tls` flag
- Update tests and docs
- Use a custom branch for `warp` to ensure proper error handling

## Additional Info

Serving the API over TLS should currently be considered experimental. The reason for this is that it uses code from an [unmerged PR](https://github.com/seanmonstar/warp/pull/717). This commit provides the `try_bind_with_graceful_shutdown` method to `warp`, which is helpful for controlling error flow when the TLS configuration is invalid (cert/key files don't exist, incorrect permissions, etc). 
I've implemented the same code in my [branch here](https://github.com/macladson/warp/tree/tls).

Once the code has been reviewed and merged upstream into `warp`, we can remove the dependency on my branch and the feature can be considered more stable.

Currently, the private key file must not be password-protected in order to be read into Lighthouse.
2021-10-12 03:35:49 +00:00
Michael Sproul
708557a473 Fix cargo audit warns for nix, psutil, time (#2699)
## Issue Addressed

Fix `cargo audit` failures on `unstable`

Closes #2698

## Proposed Changes

The main culprit is `nix`, which is vulnerable for versions below v0.23.0. We can't get by with a straight-forward `cargo update` because `psutil` depends on an old version of `nix` (cf. https://github.com/rust-psutil/rust-psutil/pull/93). Hence I've temporarily forked `psutil` under the `sigp` org, where I've included the update to `nix` v0.23.0.

Additionally, I took the chance to update the `time` dependency to v0.3, which removed a bunch of stale deps including `stdweb` which is no longer maintained. Lighthouse only uses the `time` crate in the notifier to do some pretty printing, and so wasn't affected by any of the breaking changes in v0.3 ([changelog here](https://github.com/time-rs/time/blob/main/CHANGELOG.md#030-2021-07-30)).
2021-10-11 00:10:35 +00:00
Michael Sproul
229542cd6c Avoid negative values in malloc_utils metrics (#2692)
## Proposed Changes

While investigating memory usage I noticed that the malloc metrics were going negative once they passed 2GiB. This is because the underlying `mallinfo` function returns a `i32`, and we were casting it straight to an `i64`, preserving the sign.

The long-term fix will be to move to `mallinfo2`, but it's still not yet widely available.
2021-10-11 00:10:34 +00:00
Wink Saville
58870fc6d3 Add test_logger as feature to logging (#2586)
## Issue Addressed

Fix #2585

## Proposed Changes

Provide a canonical version of test_logger that can be used
throughout lighthouse.

## Additional Info

This allows tests to conditionally emit logging data by adding
test_logger as the default logger. And then when executing
`cargo test --features logging/test_logger` log output
will be visible:

  wink@3900x:~/lighthouse/common/logging/tests/test-feature-test_logger (Add-test_logger-as-feature-to-logging)
  $ cargo test --features logging/test_logger
      Finished test [unoptimized + debuginfo] target(s) in 0.02s
       Running unittests (target/debug/deps/test_logger-e20115db6a5e3714)

  running 1 test
  Sep 10 12:53:45.212 INFO hi, module: test_logger:8
  test tests::test_fn_with_logging ... ok

  test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s

     Doc-tests test-logger

  running 0 tests

  test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s

Or, in normal scenarios where logging isn't needed, executing
`cargo test` the log output will not be visible:

  wink@3900x:~/lighthouse/common/logging/tests/test-feature-test_logger (Add-test_logger-as-feature-to-logging)
  $ cargo test
      Finished test [unoptimized + debuginfo] target(s) in 0.02s
       Running unittests (target/debug/deps/test_logger-02e02f8d41e8cf8a)

  running 1 test
  test tests::test_fn_with_logging ... ok

  test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s

     Doc-tests test-logger

  running 0 tests

  test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s
2021-10-06 00:46:07 +00:00
Michael Sproul
7c88f582d9 Release v2.0.0 (#2673)
## Proposed Changes

* Bump version to v2.0.0
* Update dependencies (obsoletes #2670). `tokio-macros` v1.4.0 had been yanked due to a bug.
2021-10-05 03:53:18 +00:00
Michael Sproul
ea78315749 Release v2.0.0-rc.0 (#2634)
## Proposed Changes

Cut the first release candidate for v2.0.0, in preparation for testing and release this week

## Additional Info

Builds on #2632, which should either be merged first or in the same batch
2021-10-01 01:23:55 +00:00
Squirrel
db4d72c4f1 Remove unused deps (#2592)
Found some deps you're possibly not using.

Please shout if you think they are indeed still needed.
2021-09-30 04:31:42 +00:00
Mac L
4c510f8f6b Add BlockTimesCache to allow additional block delay metrics (#2546)
## Issue Addressed

Closes #2528

## Proposed Changes

- Add `BlockTimesCache` to provide block timing information to `BeaconChain`. This allows additional metrics to be calculated for blocks that are set as head too late.
- Thread the `seen_timestamp` of blocks received from RPC responses (except blocks from syncing) through to the sync manager, similar to what is done for blocks from gossip.

## Additional Info

This provides the following additional metrics:
- `BEACON_BLOCK_OBSERVED_SLOT_START_DELAY_TIME`
  - The delay between the start of the slot and when the block was first observed.
- `BEACON_BLOCK_IMPORTED_OBSERVED_DELAY_TIME`
   - The delay between when the block was first observed and when the block was imported.
- `BEACON_BLOCK_HEAD_IMPORTED_DELAY_TIME`
  - The delay between when the block was imported and when the block was set as head.

The metric `BEACON_BLOCK_IMPORTED_SLOT_START_DELAY_TIME` was removed.

A log is produced when a block is set as head too late, e.g.:
```
Aug 27 03:46:39.006 DEBG Delayed head block                      set_as_head_delay: Some(21.731066ms), imported_delay: Some(119.929934ms), observed_delay: Some(3.864596988s), block_delay: 4.006257988s, slot: 1931331, proposer_index: 24294, block_root: 0x937602c89d3143afa89088a44bdf4b4d0d760dad082abacb229495c048648a9e, service: beacon
```
2021-09-30 04:31:41 +00:00
Michael Sproul
e895074ba9 Activate Altair on mainnet at epoch 74240 (#2632)
## Proposed Changes

Schedule Altair on mainnet for epoch 74240 as per https://github.com/ethereum/consensus-specs/pull/2625

This puts the date for Altair as Wed Oct 27 2021 10:56:23 GMT+0000
2021-09-27 04:22:06 +00:00
realbigsean
113ef74ef6 Add contribution and proof event (#2527)
## Issue Addressed

N/A

## Proposed Changes

Add the new ContributionAndProof event: https://github.com/ethereum/beacon-APIs/pull/158

## Additional Info

N/A

Co-authored-by: realbigsean <seananderson33@gmail.com>
2021-09-25 07:53:58 +00:00
Paul Hauner
924a1345b1 Update zeroize_derive (#2625)
## Issue Addressed

NA

## Proposed Changes

As `cargo audit` astutely pointed out, the version of `zeroize_derive` were were using had a vulnerability:

```
Crate:         zeroize_derive
Version:       1.1.0
Title:         `#[zeroize(drop)]` doesn't implement `Drop` for `enum`s
Date:          2021-09-24
ID:            RUSTSEC-2021-0115
URL:           https://rustsec.org/advisories/RUSTSEC-2021-0115
Solution:      Upgrade to >=1.2.0
```

This PR updates `zeroize` and `zeroize_derive` to appease `cargo audit`.

`tiny-bip39` was also updated to allow compile.

## Additional Info

I don't believe this vulnerability actually affected the Lighthouse code-base directly. However, `tiny-bip39` may have been affected which may have resulted in some uncleaned memory in Lighthouse. Whilst this is not ideal, it's not a major issue. Zeroization is a nice-to-have since it only protects from sophisticated attacks or attackers that already have a high level of access already.
2021-09-25 05:58:37 +00:00
Paul Hauner
fe52322088 Implement SSZ union type (#2579)
## Issue Addressed

NA

## Proposed Changes

Implements the "union" type from the SSZ spec for `ssz`, `ssz_derive`, `tree_hash` and `tree_hash_derive` so it may be derived for `enums`:

https://github.com/ethereum/consensus-specs/blob/v1.1.0-beta.3/ssz/simple-serialize.md#union

The union type is required for the merge, since the `Transaction` type is defined as a single-variant union `Union[OpaqueTransaction]`.

### Crate Updates

This PR will (hopefully) cause CI to publish new versions for the following crates:

- `eth2_ssz_derive`: `0.2.1` -> `0.3.0`
- `eth2_ssz`: `0.3.0` -> `0.4.0`
- `eth2_ssz_types`: `0.2.0` -> `0.2.1`
- `tree_hash`: `0.3.0` -> `0.4.0`
- `tree_hash_derive`: `0.3.0` -> `0.4.0`

These these crates depend on each other, I've had to add a workspace-level `[patch]` for these crates. A follow-up PR will need to remove this patch, ones the new versions are published.

### Union Behaviors

We already had SSZ `Encode` and `TreeHash` derive for enums, however it just did a "transparent" pass-through of the inner value. Since the "union" decoding from the spec is in conflict with the transparent method, I've required that all `enum` have exactly one of the following enum-level attributes:

#### SSZ

-  `#[ssz(enum_behaviour = "union")]`
    - matches the spec used for the merge
-  `#[ssz(enum_behaviour = "transparent")]`
    - maintains existing functionality
    - not supported for `Decode` (never was)
    
#### TreeHash

-  `#[tree_hash(enum_behaviour = "union")]`
    - matches the spec used for the merge
-  `#[tree_hash(enum_behaviour = "transparent")]`
    - maintains existing functionality

This means that we can maintain the existing transparent behaviour, but all existing users will get a compile-time error until they explicitly opt-in to being transparent.

### Legacy Option Encoding

Before this PR, we already had a union-esque encoding for `Option<T>`. However, this was with the *old* SSZ spec where the union selector was 4 bytes. During merge specification, the spec was changed to use 1 byte for the selector.

Whilst the 4-byte `Option` encoding was never used in the spec, we used it in our database. Writing a migrate script for all occurrences of `Option` in the database would be painful, especially since it's used in the `CommitteeCache`. To avoid the migrate script, I added a serde-esque `#[ssz(with = "module")]` field-level attribute to `ssz_derive` so that we can opt into the 4-byte encoding on a field-by-field basis.

The `ssz::legacy::four_byte_impl!` macro allows a one-liner to define the module required for the `#[ssz(with = "module")]` for some `Option<T> where T: Encode + Decode`.

Notably, **I have removed `Encode` and `Decode` impls for `Option`**. I've done this to force a break on downstream users. Like I mentioned, `Option` isn't used in the spec so I don't think it'll be *that* annoying. I think it's nicer than quietly having two different union implementations or quietly breaking the existing `Option` impl.

### Crate Publish Ordering

I've modified the order in which CI publishes crates to ensure that we don't publish a crate without ensuring we already published a crate that it depends upon.

## TODO

- [ ] Queue a follow-up `[patch]`-removing PR.
2021-09-25 05:58:36 +00:00
Michael Sproul
9667dc2f03 Implement checkpoint sync (#2244)
## Issue Addressed

Closes #1891
Closes #1784

## Proposed Changes

Implement checkpoint sync for Lighthouse, enabling it to start from a weak subjectivity checkpoint.

## Additional Info

- [x] Return unavailable status for out-of-range blocks requested by peers (#2561)
- [x] Implement sync daemon for fetching historical blocks (#2561)
- [x] Verify chain hashes (either in `historical_blocks.rs` or the calling module)
- [x] Consistency check for initial block + state
- [x] Fetch the initial state and block from a beacon node HTTP endpoint
- [x] Don't crash fetching beacon states by slot from the API
- [x] Background service for state reconstruction, triggered by CLI flag or API call.

Considered out of scope for this PR:

- Drop the requirement to provide the `--checkpoint-block` (this would require some pretty heavy refactoring of block verification)


Co-authored-by: Diva M <divma@protonmail.com>
2021-09-22 00:37:28 +00:00
Age Manning
acdcea9663 Update mainnet bootnodes (#2594)
Sigma Prime is transitioning our mainnet bootnodes and this PR represents the transition of our bootnodes. 

After a few releases, old boot-nodes will be deprecated.
2021-09-16 04:45:07 +00:00
Paul Hauner
c5c7476518 Web3Signer support for VC (#2522)
[EIP-3030]: https://eips.ethereum.org/EIPS/eip-3030
[Web3Signer]: https://consensys.github.io/web3signer/web3signer-eth2.html

## Issue Addressed

Resolves #2498

## Proposed Changes

Allows the VC to call out to a [Web3Signer] remote signer to obtain signatures.


## Additional Info

### Making Signing Functions `async`

To allow remote signing, I needed to make all the signing functions `async`. This caused a bit of noise where I had to convert iterators into `for` loops.

In `duties_service.rs` there was a particularly tricky case where we couldn't hold a write-lock across an `await`, so I had to first take a read-lock, then grab a write-lock.

### Move Signing from Core Executor

Whilst implementing this feature, I noticed that we signing was happening on the core tokio executor. I suspect this was causing the executor to temporarily lock and occasionally trigger some HTTP timeouts (and potentially SQL pool timeouts, but I can't verify this). Since moving all signing into blocking tokio tasks, I noticed a distinct drop in the "atttestations_http_get" metric on a Prater node:

![http_get_times](https://user-images.githubusercontent.com/6660660/132143737-82fd3836-2e7e-445b-a143-cb347783baad.png)

I think this graph indicates that freeing the core executor allows the VC to operate more smoothly.

### Refactor TaskExecutor

I noticed that the `TaskExecutor::spawn_blocking_handle` function would fail to spawn tasks if it were unable to obtain handles to some metrics (this can happen if the same metric is defined twice). It seemed that a more sensible approach would be to keep spawning tasks, but without metrics. To that end, I refactored the function so that it would still function without metrics. There are no other changes made.

## TODO

- [x] Restructure to support multiple signing methods.
- [x] Add calls to remote signer from VC.
- [x] Documentation
- [x] Test all endpoints
- [x] Test HTTPS certificate
- [x] Allow adding remote signer validators via the API
- [x] Add Altair support via [21.8.1-rc1](https://github.com/ConsenSys/web3signer/releases/tag/21.8.1-rc1)
- [x] Create issue to start using latest version of web3signer. (See #2570)

## Notes

- ~~Web3Signer doesn't yet support the Altair fork for Prater. See https://github.com/ConsenSys/web3signer/issues/423.~~
- ~~There is not yet a release of Web3Signer which supports Altair blocks. See https://github.com/ConsenSys/web3signer/issues/391.~~
2021-09-16 03:26:33 +00:00
Michael Sproul
58012f85e1 Shutdown gracefully on panic (#2596)
## Proposed Changes

* Modify the `TaskExecutor` so that it spawns a "monitor" future for each future spawned by `spawn` or `spawn_blocking`. This monitor future joins the handle of the child future and shuts down the executor if it detects a panic.
* Enable backtraces by default by setting the environment variable `RUST_BACKTRACE`.
* Spawn the `ProductionBeaconNode` on the `TaskExecutor` so that if a panic occurs during start-up it will take down the whole process. Previously we were using a raw Tokio `spawn`, but I can't see any reason not to use the executor (perhaps someone else can).

## Additional Info

I considered using [`std::panic::set_hook`](https://doc.rust-lang.org/std/panic/fn.set_hook.html) to instantiate a custom panic handler, however this doesn't allow us to send a shutdown signal because `Fn` functions can't move variables (i.e. the shutdown sender) out of their environment. This also prevents it from receiving a `Logger`.  Hence I decided to leave the panic handler untouched, but with backtraces turned on by default.

I did a run through the code base with all the raw Tokio spawn functions disallowed by Clippy, and found only two instances where we bypass the `TaskExecutor`: the HTTP API and `InitializedValidators` in the VC. In both places we use `spawn_blocking` and handle the return value, so I figured that was OK for now.

In terms of performance I think the overhead should be minimal. The monitor tasks will just get parked by the executor until their child resolves.

I've checked that this covers Discv5, as the `TaskExecutor` gets injected into Discv5 here: f9bba92db3/beacon_node/src/lib.rs (L125-L126)
2021-09-15 00:01:18 +00:00
Paul Hauner
f9bba92db3 v1.5.2 (#2595)
## Issue Addressed

NA

## Proposed Changes

Version bump

## Additional Info

Please do not `bors` without my approval, I am still testing.
2021-09-13 23:01:19 +00:00
Squirrel
e4ed42a9d8 Fix nightly bump num bigint (#2591)
## Issue Addressed

Builds again on latest nightly

## Proposed Changes

Break was caused by: https://github.com/rust-lang/rust/issues/88581
2021-09-12 23:55:20 +00:00
Paul Hauner
ddbd4e6965 v1.5.2-rc.0 (#2565)
## Issue Addressed

NA

## Proposed Changes

- Bump version
- Tidy some comments mangled by the version change regex.

## Additional Info

NA
2021-09-03 23:28:21 +00:00
Michael Sproul
f4aa1d8aea Archive remote_signer code (#2559)
## Proposed Changes

This PR deletes all `remote_signer` code from Lighthouse, for the following reasons:

* The `remote_signer` code is unused, and we have no plans to use it now that we're moving to supporting the Web3Signer APIs: #2522
* It represents a significant maintenance burden. The HTTP API tests have been prone to platform-specific failures, and breakages due to dependency upgrades, e.g. #2400.

Although the code is deleted it remains in the Git history should we ever want to recover it. For ease of reference:

- The last commit containing remote signer code: 5a3bcd2904
- The last Lighthouse version: v1.5.1
2021-09-03 06:09:18 +00:00
Pawan Dhananjay
6f18f95893 Update file permissions (#2499)
## Issue Addressed

Resolves #2438 
Resolves #2437 

## Proposed Changes

Changes the permissions for validator client http server api token file and secret key to 600 from 644. Also changes the permission for logfiles generated using the `--logfile` cli option to 600.

Logs the path to the api token instead of the actual api token. Updates docs to reflect the change.
2021-09-03 02:41:10 +00:00
realbigsean
50321c6671 Updates to make crates publishable (#2472)
## Issue Addressed

Related to: #2259

Made an attempt at all the necessary updates here to publish the crates to crates.io. I incremented the minor versions on all the crates that have been previously published. We still might run into some issues as we try to publish because I'm not able to test this out but I think it's a good starting point.

## Proposed Changes

- Add description and license to `ssz_types` and `serde_util`
- rename `serde_util` to `eth2_serde_util`
- increment minor versions
- remove path dependencies
- remove patch dependencies 

## Additional Info
Crates published: 

- [x] `tree_hash` -- need to publish `tree_hash_derive` and `eth2_hashing` first
- [x] `eth2_ssz_types` -- need to publish `eth2_serde_util` first
- [x] `tree_hash_derive`
- [x] `eth2_ssz`
- [x] `eth2_ssz_derive`
- [x] `eth2_serde_util`
- [x] `eth2_hashing`


Co-authored-by: realbigsean <seananderson33@gmail.com>
2021-09-03 01:10:25 +00:00
Pawan Dhananjay
5a3bcd2904 Validator monitor support for sync committees (#2476)
## Issue Addressed

N/A

## Proposed Changes

Add functionality in the validator monitor to provide sync committee related metrics for monitored validators.


Co-authored-by: Michael Sproul <michael@sigmaprime.io>
2021-08-31 23:31:36 +00:00
Paul Hauner
1031f79aca Improve ergonomics of adding a new network config (#2489)
## Issue Addressed

NA

## Proposed Changes

This PR adds some more fancy macro magic to make it easier to add a new built-in (aka "baked-in") testnet config to the `lighthouse` binary.

Previously, a user needed to modify several files and repeat themselves several times. Now, they only need to add a single definition in the `eth2_config` crate. No repetition 🎉
2021-08-30 23:27:28 +00:00
Pawan Dhananjay
99737c551a Improve eth1 fallback logging (#2490)
## Issue Addressed

Resolves #2487 

## Proposed Changes

Logs a message once in every invocation of `Eth1Service::update` method if the primary endpoint is unavailable for some reason. 

e.g.
```log
Aug 03 00:09:53.517 WARN Error connecting to eth1 node endpoint  action: trying fallbacks, endpoint: http://localhost:8545/, service: eth1_rpc
Aug 03 00:09:56.959 INFO Fetched data from fallback              fallback_number: 1, service: eth1_rpc
```

The main aim of this PR is to have an accompanying message to the "action: trying fallbacks" error message that is returned when checking the endpoint for liveness. This is mainly to indicate to the user that the fallback was live and reachable. 

## Additional info
This PR is not meant to be a catch all for all cases where the primary endpoint failed. For instance, this won't log anything if the primary node was working fine during endpoint liveness checking and failed during deposit/block fetching. This is done intentionally to reduce number of logs while initial deposit/block sync and to avoid more complicated logic.
2021-08-30 00:51:26 +00:00
Paul Hauner
b0ac3464ca v1.5.1 (#2544)
## Issue Addressed

NA

## Proposed Changes

- Bump version

## Additional Info

NA
2021-08-27 01:58:19 +00:00
Michael Sproul
9fb94fbebe Add Altair fork epoch for Prater (#2537)
## Issue Addressed

https://github.com/eth2-clients/eth2-networks/pull/58

## Proposed Changes

Add the fork epoch for Altair on Prater: 36660

## Additional Info

This `config.yaml` is copied exactly from upstream. Large parts already matched due to our preemptive move to the new config style.
2021-08-26 09:42:23 +00:00
Paul Hauner
90d5ab1566 v1.5.0 (#2535)
## Issue Addressed

NA

## Proposed Changes

- Version bump
- Increase queue sizes for aggregated attestations and re-queued attestations. 

## Additional Info

NA
2021-08-23 04:27:36 +00:00
Paul Hauner
12fe72bd37 Always require auth header in VC (#2517)
## Issue Addressed

- Resolves #2512 

## Proposed Changes

Enforces that all routes require an auth token for the VC.

## TODO

- [x] Tests
2021-08-18 01:31:28 +00:00
Paul Hauner
c7379836a5 v1.5.0-rc.1 (#2516)
## Issue Addressed

NA

## Proposed Changes

- Bump version

## Additional Info

NA
2021-08-17 05:34:31 +00:00
Michael Sproul
c0a2f501d9 Upgrade dependencies (#2513)
## Proposed Changes

* Consolidate Tokio versions: everything now uses the latest v1.10.0, no more `tokio-compat`.
* Many semver-compatible changes via `cargo update`. Notably this upgrades from the yanked v0.8.0 version of crossbeam-deque which is present in v1.5.0-rc.0
* Many semver incompatible upgrades via `cargo upgrades` and `cargo upgrade --workspace pkg_name`. Notable ommissions:
    - Prometheus, to be handled separately: https://github.com/sigp/lighthouse/issues/2485
    - `rand`, `rand_xorshift`: the libsecp256k1 package requires 0.7.x, so we'll stick with that for now
    - `ethereum-types` is pinned at 0.11.0 because that's what `web3` is using and it seems nice to have just a single version
    
## Additional Info

We still have two versions of `libp2p-core` due to `discv5` depending on the v0.29.0 release rather than `master`. AFAIK it should be OK to release in this state (cc @AgeManning )
2021-08-17 01:00:24 +00:00
Paul Hauner
4c4ebfbaa1 v1.5.0 rc.0 (#2506)
## Issue Addressed

NA

## Proposed Changes

- Bump to `v1.5.0-rc.0`.
- Increase attestation reprocessing queue size (I saw this filling up on Prater).
- Reduce error log for full attn reprocessing queue to warn.

## TODO

- [x] Manual testing
- [x] Resolve https://github.com/sigp/lighthouse/pull/2493
- [x] Include https://github.com/sigp/lighthouse/pull/2501
2021-08-12 04:02:46 +00:00
Paul Hauner
4af6fcfafd Bump libp2p to address inconsistency in mesh peer tracking (#2493)
## Issue Addressed

- Resolves #2457
- Resolves #2443

## Proposed Changes

Target the (presently unreleased) head of `libp2p/rust-libp2p:master` in order to obtain the fix from https://github.com/libp2p/rust-libp2p/pull/2175.

Additionally:

- `libsecp256k1` needed to be upgraded to satisfy the new version of `libp2p`.
- There were also a handful of minor changes to `eth2_libp2p` to suit some interface changes.
- Two `cargo audit --ignore` flags were remove due to libp2p upgrades.

## Additional Info
 
 NA
2021-08-12 01:59:20 +00:00
Paul Hauner
33ff51a096 Add Altair fork schedule for Pyrmont (#2501)
## Issue Addressed

NA

## Proposed Changes

Adds the Altair fork schedule for Pyrmont, as per https://github.com/eth2-clients/eth2-networks/pull/56 (credits to @ajsutton).

## Additional Info

- I've marked this as `do-not-merge` until the upstream PR is merged.
- I've tagged this for `v1.5.0` because I expect the upstream PR to be merged soon, and I think it would be great if v1.5.0 shipped fully ready for the Pyrmont fork.
2021-08-11 06:17:25 +00:00
Michael Sproul
17a2c778e3 Altair validator client and HTTP API (#2404)
## Proposed Changes

* Implement the validator client and HTTP API changes necessary to support Altair


Co-authored-by: realbigsean <seananderson33@gmail.com>
Co-authored-by: Michael Sproul <michael@sigmaprime.io>
2021-08-06 00:47:31 +00:00
Pawan Dhananjay
e8c0d1f19b Altair networking (#2300)
## Issue Addressed

Resolves #2278 

## Proposed Changes

Implements the networking components for the Altair hard fork https://github.com/ethereum/eth2.0-specs/blob/dev/specs/altair/p2p-interface.md

## Additional Info

This PR acts as the base branch for networking changes and tracks https://github.com/sigp/lighthouse/pull/2279 . Changes to gossip, rpc and discovery can be separate PRs to be merged here for ease of review.

Co-authored-by: realbigsean <seananderson33@gmail.com>
2021-08-04 01:44:57 +00:00
realbigsean
c5786a8821 Doppelganger detection (#2230)
## Issue Addressed

Resolves #2069 

## Proposed Changes

- Adds a `--doppelganger-detection` flag
- Adds a `lighthouse/seen_validators` endpoint, which will make it so the lighthouse VC is not interopable with other client beacon nodes if the `--doppelganger-detection` flag is used, but hopefully this will become standardized. Relevant Eth2 API repo issue: https://github.com/ethereum/eth2.0-APIs/issues/64
- If the `--doppelganger-detection` flag is used, the VC will wait until the beacon node is synced, and then wait an additional 2 epochs. The reason for this is to make sure the beacon node is able to subscribe to the subnets our validators should be attesting on. I think an alternative would be to have the beacon node subscribe to all subnets for 2+ epochs on startup by default.

## Additional Info

I'd like to add tests and would appreciate feedback. 

TODO:  handle validators started via the API, potentially make this default behavior

Co-authored-by: realbigsean <seananderson33@gmail.com>
Co-authored-by: Michael Sproul <michael@sigmaprime.io>
Co-authored-by: Paul Hauner <paul@paulhauner.com>
2021-07-31 03:50:52 +00:00
realbigsean
303deb9969 Rust 1.54.0 lints (#2483)
## Issue Addressed

N/A

## Proposed Changes

- Removing a bunch of unnecessary references
- Updated `Error::VariantError` to `Error::Variant`
- There were additional enum variant lints that I ignored, because I thought our variant names were fine
- removed `MonitoredValidator`'s `pubkey` field, because I couldn't find it used anywhere. It looks like we just use the string version of the pubkey (the `id` field) if there is no index

## Additional Info



Co-authored-by: realbigsean <seananderson33@gmail.com>
2021-07-30 01:11:47 +00:00
Paul Hauner
6e3ca48cb9 Cache participating indices for Altair epoch processing (#2416)
## Issue Addressed

NA

## Proposed Changes

This PR addresses two things:

1. Allows the `ValidatorMonitor` to work with Altair states.
1. Optimizes `altair::process_epoch` (see [code](https://github.com/paulhauner/lighthouse/blob/participation-cache/consensus/state_processing/src/per_epoch_processing/altair/participation_cache.rs) for description)

## Breaking Changes

The breaking changes in this PR revolve around one premise:

*After the Altair fork, it's not longer possible (given only a `BeaconState`) to identify if a validator had *any* attestation included during some epoch. The best we can do is see if that validator made the "timely" source/target/head flags.*

Whilst this seems annoying, it's not actually too bad. Finalization is based upon "timely target" attestations, so that's really the most important thing. Although there's *some* value in knowing if a validator had *any* attestation included, it's far more important to know about "timely target" participation, since this is what affects finality and justification.

For simplicity and consistency, I've also removed the ability to determine if *any* attestation was included from metrics and API endpoints. Now, all Altair and non-Altair states will simply report on the head/target attestations.

The following section details where we've removed fields and provides replacement values.

### Breaking Changes: Prometheus Metrics

Some participation metrics have been removed and replaced. Some were removed since they are no longer relevant to Altair (e.g., total attesting balance) and others replaced with gwei values instead of pre-computed values. This provides more flexibility at display-time (e.g., Grafana).

The following metrics were added as replacements:

- `beacon_participation_prev_epoch_head_attesting_gwei_total`
- `beacon_participation_prev_epoch_target_attesting_gwei_total`
- `beacon_participation_prev_epoch_source_attesting_gwei_total`
- `beacon_participation_prev_epoch_active_gwei_total`

The following metrics were removed:

- `beacon_participation_prev_epoch_attester`
   - instead use `beacon_participation_prev_epoch_source_attesting_gwei_total / beacon_participation_prev_epoch_active_gwei_total`.
- `beacon_participation_prev_epoch_target_attester`
   - instead use `beacon_participation_prev_epoch_target_attesting_gwei_total / beacon_participation_prev_epoch_active_gwei_total`.
- `beacon_participation_prev_epoch_head_attester`
   - instead use `beacon_participation_prev_epoch_head_attesting_gwei_total / beacon_participation_prev_epoch_active_gwei_total`.

The `beacon_participation_prev_epoch_attester` endpoint has been removed. Users should instead use the pre-existing `beacon_participation_prev_epoch_target_attester`. 

### Breaking Changes: HTTP API

The `/lighthouse/validator_inclusion/{epoch}/{validator_id}` endpoint loses the following fields:

- `current_epoch_attesting_gwei` (use `current_epoch_target_attesting_gwei` instead)
- `previous_epoch_attesting_gwei` (use `previous_epoch_target_attesting_gwei` instead)

The `/lighthouse/validator_inclusion/{epoch}/{validator_id}` endpoint lose the following fields:

- `is_current_epoch_attester` (use `is_current_epoch_target_attester` instead)
- `is_previous_epoch_attester` (use `is_previous_epoch_target_attester` instead)
- `is_active_in_current_epoch` becomes `is_active_unslashed_in_current_epoch`.
- `is_active_in_previous_epoch` becomes `is_active_unslashed_in_previous_epoch`.

## Additional Info

NA

## TODO

- [x] Deal with total balances
- [x] Update validator_inclusion API
- [ ] Ensure `beacon_participation_prev_epoch_target_attester` and `beacon_participation_prev_epoch_head_attester` work before Altair

Co-authored-by: realbigsean <seananderson33@gmail.com>
2021-07-27 07:01:01 +00:00
Age Manning
c1d2e35c9e
Bleeding edge discovery (#2435)
* Update discovery banning logic and tokio

* Update to latest discovery

* Shift to latest discovery

* Fmt
2021-07-15 16:43:17 +10:00
Age Manning
4aa06c9555
Network upgrades (#2345) 2021-07-15 16:43:10 +10:00
realbigsean
a3a7f39b0d [Altair] Sync committee pools (#2321)
Add pools supporting sync committees:
- naive sync aggregation pool
- observed sync contributions pool
- observed sync contributors pool
- observed sync aggregators pool

Add SSZ types and tests related to sync committee signatures.

Co-authored-by: Michael Sproul <michael@sigmaprime.io>
Co-authored-by: realbigsean <seananderson33@gmail.com>
2021-07-15 00:52:02 +00:00
Michael Sproul
8fa6e463ca Update direct libsecp256k1 dependencies (#2456)
## Proposed Changes

* Remove direct dependencies on vulnerable `libsecp256k1 0.3.5`
* Ignore the RUSTSEC issue until it is resolved in #2389
2021-07-14 05:24:10 +00:00
Mac L
b3c7e59a5b Adjust beacon node timeouts for validator client HTTP requests (#2352)
## Issue Addressed

Resolves #2313 

## Proposed Changes

Provide `BeaconNodeHttpClient` with a dedicated `Timeouts` struct.
This will allow granular adjustment of the timeout duration for different calls made from the VC to the BN. These can either be a constant value, or as a ratio of the slot duration.

Improve timeout performance by using these adjusted timeout duration's only whenever a fallback endpoint is available.

Add a CLI flag called `use-long-timeouts` to revert to the old behavior.

## Additional Info

Additionally set the default `BeaconNodeHttpClient` timeouts to the be the slot duration of the network, rather than a constant 12 seconds. This will allow it to adjust to different network specifications.


Co-authored-by: Paul Hauner <paul@paulhauner.com>
2021-07-12 01:47:48 +00:00
Michael Sproul
b4689e20c6 Altair consensus changes and refactors (#2279)
## Proposed Changes

Implement the consensus changes necessary for the upcoming Altair hard fork.

## Additional Info

This is quite a heavy refactor, with pivotal types like the `BeaconState` and `BeaconBlock` changing from structs to enums. This ripples through the whole codebase with field accesses changing to methods, e.g. `state.slot` => `state.slot()`.


Co-authored-by: realbigsean <seananderson33@gmail.com>
2021-07-09 06:15:32 +00:00
Paul Hauner
78e5c0c157 Capture a missed VC error (#2436)
## Issue Addressed

Related to #2430, #2394

## Proposed Changes

As per https://github.com/sigp/lighthouse/issues/2430#issuecomment-875323615, ensure that the `ProductionValidatorClient::new` error raises a log and shuts down the VC. Also, I implemened `spawn_ignoring_error`, as per @michaelsproul's suggestion in https://github.com/sigp/lighthouse/pull/2436#issuecomment-876084419.

I got unlucky and CI picked up a [new rustsec vuln](https://rustsec.org/advisories/RUSTSEC-2021-0072). To fix this, I had to update the following crates:

- `tokio`
- `web3`
- `tokio-compat-02`

## Additional Info

NA
2021-07-09 03:20:24 +00:00
Age Manning
73d002ef92 Update outdated dependencies (#2425)
This updates some older dependencies to address a few cargo audit warnings.

The majority of warnings come from network dependencies which will be addressed in #2389. 

This PR contains some minor dep updates that are not network related.

Co-authored-by: Michael Sproul <michael@sigmaprime.io>
2021-07-05 00:54:17 +00:00
Michael Sproul
379664a648 Improve compilation error on 32-bit (#2424)
## Issue Addressed

Closes #1661

## Proposed Changes

Add a dummy package called `target_check` which gets compiled early in the build and fails if the target is 32-bit

## Additional Info

You can test the efficacy of this check with:

```
cross build --release --manifest-path lighthouse/Cargo.toml --target i686-unknown-linux-gnu
```

In which case this compilation error is shown:

```
error: Lighthouse requires a 64-bit CPU and operating system
  --> common/target_check/src/lib.rs:8:1
   |
8  | / assert_cfg!(
9  | |     target_pointer_width = "64",
10 | |     "Lighthouse requires a 64-bit CPU and operating system",
11 | | );
   | |__^
```
2021-06-30 04:56:22 +00:00
realbigsean
b84ff9f793 rust 1.53.0 updates (#2411)
## Issue Addressed

`make lint` failing on rust 1.53.0.

## Proposed Changes

1.53.0 updates

## Additional Info

I haven't figure out why yet, we were now hitting the recursion limit in a few crates. So I had to add `#![recursion_limit = "256"]` in a few places


Co-authored-by: realbigsean <seananderson33@gmail.com>
Co-authored-by: Michael Sproul <michael@sigmaprime.io>
2021-06-18 05:58:01 +00:00
realbigsean
b1657a60e9 Reorg events (#2090)
## Issue Addressed

Resolves #2088

## Proposed Changes

Add the `chain_reorg` SSE event topic

## Additional Info


Co-authored-by: realbigsean <seananderson33@gmail.com>
Co-authored-by: Paul Hauner <paul@paulhauner.com>
2021-06-17 02:10:46 +00:00
Paul Hauner
3b600acdc5 v1.4.0 (#2402)
## Issue Addressed

NA

## Proposed Changes

- Bump versions and update `Cargo.lock`

## Additional Info

NA

## TODO

- [x] Ensure #2398 gets merged succesfully
2021-06-10 01:44:49 +00:00
Paul Hauner
b383836418 Modify Malloc Tuning (#2398)
## Issue Addressed

NA

## Proposed Changes

I've noticed some of the SigP Prater nodes struggling on v1.4.0-rc.0. I suspect this is due to the changes in #2296. Specifically, the trade-off which lowered the memory footprint whilst increasing runtime on some functions.

Presently, this PR is documenting my testing on Prater.

## Additional Info

NA
2021-06-09 02:30:06 +00:00
Paul Hauner
f6280aa663 v1.4.0-rc.0 (#2379)
## Issue Addressed

NA

## Proposed Changes

Bump versions.

## Additional Info

This is not exactly the v1.4.0 release described in [Lighthouse Update #36](https://lighthouse.sigmaprime.io/update-36.html).

Whilst it contains:

- Beta Windows support
- A reduction in Eth1 queries
- A reduction in memory footprint

It does not contain:

- Altair
- Doppelganger Protection
- The remote signer

We have decided to release some features early. This is primarily due to the desire to allow users to benefit from the memory saving improvements as soon as possible.

## TODO

- [x] Wait for #2340, #2356 and #2376 to merge and then rebase on `unstable`. 
- [x] Ensure discovery issues are fixed (see #2388)
- [x] Ensure https://github.com/sigp/lighthouse/pull/2382 is merged/removed.
- [x] Ensure https://github.com/sigp/lighthouse/pull/2383 is merged/removed.
- [x] Ensure https://github.com/sigp/lighthouse/pull/2384 is merged/removed.
- [ ] Double-check eth1 cache is carried between boots
2021-06-03 00:13:02 +00:00
Paul Hauner
90ea075c62 Revert "Network protocol upgrades (#2345)" (#2388)
## Issue Addressed

NA

## Proposed Changes

Reverts #2345 in the interests of getting v1.4.0 out this week. Once we have released that, we can go back to testing this again.

## Additional Info

NA
2021-06-02 01:07:28 +00:00
Mac L
0847986936 Reduce outbound requests to eth1 endpoints (#2340)
## Issue Addressed

#2282 

## Proposed Changes

Reduce the outbound requests made to eth1 endpoints by caching the results from `eth_chainId` and `net_version`.
Further reduce the overall request count by increasing `auto_update_interval_millis` from `7_000` (7 seconds) to `60_000` (1 minute). 
This will result in a reduction from ~2000 requests per hour to 360 requests per hour (during normal operation). A reduction of 82%.

## Additional Info

If an endpoint fails, its state is dropped from the cache and the `eth_chainId` and `net_version` calls will be made for that endpoint again during the regular update cycle (once per minute) until it is back online.


Co-authored-by: Paul Hauner <paul@paulhauner.com>
2021-05-31 04:18:18 +00:00
Age Manning
d12e746b50 Network protocol upgrades (#2345)
This provides a number of upgrades to gossipsub and discovery. 

The updates are extensive and this needs thorough testing.
2021-05-28 22:02:10 +00:00
Paul Hauner
456b313665 Tune GNU malloc (#2299)
## Issue Addressed

NA

## Proposed Changes

Modify the configuration of [GNU malloc](https://www.gnu.org/software/libc/manual/html_node/The-GNU-Allocator.html) to reduce memory footprint.

- Set `M_ARENA_MAX` to 4.
    - This reduces memory fragmentation at the cost of contention between threads.
- Set `M_MMAP_THRESHOLD` to 2mb
    - This means that any allocation >= 2mb is allocated via an anonymous mmap, instead of on the heap/arena. This reduces memory fragmentation since we don't need to keep growing the heap to find big contiguous slabs of free memory.
- ~~Run `malloc_trim` every 60 seconds.~~
    - ~~This shaves unused memory from the top of the heap, preventing the heap from constantly growing.~~
    - Removed, see: https://github.com/sigp/lighthouse/pull/2299#issuecomment-825322646

*Note: this only provides memory savings on the Linux (glibc) platform.*
    
## Additional Info

I'm going to close #2288 in favor of this for the following reasons:

- I've managed to get the memory footprint *smaller* here than with jemalloc.
- This PR seems to be less of a dramatic change than bringing in the jemalloc dep.
- The changes in this PR are strictly runtime changes, so we can create CLI flags which disable them completely. Since this change is wide-reaching and complex, it's nice to have an easy "escape hatch" if there are undesired consequences.

## TODO

- [x] Allow configuration via CLI flags
- [x] Test on Mac
- [x] Test on RasPi.
- [x] Determine if GNU malloc is present?
    - I'm not quite sure how to detect for glibc.. This issue suggests we can't really: https://github.com/rust-lang/rust/issues/33244
- [x] Make a clear argument regarding the affect of this on CPU utilization.
- [x] Test with higher `M_ARENA_MAX` values.
- [x] Test with longer trim intervals
- [x] Add some stats about memory savings
- [x] Remove `malloc_trim` calls & code
2021-05-28 05:59:45 +00:00
Pawan Dhananjay
fdaeec631b Monitoring service api (#2251)
## Issue Addressed

N/A

## Proposed Changes

Adds a client side api for collecting system and process metrics and pushing it to a monitoring service.
2021-05-26 05:58:41 +00:00
ethDreamer
ba55e140ae Enable Compatibility with Windows (#2333)
## Issue Addressed

Windows incompatibility.

## Proposed Changes

On windows, lighthouse needs to default to STDIN as tty doesn't exist. Also Windows uses ACLs for file permissions. So to mirror chmod 600, we will remove every entry in a file's ACL and add only a single SID that is an alias for the file owner.

Beyond that, there were several changes made to different unit tests because windows has slightly different error messages as well as frustrating nuances around killing a process :/

## Additional Info

Tested on my Windows VM and it appears to work, also compiled & tested on Linux with these changes. Permissions look correct on both platforms now. Just waiting for my validator to activate on Prater so I can test running full validator client on windows.

Co-authored-by: ethDreamer <37123614+ethDreamer@users.noreply.github.com>
Co-authored-by: Michael Sproul <micsproul@gmail.com>
2021-05-19 23:05:16 +00:00
Michael Sproul
58e52f8f40 Write validator definitions atomically (#2338)
## Issue Addressed

Closes https://github.com/sigp/lighthouse/issues/2159

## Proposed Changes

Rather than trying to write the validator definitions to disk directly, use a temporary file called `.validator_defintions.yml.tmp` and then atomically rename it to `validator_definitions.yml`. This avoids truncating the primary file, which can cause permanent damage when the disk is full.

The same treatment is also applied to the validator key cache, although the situation is less dire if it becomes corrupted because it can just be deleted without the user having to reimport keys or resupply passwords.

## Additional Info

* `File::create` truncates upon opening: https://doc.rust-lang.org/std/fs/struct.File.html#method.create
* `fs::rename` uses `rename` on UNIX and `MoveFileEx` on Windows: https://doc.rust-lang.org/std/fs/fn.rename.html
* UNIX `rename` call is atomic: https://unix.stackexchange.com/questions/322038/is-mv-atomic-on-my-fs
* Windows `MoveFileEx` is _not_ atomic in general, and Windows lacks any clear API for atomic file renames :(
   https://stackoverflow.com/questions/167414/is-an-atomic-file-rename-with-overwrite-possible-on-windows

## Further Work

* Consider whether we want to try a different Windows syscall as part of #2333. The `rust-atomicwrites` crate seems promising, but actually uses the same syscall under the hood presently: https://github.com/untitaker/rust-atomicwrites/issues/27.
2021-05-12 02:04:44 +00:00
Mac L
bacc38c3da Add testing for beacon node and validator client CLI flags (#2311)
## Issue Addressed

N/A

## Proposed Changes

Add unit tests for the various CLI flags associated with the beacon node and validator client. These changes require the addition of two new flags: `dump-config` and `immediate-shutdown`.

## Additional Info

Both `dump-config` and `immediate-shutdown` are marked as hidden since they should only be used in testing and other advanced use cases.
**Note:** This requires changing `main.rs` so that the flags can adjust the program behavior as necessary.

Co-authored-by: Paul Hauner <paul@paulhauner.com>
2021-05-06 00:36:22 +00:00
Mac L
4cc613d644 Add SensitiveUrl to redact user secrets from endpoints (#2326)
## Issue Addressed

#2276 

## Proposed Changes

Add the `SensitiveUrl` struct which wraps `Url` and implements custom `Display` and `Debug` traits to redact user secrets from being logged in eth1 endpoints, beacon node endpoints and metrics.

## Additional Info

This also includes a small rewrite of the eth1 crate to make requests using `Url` instead of `&str`. 
Some error messages have also been changed to remove `Url` data.
2021-05-04 01:59:51 +00:00
Pascal Bach
c646d2f7a3 Allow specifying alternative url for deposit_contract (#2295)
## Issue Addressed

None

## Proposed Changes

Adds support for downloading the deposit contract from a different location
by setting the environement variables `LIGHTHOUSE_DEPOSIT_CONTRACT_SPEC_URL`
and `LIGHTHOUSE_DEPOSIT_CONTRACT_TESTNET_URL`.

It also adds support to fetch the content from a local file:// URL.

This allows pre fetching to build in an environment without network access.

## Additional Info

Being able to build without network access is required to package the application for https://nixos.org/. But I imagine it might be useful for other distributions too.
2021-04-16 06:47:34 +00:00
Paul Hauner
3a24ca5f14 v1.3.0 (#2310)
## Issue Addressed

NA

## Proposed Changes

Bump versions.

## Additional Info

This is a minor release (not patch) due to the very slight change introduced by #2291.
2021-04-13 22:46:34 +00:00
Paul Hauner
c1203f5e52 Add specific log and metric for delayed blocks (#2308)
## Issue Addressed

NA

## Proposed Changes

- Adds a specific log and metric for when a block is enshrined as head with a delay that will caused bad attestations
    - We *technically* already expose this information, but it's a little tricky to determine during debugging. This makes it nice and explicit.
- Fixes a minor reporting bug with the validator monitor where it was expecting agg. attestations too early (at half-slot rather than two-thirds-slot).

## Additional Info

NA
2021-04-13 02:16:59 +00:00
Paul Hauner
9eb1945136 v1.2.2 (#2287)
## Issue Addressed

NA

## Proposed Changes

- Bump versions

## Additional Info

NA
2021-03-30 04:07:03 +00:00
Michael Sproul
f9d60f5436 VC: accept unknown fields in chain spec (#2277)
## Issue Addressed

Closes #2274

## Proposed Changes

* Modify the `YamlConfig` to collect unknown fields into an `extra_fields` map, instead of failing hard.
* Log a debug message if there are extra fields returned to the VC from one of its BNs.

This restores Lighthouse's compatibility with Teku beacon nodes (and therefore Infura)
2021-03-26 04:53:57 +00:00
Paul Hauner
b34a79dc0b v1.2.1 (#2263)
## Issue Addressed

NA

## Proposed Changes

- Bump version.
- Add some new ENR for Prater
    - Afri: https://github.com/eth2-clients/eth2-testnets/pull/42
    - Prysm: https://github.com/eth2-clients/eth2-testnets/pull/43
- Apply the fixes from #2181 to the no-eth1-sim to try fix CI issues. 

## Additional Info

NA
2021-03-18 04:20:46 +00:00
realbigsean
6a69b20be1 Validator import password flag (#2228)
## Issue Addressed

#2224

## Proposed Changes

Add a `--password-file` option to the `lighthouse account validator import` command. The flag requires `--reuse-password` and will copy the password over to the `validator_definitions.yml` file. I used #2070 as a guide for validating the password as UTF-8 and stripping newlines.

## Additional Info



Co-authored-by: realbigsean <seananderson33@gmail.com>
2021-03-17 05:09:56 +00:00
Pawan Dhananjay
87825b2bd2 Add prater testnet config (#2260)
## Issue Addressed

Resolves #2258 

## Proposed Changes

Add support for prater testnet.
2021-03-17 00:47:06 +00:00
Michael Sproul
3919737978 Release v1.2.0 (#2249)
## Proposed Changes

Release v1.2.0 unchanged from the release candidate.
2021-03-10 01:28:32 +00:00
Michael Sproul
786e25ea08 Release candidate v1.2.0-rc.0 (#2248)
Prepare for v1.2.0 with this release candidate.

To be merged after #2247 and #2246

Co-authored-by: Age Manning <Age@AgeManning.com>
2021-03-08 06:27:50 +00:00
Pawan Dhananjay
da8791abd7 Set graffiti per validator (#2044)
## Issue Addressed

Resolves #1944 

## Proposed Changes

Adds a "graffiti" key to the `validator_definitions.yml`. Setting the key will override anything passed through the validator `--graffiti` flag. 
Returns an error if the value for the graffiti key is > 32 bytes instead of silently truncating.
2021-03-02 22:35:46 +00:00
Age Manning
1c507c588e Update to the latest libp2p (#2239)
Updates to the latest libp2p and ignores RUSTSEC-2020-0146 from cargo-audit


Co-authored-by: Michael Sproul <michael@sigmaprime.io>
2021-03-02 05:59:49 +00:00
Michael Sproul
0b2ccecbcf Make lighthouse_version compatible with old Git (#2223)
## Proposed Changes

When building the release binaries with Cross, Ubuntu 16.04 is used, which uses an old verison of Git lacking support for `--exclude`. This PR changes `lighthouse_version` to use `--match` instead.
2021-02-24 23:51:05 +00:00
Michael Sproul
2f077b11fe Allow HTTP API to return SSZ blocks (#2209)
## Issue Addressed

Implements https://github.com/ethereum/eth2.0-APIs/pull/125

## Proposed Changes

Optionally return SSZ bytes from the `beacon/blocks` endpoint.
2021-02-24 04:15:14 +00:00
realbigsean
5bc93869c8 Update ValidatorStatus to match the v1 API (#2149)
## Issue Addressed

N/A

## Proposed Changes

We are currently a bit off of the standard API spec because we have [this](https://hackmd.io/bQxMDRt1RbS1TLno8K4NPg?view) proposal implemented for validator status.  Based on discussion [here](https://github.com/ethereum/eth2.0-APIs/pull/94), it looks like this won't be added to the spec until v2, so this PR implements [this](https://hackmd.io/ofFJ5gOmQpu1jjHilHbdQQ) validator status logic instead

## Additional Info

N/A


Co-authored-by: realbigsean <seananderson33@gmail.com>
2021-02-24 04:15:13 +00:00
Paul Hauner
a764c3b247 Handle early blocks (#2155)
## Issue Addressed

NA

## Problem this PR addresses

There's an issue where Lighthouse is banning a lot of peers due to the following sequence of events:

1. Gossip block 0xabc arrives ~200ms early
    - It is propagated across the network, with respect to [`MAXIMUM_GOSSIP_CLOCK_DISPARITY`](https://github.com/ethereum/eth2.0-specs/blob/v1.0.0/specs/phase0/p2p-interface.md#why-is-there-maximum_gossip_clock_disparity-when-validating-slot-ranges-of-messages-in-gossip-subnets).
    - However, it is not imported to our database since the block is early.
2. Attestations for 0xabc arrive, but the block was not imported.
    - The peer that sent the attestation is down-voted.
        - Each unknown-block attestation causes a score loss of 1, the peer is banned at -100.
        - When the peer is on an attestation subnet there can be hundreds of attestations, so the peer is banned quickly (before the missed block can be obtained via rpc).

## Potential solutions

I can think of three solutions to this:

1. Wait for attestation-queuing (#635) to arrive and solve this.
    - Easy
    - Not immediate fix.
    - Whilst this would work, I don't think it's a perfect solution for this particular issue, rather (3) is better.
1. Allow importing blocks with a tolerance of `MAXIMUM_GOSSIP_CLOCK_DISPARITY`.
    - Easy
    - ~~I have implemented this, for now.~~
1. If a block is verified for gossip propagation (i.e., signature verified) and it's within `MAXIMUM_GOSSIP_CLOCK_DISPARITY`, then queue it to be processed at the start of the appropriate slot.
    - More difficult
    - Feels like the best solution, I will try to implement this.
    
    
**This PR takes approach (3).**

## Changes included

- Implement the `block_delay_queue`, based upon a [`DelayQueue`](https://docs.rs/tokio-util/0.6.3/tokio_util/time/delay_queue/struct.DelayQueue.html) which can store blocks until it's time to import them.
- Add a new `DelayedImportBlock` variant to the `beacon_processor::WorkEvent` enum to handle this new event.
- In the `BeaconProcessor`, refactor a `tokio::select!` to a struct with an explicit `Stream` implementation. I experienced some issues with `tokio::select!` in the block delay queue and I also found it hard to debug. I think this explicit implementation is nicer and functionally equivalent (apart from the fact that `tokio::select!` randomly chooses futures to poll, whereas now we're deterministic).
- Add a testing framework to the `beacon_processor` module that tests this new block delay logic. I also tested a handful of other operations in the beacon processor (attns, slashings, exits) since it was super easy to copy-pasta the code from the `http_api` tester.
    - To implement these tests I added the concept of an optional `work_journal_tx` to the `BeaconProcessor` which will spit out a log of events. I used this in the tests to ensure that things were happening as I expect.
    - The tests are a little racey, but it's hard to avoid that when testing timing-based code. If we see CI failures I can revise. I haven't observed *any* failures due to races on my machine or on CI yet.
    - To assist with testing I allowed for directly setting the time on the `ManualSlotClock`.
- I gave the `beacon_processor::Worker` a `Toolbox` for two reasons; (a) it avoids changing tons of function sigs when you want to pass a new object to the worker and (b) it seemed cute.
2021-02-24 03:08:52 +00:00
Michael Sproul
399d073ab4 Fix lighthouse_version (#2221)
## Proposed Changes

Somehow since Lighthouse v1.1.3 the behaviour of `git-describe` has changed so that it includes the version tag, the number of commits since that tag, _and_ the commit. According to the docs this is how it should always have behaved?? Weird!

https://git-scm.com/docs/git-describe/2.30.1

Anyway, this lead to `lighthouse_version` producing this monstrosity of a version string when building #2194:

```
Lighthouse/v1.1.3-v1.1.3-5-gac07
```

Observe it in the wild here: https://pyrmont.beaconcha.in/block/694880

Adding `--exclude="*"` prevents `git-describe` from trying to include the tag, and on that troublesome commit from #2194 it now produces the correct version string.
2021-02-23 23:31:37 +00:00
Paul Hauner
46920a84e8 v1.1.3 (#2217)
## Issue Addressed

NA

## Proposed Changes

Bump versions

## Additional Info

NA
2021-02-22 06:21:38 +00:00
Paul Hauner
8c6537e71d v1.1.2 (#2213)
## Issue Addressed

NA

## Proposed Changes

Bump versions

## Additional Info

NA
2021-02-19 00:49:32 +00:00
Paul Hauner
f8cc82f2b1 Switch back to warp with cors wildcard support (#2211)
## Issue Addressed

- Resolves #2204
- Resolves #2205

## Proposed Changes

Switches to my fork of `warp` which contains support for cors wildcards: https://github.com/paulhauner/warp/tree/cors-wildcard

I have a PR open on the `warp` repo but it hasn't had any interest from the maintainers as of yet: https://github.com/seanmonstar/warp/pull/726. I think running from a fork is the best we can do for now.

## Additional Info

NA
2021-02-18 22:33:12 +00:00
Paul Hauner
f819ba5414 v1.1.1 (#2202)
## Issue Addressed

NA

## Proposed Changes

Bump versions
2021-02-16 00:09:02 +00:00
Age Manning
9ae92aa256 Update bootnode ENRs (#2191)
Updates the mainnet boot-node ENRs to the current version

Co-authored-by: Paul Hauner <paul@paulhauner.com>
2021-02-15 06:09:52 +00:00
Pawan Dhananjay
6e6e9104f5 Prevent adding duplicate validators to validator_definitions.yml (#2166)
## Issue Addressed

N/A

## Proposed Changes

This is mostly a UX improvement.

Currently, when recursively finding keystores, we only ignore keystores with same path.This leads to potential issues while copying datadirs (e.g. copying datadir to a new ssd with more storage). After copying new datadir and starting the vc, we will  discover the copied keystores as new keystores and add it to the definitions file leading to duplicate entries.

This PR avoids duplicate keystores being discovered as new keystore by checking for duplicate pubkeys as well.
2021-02-15 06:09:51 +00:00
realbigsean
e20f64b21a Update to tokio 1.1 (#2172)
## Issue Addressed

resolves #2129
resolves #2099 
addresses some of #1712
unblocks #2076
unblocks #2153 

## Proposed Changes

- Updates all the dependencies mentioned in #2129, except for web3. They haven't merged their tokio 1.0 update because they are waiting on some dependencies of their own. Since we only use web3 in tests, I think updating it in a separate issue is fine. If they are able to merge soon though, I can update in this PR. 

- Updates `tokio_util` to 0.6.2 and `bytes` to 1.0.1.

- We haven't made a discv5 release since merging tokio 1.0 updates so I'm using a commit rather than release atm. **Edit:** I think we should merge an update of `tokio_util` to 0.6.2 into discv5 before this release because it has panic fixes in `DelayQueue`  --> PR in discv5:  https://github.com/sigp/discv5/pull/58

## Additional Info

tokio 1.0 changes that required some changes in lighthouse:

- `interval.next().await.is_some()` -> `interval.tick().await`
- `sleep` future is now `!Unpin` -> https://github.com/tokio-rs/tokio/issues/3028
- `try_recv` has been temporarily removed from `mpsc` -> https://github.com/tokio-rs/tokio/issues/3350
- stream features have moved to `tokio-stream` and `broadcast::Receiver::into_stream()` has been temporarily removed -> `https://github.com/tokio-rs/tokio/issues/2870
- I've copied over the `BroadcastStream` wrapper from this PR, but can update to use `tokio-stream` once it's merged https://github.com/tokio-rs/tokio/pull/3384

Co-authored-by: realbigsean <seananderson33@gmail.com>
2021-02-10 23:29:49 +00:00
Paul Hauner
ff35fbb121 Add metrics for beacon block propagation (#2173)
## Issue Addressed

NA

## Proposed Changes

Adds some metrics to track delays regarding:

- LH processing of blocks
- delays receiving blocks from other nodes.

## Additional Info

NA
2021-02-04 05:33:56 +00:00
Akihito Nakano
1a22a096c6 Fix clippy errors on tests (#2160)
## Issue Addressed

There are some clippy error on tests.


## Proposed Changes

Enable clippy check on tests and fix the errors. 💪
2021-01-28 23:31:06 +00:00
Paul Hauner
e4b62139d7 v1.1.0 (#2168)
## Issue Addressed

NA

## Proposed Changes

- Bump version
- ~~Run `cargo update`~~

## Additional Info

NA
2021-01-21 02:37:08 +00:00
Paul Hauner
2b2a358522 Detailed validator monitoring (#2151)
## Issue Addressed

- Resolves #2064

## Proposed Changes

Adds a `ValidatorMonitor` struct which provides additional logging and Grafana metrics for specific validators.

Use `lighthouse bn --validator-monitor` to automatically enable monitoring for any validator that hits the [subnet subscription](https://ethereum.github.io/eth2.0-APIs/#/Validator/prepareBeaconCommitteeSubnet) HTTP API endpoint.

Also, use `lighthouse bn --validator-monitor-pubkeys` to supply a list of validators which will always be monitored.

See the new docs included in this PR for more info.

## TODO

- [x] Track validator balance, `slashed` status, etc.
- [x] ~~Register slashings in current epoch, not offense epoch~~
- [ ] Publish Grafana dashboard, update TODO link in docs
- [x] ~~#2130 is merged into this branch, resolve that~~
2021-01-20 19:19:38 +00:00
Paul Hauner
d9f940613f Represent slots in secs instead of millisecs (#2163)
## Issue Addressed

NA

## Proposed Changes

Copied from #2083, changes the config milliseconds_per_slot to seconds_per_slot to avoid errors when slot duration is not a multiple of a second. To avoid deserializing old serialized data (with milliseconds instead of seconds) the Serialize and Deserialize derive got removed from the Spec struct (isn't currently used anyway).

This PR replaces #2083 for the purpose of fixing a merge conflict without requiring the input of @blacktemplar.

## Additional Info

NA


Co-authored-by: blacktemplar <blacktemplar@a1.net>
2021-01-19 09:39:51 +00:00
realbigsean
7a71977987 Clippy 1.49.0 updates and dht persistence test fix (#2156)
## Issue Addressed

`test_dht_persistence` failing

## Proposed Changes

Bind `NetworkService::start` to an underscore prefixed variable rather than `_`.  `_` was causing it to be dropped immediately

This was failing 5/100 times before this update, but I haven't been able to get it to fail after updating it

Co-authored-by: realbigsean <seananderson33@gmail.com>
2021-01-19 00:34:28 +00:00
Arthur Woimbée
851a4dca3c replace tempdir by tempfile (#2143)
## Issue Addressed

Fixes #2141 
Remove [tempdir](https://docs.rs/tempdir/0.3.7/tempdir/) in favor of [tempfile](https://docs.rs/tempfile/3.1.0/tempfile/).

## Proposed Changes

`tempfile` has a slightly different api that makes creating temp folders with a name prefix a chore (`tempdir::TempDir::new("toto")` => `tempfile::Builder::new().prefix("toto").tempdir()`).

So I removed temp folder name prefix where I deemed it not useful.

Otherwise, the functionality is the same.
2021-01-06 06:36:11 +00:00
realbigsean
588b90157d Ssz state api endpoint (#2111)
## Issue Addressed

Catching up to a recently merged API spec PR: https://github.com/ethereum/eth2.0-APIs/pull/119

## Proposed Changes

- Return an SSZ beacon state on `/eth/v1/debug/beacon/states/{stateId}` when passed this header: `accept: application/octet-stream`.
- requests to this endpoint with no  `accept` header or an `accept` header and a value of `application/json` or `*/*` , or will result in a JSON response

## Additional Info


Co-authored-by: realbigsean <seananderson33@gmail.com>
2021-01-06 03:01:46 +00:00
Paul Hauner
f183af20e3 Version v1.0.6 (#2126)
## Issue Addressed

NA

## Proposed Changes

- Bump versions
- Run `cargo update`

## Additional Info

NA
2020-12-28 23:38:02 +00:00
Paul Hauner
9ed65a64f8 Version v1.0.5 (#2117)
## Issue Addressed

NA

## Proposed Changes

- Bump versions to `v1.0.5`
- Run `cargo update`

## Additional Info

NA
2020-12-23 18:52:48 +00:00
Paul Hauner
a62dc65ca4 BN Fallback v2 (#2080)
## Issue Addressed

- Resolves #1883

## Proposed Changes

This follows on from @blacktemplar's work in #2018.

- Allows the VC to connect to multiple BN for redundancy.
  - Update the simulator so some nodes always need to rely on their fallback.
- Adds some extra deprecation warnings for `--eth1-endpoint`
- Pass `SignatureBytes` as a reference instead of by value.

## Additional Info

NA

Co-authored-by: blacktemplar <blacktemplar@a1.net>
2020-12-18 09:17:03 +00:00
Michael Sproul
da1c5fe69d Delete uncompressed genesis states (#2092)
## Issue Addressed

Replaces #2091

## Proposed Changes

* Delete the uncompressed genesis states from `eth2_network_config` after they were merged accidentally in #2029.
* Tweak the build script to not overwrite `genesis.ssz` on every build, which caused spurious rebuilds.
2020-12-16 03:44:05 +00:00