Commit Graph

3936 Commits

Author SHA1 Message Date
Herman Junge
e004b98eab [Remote signer] Fold signer into Lighthouse repository (#1852)
The remote signer relies on the `types` and `crypto/bls` crates from Lighthouse. Moreover, a number of tests of the remote signer consumption of LH leverages this very signer, making any important update a potential dependency nightmare.

Co-authored-by: Paul Hauner <paul@paulhauner.com>
2020-11-06 06:17:11 +00:00
Age Manning
e2ae5010a6 Update libp2p (#1865)
Updates libp2p to the latest version. 

This adds tokio 0.3 support and brings back yamux support. 

This also updates some discv5 configuration parameters for leaner discovery queries
2020-11-06 04:14:14 +00:00
Herman Junge
4c4dad9fb5
Fix fn documentation 2020-11-05 17:53:35 +00:00
Paul Hauner
157e31027a Add warnings for deposits (#1858)
## Issue Addressed

NA

## Proposed Changes

Add some warnings to discourage users to user Lighthouse for mainnet.

## Additional Info

NA
2020-11-04 19:46:42 +00:00
blacktemplar
7e7fad5734 Ignore RPC messages of disconnected peers and remove old peers based on disconnection time (#1854)
## Issue Addressed

NA

## Proposed Changes

Lets the networking behavior ignore messages of peers that are not connected. Furthermore, old peers are not removed from the peerdb based on score anymore but based on the disconnection time.
2020-11-03 23:43:10 +00:00
Age Manning
0a0f4daf9d Prevent errors for stream termination race (#1853)
Prevents an error being propagated on a race condition for RPC stream termination
2020-11-03 10:37:00 +00:00
Paul Hauner
0cde4e285c Bump version to v0.3.3 (#1850)
## Issue Addressed

NA

## Proposed Changes

- Update versions
- Run `cargo update`

## Additional Info

- Blocked on #1846
2020-11-02 23:55:15 +00:00
Michael Sproul
2ff5828310 Downgrade ADX check to a warning (#1846)
## Issue Addressed

Closes #1842

## Proposed Changes

Due to the lies told to us by VPS providers about what CPU features they support, we are forced to check for the availability of CPU features like ADX by just _running code and seeing if it crashes_. The prominent warning should hopefully help users who have truly incompatible CPUs work out what is going on, while not burdening users of cheap VPSs.
2020-11-02 22:35:37 +00:00
Pawan Dhananjay
863ee7c9f2 Update to discv5 bootnodes (#1849)
## Issue Addressed

We seem to have roll backed to old discv5 bootnodes with #1799 because of which fresh nodes with no cached peers cannot find any peers.

## Proposed Changes

Updates `boot_enr.yaml` to discv5.1 bootnodes.
2020-11-02 21:29:43 +00:00
Paul Hauner
7afbaa807e Return eth1-related data via the API (#1797)
## Issue Addressed

- Related to #1691

## Proposed Changes

Adds the following API endpoints:

- `GET lighthouse/eth1/syncing`: status about how synced we are with Eth1.
- `GET lighthouse/eth1/block_cache`: all locally cached eth1 blocks.
- `GET lighthouse/eth1/deposit_cache`: all locally cached eth1 deposits.

Additionally:

- Moves some types from the `beacon_node/eth1` to the `common/eth2` crate, so they can be used in the API without duplication.
- Allow `update_deposit_cache` and `update_block_cache` to take an optional head block number to avoid duplicate requests.

## Additional Info

TBC
2020-11-02 00:37:30 +00:00
divma
6c0c050fbb Tweak head syncing (#1845)
## Issue Addressed

Fixes head syncing

## Proposed Changes

- Get back to statusing peers after removing chain segments and making the peer manager deal with status according to the Sync status, preventing an old known deadlock
- Also a bug where a chain would get removed if the optimistic batch succeeds being empty

## Additional Info

Tested on Medalla and looking good
2020-11-01 23:37:39 +00:00
realbigsean
304793a6ab add quoted serialization util for FixedVector and VariableList (#1794)
## Issue Addressed

This comment: https://github.com/sigp/lighthouse/issues/1776#issuecomment-712349841

## Proposed Changes

- Add quoted serde utils for `FixedVector` and `VariableList`
- Had to remove the dependency that `ssz_types` has on `serde_utils` to avoid a circular dependency.

## Additional Info


Co-authored-by: realbigsean <seananderson33@gmail.com>
2020-10-29 23:25:21 +00:00
Pawan Dhananjay
56f9394141 Add cli option for voluntary exits (#1781)
## Issue Addressed

Resolve #1652 

## Proposed Changes

Adds a cli option for voluntary exits. The flow is similar to prysm's where after entering the password for the validator keystore (or load password from `secrets` if present) the user is given multiple warnings about the operation being irreversible, then redirected to the docs webpage(not added yet) which explains what a voluntary exit is and the consequences of exiting and then prompted to enter a phrase from the docs webpage as a final confirmation. 

Example usage
```
$ lighthouse --testnet zinken account validator exit --validator <validator-pubkey> --beacon-node http://localhost:5052

Running account manager for zinken testnet                                                                                                          
validator-dir path: "..."

Enter the keystore password:  for validator in ...

Password is correct

Publishing a voluntary exit for validator: ...              
WARNING: This is an irreversible operation                                                                                                                    
WARNING: Withdrawing staked eth will not be possible until Eth1/Eth2 merge Please visit [website] to make sure you understand the implications of a voluntary exit.            
                                                                                                                                             
Enter the phrase from the above URL to confirm the voluntary exit:
Exit my validator
Published voluntary exit for validator ...
```

## Additional info

Not sure if we should have batch exits (`--validator all`) option for exiting all the validators in the `validators` directory. I'm slightly leaning towards having only single exits but don't have a strong preference.
2020-10-29 23:25:19 +00:00
Paul Hauner
f64f8246db Only run http_api tests in release (#1827)
## Issue Addressed

NA

## Proposed Changes

As raised by @hermanjunge in a DM, the `http_api` tests have been observed taking 100+ minutes on debug. This PR:

- Moves the `http_api` tests to only run in release.
- Groups some `http_api` tests to reduce test-setup overhead.

## Additional Info

NA
2020-10-29 22:25:20 +00:00
realbigsean
ae0f025375 Beacon state validator id filter (#1803)
## Issue Addressed

Michael's comment here: https://github.com/sigp/lighthouse/issues/1434#issuecomment-708834079
Resolves #1808

## Proposed Changes

- Add query param `id` and `status` to the `validators` endpoint
- Add string serialization and deserialization for `ValidatorStatus`
- Drop `Epoch` from `ValidatorStatus` variants

## Additional Info

Please provide any additional information. For example, future considerations
or information useful for reviewers.
2020-10-29 05:13:04 +00:00
divma
9f45ac2f5e More sync edge cases + prettify range (#1834)
## Issue Addressed
Sync edge case when we get an empty optimistic batch that passes validation and is inside the download buffer. Eventually the chain would reach the batch and treat it as an ugly state. 

## Proposed Changes
- Handle the edge case advancing the chain's target + code clarification
- Some largey changes for readability + ergonomics since rust has try ops
- Better handling of bad batch and chain states
2020-10-29 02:29:24 +00:00
blacktemplar
2bd5b9182f fix unbanning of peers (#1838)
## Issue Addressed

NA

## Proposed Changes

Currently a banned peer will remain banned indefinitely as long as update is called on the score struct regularly. This fixes this bug and the score decay starts after `BANNED_BEFORE_DECAY` seconds after banning.
2020-10-29 01:25:02 +00:00
Michael Sproul
36bd4d87f0 Update to spec v1.0.0-rc.0 and BLSv4 (#1765)
## Issue Addressed

Closes #1504 
Closes #1505
Replaces #1703
Closes #1707

## Proposed Changes

* Update BLST and Milagro to versions compatible with BLSv4 spec
* Update Lighthouse to spec v1.0.0-rc.0, and update EF test vectors
* Use the v1.0.0 constants for `MainnetEthSpec`.
* Rename `InteropEthSpec` -> `V012LegacyEthSpec`
    * Change all constants to suit the mainnet `v0.12.3` specification (i.e., Medalla).
* Deprecate the `--spec` flag for the `lighthouse` binary
    * This value is now obtained from the `config_name` field of the `YamlConfig`.
        * Built in testnet YAML files have been updated.
    * Ignore the `--spec` value, if supplied, log a warning that it will be deprecated
    * `lcli` still has the spec flag, that's fine because it's dev tooling.
* Remove the `E: EthSpec` from `YamlConfig`
    * This means we need to deser the genesis `BeaconState` on-demand, but this is fine.
* Swap the old "minimal", "mainnet" strings over to the new `EthSpecId` enum.
* Always require a `CONFIG_NAME` field in `YamlConfig` (it used to have a default).

## Additional Info

Lots of breaking changes, do not merge! ~~We will likely need a Lighthouse v0.4.0 branch, and possibly a long-term v0.3.0 branch to keep Medalla alive~~.

Co-authored-by: Kirk Baird <baird.k@outlook.com>
Co-authored-by: Paul Hauner <paul@paulhauner.com>
2020-10-28 22:19:38 +00:00
divma
ad846ad280 Inform peers of requests that exceed the maximum rate limit + log downgrade (#1830)
## Issue Addressed

#1825 

## Proposed Changes

Since we penalize more blocks by range requests that have large steps, it is possible to get requests that will never be processed. We were not informing peers about this requests and also logging CRIT that is no longer relevant. Later we should check if more sophisticated handling for those requests is needed
2020-10-27 11:46:38 +00:00
Paul Hauner
92c8eba8ca Ensure eth1 deposit/chain IDs are used from YamlConfig (#1829)
## Issue Addressed

 NA

## Proposed Changes

Fixes a bug which causes the node to reject valid eth1 nodes.

- Fix core bug: failure to apply `YamlConfig` values to `ChainSpec`.
- Add a test to prevent regression in this specific case.
- Fix an invalid log message

## Additional Info

NA
2020-10-26 03:34:14 +00:00
Paul Hauner
f157d61cc7 Address clippy lints, panic in ssz_derive on overflow (#1714)
## Issue Addressed

NA

## Proposed Changes

- Panic or return error if we overflow `usize` in SSZ decoding/encoding derive macros.
  - I claim that the panics can only be triggered by a faulty type definition in lighthouse, they cannot be triggered externally on a validly defined struct.
- Use `Ordering` instead of some `if` statements, as demanded by clippy.
- Remove some old clippy `allow` that seem to no longer be required.
- Add comments to interesting clippy statements that we're going to continue to ignore.
- Create #1713

## Additional Info

NA
2020-10-25 23:27:39 +00:00
Paul Hauner
eba51f0973 Update testnet configs, change on-disk format (#1799)
## Issue Addressed

- Related to #1691

## Proposed Changes

- Add `DEPOSIT_CHAIN_ID` and `DEPOSIT_NETWORK_ID` to `config.yaml`.
    - Pass the `DEPOSIT_NETWORK_ID` to the `eth1::Service`.
- Remove the unused `MAX_EPOCHS_PER_CROSSLINK` from the `altona` and `medalla` configs (see [spec commit](2befe90032 (diff-efb845ac2ebd4aafbc23df40f47ce25699255064e99d36d0406d0a14ca7953ec))).
- Change from compressing the whole testnet directory, to only compressing the genesis state file. This is the only file we need to compress and *not* compressing the others makes them work nicely with git.
    - We can modify the boot nodes, configs, etc. without incurring an eternal binary-blob cost on our git history.
    - This change is backwards compatible (i.e., non-breaking).

## Additional Info

NA
2020-10-25 22:15:46 +00:00
Age Manning
7453f39d68 Prevent unbanning of disconnected peers (#1822)
## Issue Addressed

Further testing revealed another edge case where we attempt to unban a peer that can be in a disconnected start. Although this causes no real issue, it does log an error to the user. 

This PR adds a check to prevent this edge case and prevents the error being logged to the user.
2020-10-24 05:24:20 +00:00
Age Manning
a3cc1a1e0f Call unban only when necessary (#1821)
This PR prevents a user-facing error. 

It prevents optimistically unbanning a peer and instead checks the state of the peer before requesting the peers state to be unbanned.
2020-10-24 03:24:19 +00:00
blacktemplar
1644289a08 Updates the libp2p to the second newest commit => Allow only one topic per message (#1819)
As @AgeManning mentioned the newest libp2p version had some problems and got downgraded again on lighthouse master. This is an intermediate version that makes no problems and only adds a small change of allowing only one topic per message.
2020-10-24 01:05:37 +00:00
Age Manning
7870b81ade Downgrade libp2p (#1817)
## Description

This downgrades the recent libp2p upgrade. 

There were issues with the RPC which prevented syncing of the chain and this upgrade needs to be further investigated.
2020-10-23 09:33:59 +00:00
Paul Hauner
fa2daa7d6c Update readme, add banner (#1814)
## Issue Addressed

NA

## Proposed Changes

- Update progress timeline
- Remove the qualification that the eth2 spec is "emerging".
- Remove the terminal animation, replace with new banner.

## Additional Info

NA
2020-10-23 04:16:38 +00:00
Age Manning
55eee18ebb Version bump to 0.3.1 (#1813)
## Description

Bumps Lighthouse to version 0.3.1.
2020-10-23 04:16:36 +00:00
Age Manning
64c5899d25 Adds colour help to bn and vc subcommands (#1811)
Adds coloured help to the bn and vc subcommands
2020-10-23 04:16:34 +00:00
Age Manning
2c7f362908 Discovery v5.1 (#1786)
## Overview 

This updates lighthouse to discovery v5.1

Note: This makes lighthouse's discovery not compatible with any previous version. Lighthouse cannot discover peers or send/receive ENR's from any previous version. This is a breaking change. 

This resolves #1605
2020-10-23 04:16:33 +00:00
Age Manning
ae96dab5d2 Increase UPnP logging and decrease batch sizes (#1812)
## Description

This increases the logging of the underlying UPnP tasks to inform the user of UPnP error/success. 

This also decreases the batch syncing size to two epochs per batch.
2020-10-23 03:01:33 +00:00
Age Manning
c49dd94e20 Update to latest libp2p (#1810)
## Description

Updates to the latest libp2p and includes gossipsub updates. 

Of particular note is the limitation of a single topic per gossipsub message.

Co-authored-by: blacktemplar <blacktemplar@a1.net>
2020-10-23 03:01:31 +00:00
Michael Sproul
acd49d988d Implement database temp states to reduce memory usage (#1798)
## Issue Addressed

Closes #800
Closes #1713

## Proposed Changes

Implement the temporary state storage algorithm described in #800. Specifically:

* Add `DBColumn::BeaconStateTemporary`, for storing 0-length temporary marker values.
* Store intermediate states immediately as they are created, marked temporary. Delete the temporary flag if the block is processed successfully.
* Add a garbage collection process to delete leftover temporary states on start-up.
* Bump the database schema version to 2 so that a DB with temporary states can't accidentally be used with older versions of the software. The auto-migration is a no-op, but puts in place some infra that we can use for future migrations (e.g. #1784)

## Additional Info

There are two known race conditions, one potentially causing permanent faults (hopefully rare), and the other insignificant.

### Race 1: Permanent state marked temporary

EDIT: this has been fixed by the addition of a lock around the relevant critical section

There are 2 threads that are trying to store 2 different blocks that share some intermediate states (e.g. they both skip some slots from the current head). Consider this sequence of events:

1. Thread 1 checks if state `s` already exists, and seeing that it doesn't, prepares an atomic commit of `(s, s_temporary_flag)`.
2. Thread 2 does the same, but also gets as far as committing the state txn, finishing the processing of its block, and _deleting_ the temporary flag.
3. Thread 1 is (finally) scheduled again, and marks `s` as temporary with its transaction.
4.
    a) The process is killed, or thread 1's block fails verification and the temp flag is not deleted. This is a permanent failure! Any attempt to load state `s` will fail... hope it isn't on the main chain! Alternatively (4b) happens...
    b) Thread 1 finishes, and re-deletes the temporary flag. In this case the failure is transient, state `s` will disappear temporarily, but will come back once thread 1 finishes running.

I _hope_ that steps 1-3 only happen very rarely, and 4a even more rarely. It's hard to know

This once again begs the question of why we're using LevelDB (#483), when it clearly doesn't care about atomicity! A ham-fisted fix would be to wrap the hot and cold DBs in locks, which would bring us closer to how other DBs handle read-write transactions. E.g. [LMDB only allows one R/W transaction at a time](https://docs.rs/lmdb/0.8.0/lmdb/struct.Environment.html#method.begin_rw_txn).

### Race 2: Temporary state returned from `get_state`

I don't think this race really matters, but in `load_hot_state`, if another thread stores a state between when we call `load_state_temporary_flag` and when we call `load_hot_state_summary`, then we could end up returning that state even though it's only a temporary state. I can't think of any case where this would be relevant, and I suspect if it did come up, it would be safe/recoverable (having data is safer than _not_ having data).

This could be fixed by using a LevelDB read snapshot, but that would require substantial changes to how we read all our values, so I don't think it's worth it right now.
2020-10-23 01:27:51 +00:00
Age Manning
66f0cf4430 Improve peer handling (#1796)
## Issue Addressed

Potentially resolves #1647 and sync stalls. 

## Proposed Changes

The handling of the state of banned peers was inadequate for the complex peerdb data structure. We store a limited number of disconnected and banned peers in the db. We were not tracking intermediate "disconnecting" states and the in some circumstances we were updating the peer state without informing the peerdb. This lead to a number of inconsistencies in the peer state. 

Further, the peer manager could ban a peer changing a peer's state from being connected to banned. In this circumstance, if the peer then disconnected, we didn't inform the application layer, which lead to applications like sync not being informed of a peers disconnection. This could lead to sync stalling and having to require a lighthouse restart. 

Improved handling for peer states and interactions with the peerdb is made in this PR.
2020-10-23 01:27:48 +00:00
Jim McDonald
4298efeb23 Update testnet scripts (#1807)
## Proposed Changes

A couple of minor fixes to the testnet scripts.

First, `clean.sh` only attempts to remove the directory if it exists.  This ensures a good exit code even if the directory is not present.

Second, `setup.sh` uses an updated deposit contract address to match that in the generated spec to allow the chain to start.
2020-10-23 00:18:05 +00:00
Paul Hauner
542f755ac5 Remove eth1 deposit functionality (#1780)
## Issue Addressed

- Resolves #1727

## Proposed Changes

Remove the `lighthouse account validator deposit` command.

It's a shame to let this go, but it's currently lacking any tests and contains significant, un-handled edge-cases (e.g., it will wait forever until the eth1 node gives a tx confirmation and if you ctrl+c it before it finishes it will leave the filesystem in an unknown state with lockfiles lying around)

I don't think we need to make deposit functionality a priority before mainnet, we have bigger fish to fry IMO.

We, will need to revive this functionality before the next testnet, but I think we should make private, non-production tools to handle this for SigP internally.

## Additional Info

Be sure to re-open #1331 if this PR is abandoned.
2020-10-22 07:19:30 +00:00
Paul Hauner
b829257cca Ssz state (#1749)
## Issue Addressed

NA

## Proposed Changes

Adds a `lighthouse/beacon/states/:state_id/ssz` endpoint to allow us to pull the genesis state from the API.

## Additional Info

NA
2020-10-22 06:05:49 +00:00
Michael Sproul
7f73dccebc Refine op pool pruning (#1805)
## Issue Addressed

Closes #1769
Closes #1708

## Proposed Changes

Tweaks the op pool pruning so that the attestation pool is pruned against the wall-clock epoch instead of the finalized state's epoch. This should reduce the unbounded growth that we've seen during periods without finality.

Also fixes up the voluntary exit pruning as raised in #1708.
2020-10-22 04:47:29 +00:00
Paul Hauner
a3704b971e Support pre-flight CORS check (#1772)
## Issue Addressed

- Resolves #1766 

## Proposed Changes

- Use the `warp::filters::cors` filter instead of our work-around.

## Additional Info

It's not trivial to enable/disable `cors` using `warp`, since using `routes.with(cors)` changes the type of `routes`.  This makes it difficult to apply/not apply cors at runtime. My solution has been to *always* use the `warp::filters::cors` wrapper but when cors should be disabled, just pass the HTTP server listen address as the only permissible origin.
2020-10-22 04:47:27 +00:00
realbigsean
a3552a4b70 Node endpoints (#1778)
## Issue Addressed

`node` endpoints in #1434

## Proposed Changes

Implement these:
```
 /eth/v1/node/health
 /eth/v1/node/peers/{peer_id}
 /eth/v1/node/peers
```
- Add an `Option<Enr>` to `PeerInfo`
- Finish implementation of `/eth/v1/node/identity`

## Additional Info
- should update the `peers` endpoints when #1764 is resolved



Co-authored-by: realbigsean <seananderson33@gmail.com>
2020-10-22 02:59:42 +00:00
Daniel Schonfeld
8f86baa48d Optimize attester slashing (#1745)
## Issue Addressed

Closes #1548 

## Proposed Changes

Optimizes attester slashing choice by choosing the ones that cover the most amount of validators slashed, with the highest effective balances 

## Additional Info

Initial pass, need to write a test for it
2020-10-22 01:43:54 +00:00
divma
668513b67e Sync state adjustments (#1804)
check for advanced peers and the state of the chain wrt the clock slot to decide if a chain is or not synced /transitioning to a head sync. Also a fix that prevented getting the right state while syncing heads
2020-10-22 00:26:06 +00:00
Paul Hauner
e1eec7828b Fix error in VC API docs (#1800)
## Issue Addressed

NA

## Proposed Changes

- Ensure the `description` field is included with the output (as per the implementation).

## Additional Info

NA
2020-10-22 00:26:04 +00:00
realbigsean
628891df1d fix genesis state root provided to HTTP server (#1783)
## Issue Addressed

Resolves #1776

## Proposed Changes

The beacon chain builder was using the canonical head's state root for the `genesis_state_root` field.

## Additional Info
2020-10-21 23:15:30 +00:00
realbigsean
fdb9744759 use head slot instead of the target slot for the not_while_syncing fi… (#1802)
## Issue Addressed

Resolves #1792

## Proposed Changes

Use `chain.best_slot()` instead of the sync state's target slot in the `not_while_syncing_filter`

## Additional Info

N/A
2020-10-21 22:02:25 +00:00
Paul Hauner
02d94a70b7 Allow VC to start without any validators (#1779)
## Issue Addressed

NA

## Proposed Changes

- Don't exit early if the VC is without any validators.
- When there are no validators, always create the slashing database (even without `--init-slashing-protection`).
2020-10-21 04:29:24 +00:00
divma
2acf75785c More sync updates (#1791)
## Issue Addressed
#1614 and a couple of sync-stalling problems, the most important is a cyclic dependency between the sync manager and the peer manager
2020-10-20 22:34:18 +00:00
Michael Sproul
703c33bdc7 Fix head tracker concurrency bugs (#1771)
## Issue Addressed

Closes #1557

## Proposed Changes

Modify the pruning algorithm so that it mutates the head-tracker _before_ committing the database transaction to disk, and _only if_ all the heads to be removed are still present in the head-tracker (i.e. no concurrent mutations).

In the process of writing and testing this I also had to make a few other changes:

* Use internal mutability for all `BeaconChainHarness` functions (namely the RNG and the graffiti), in order to enable parallel calls (see testing section below).
* Disable logging in harness tests unless the `test_logger` feature is turned on

And chose to make some clean-ups:

* Delete the `NullMigrator`
* Remove type-based configuration for the migrator in favour of runtime config (simpler, less duplicated code)
* Use the non-blocking migrator unless the blocking migrator is required. In the store tests we need the blocking migrator because some tests make asserts about the state of the DB after the migration has run.
* Rename `validators_keypairs` -> `validator_keypairs` in the `BeaconChainHarness`

## Testing

To confirm that the fix worked, I wrote a test using [Hiatus](https://crates.io/crates/hiatus), which can be found here:

https://github.com/michaelsproul/lighthouse/tree/hiatus-issue-1557

That test can't be merged because it inserts random breakpoints everywhere, but if you check out that branch you can run the test with:

```
$ cd beacon_node/beacon_chain
$ cargo test --release --test parallel_tests --features test_logger
```

It should pass, and the log output should show:

```
WARN Pruning deferred because of a concurrent mutation, message: this is expected only very rarely!
```

## Additional Info

This is a backwards-compatible change with no impact on consensus.
2020-10-19 05:58:39 +00:00
blacktemplar
6ba997b88e add direction information to PeerInfo (#1768)
## Issue Addressed

NA

## Proposed Changes

Adds a direction field to `PeerConnectionStatus` that can be accessed by calling `is_outgoing` which will return `true` iff the peer is connected and the first connection was an outgoing one.
2020-10-16 05:24:21 +00:00
Herman Junge
d7b9d0dd9f Implement matches! macro (#1777)
Fix #1775
2020-10-15 21:42:43 +00:00