lighthouse/beacon_node/eth2-libp2p/src/discovery.rs
Age Manning 95c8e476bc
Initial work towards v0.2.0 (#924)
* Remove ping protocol

* Initial renaming of network services

* Correct rebasing relative to latest master

* Start updating types

* Adds HashMapDelay struct to utils

* Initial network restructure

* Network restructure. Adds new types for v0.2.0

* Removes build artefacts

* Shift validation to beacon chain

* Temporarily remove gossip validation

This is to be updated to match current optimisation efforts.

* Adds AggregateAndProof

* Begin rebuilding pubsub encoding/decoding

* Signature hacking

* Shift gossipsup decoding into eth2_libp2p

* Existing EF tests passing with fake_crypto

* Shifts block encoding/decoding into RPC

* Delete outdated API spec

* All release tests passing bar genesis state parsing

* Update and test YamlConfig

* Update to spec v0.10 compatible BLS

* Updates to BLS EF tests

* Add EF test for AggregateVerify

And delete unused hash2curve tests for uncompressed points

* Update EF tests to v0.10.1

* Use optional block root correctly in block proc

* Use genesis fork in deposit domain. All tests pass

* Fast aggregate verify test

* Update REST API docs

* Fix unused import

* Bump spec tags to v0.10.1

* Add `seconds_per_eth1_block` to chainspec

* Update to timestamp based eth1 voting scheme

* Return None from `get_votes_to_consider` if block cache is empty

* Handle overflows in `is_candidate_block`

* Revert to failing tests

* Fix eth1 data sets test

* Choose default vote according to spec

* Fix collect_valid_votes tests

* Fix `get_votes_to_consider` to choose all eligible blocks

* Uncomment winning_vote tests

* Add comments; remove unused code

* Reduce seconds_per_eth1_block for simulation

* Addressed review comments

* Add test for default vote case

* Fix logs

* Remove unused functions

* Meter default eth1 votes

* Fix comments

* Progress on attestation service

* Address review comments; remove unused dependency

* Initial work on removing libp2p lock

* Add LRU caches to store (rollup)

* Update attestation validation for DB changes (WIP)

* Initial version of should_forward_block

* Scaffold

* Progress on attestation validation

Also, consolidate prod+testing slot clocks so that they share much
of the same implementation and can both handle sub-slot time changes.

* Removes lock from libp2p service

* Completed network lock removal

* Finish(?) attestation processing

* Correct network termination future

* Add slot check to block check

* Correct fmt issues

* Remove Drop implementation for network service

* Add first attempt at attestation proc. re-write

* Add version 2 of attestation processing

* Minor fixes

* Add validator pubkey cache

* Make get_indexed_attestation take a committee

* Link signature processing into new attn verification

* First working version

* Ensure pubkey cache is updated

* Add more metrics, slight optimizations

* Clone committee cache during attestation processing

* Update shuffling cache during block processing

* Remove old commented-out code

* Fix shuffling cache insert bug

* Used indexed attestation in fork choice

* Restructure attn processing, add metrics

* Add more detailed metrics

* Tidy, fix failing tests

* Fix failing tests, tidy

* Address reviewers suggestions

* Disable/delete two outdated tests

* Modification of validator for subscriptions

* Add slot signing to validator client

* Further progress on validation subscription

* Adds necessary validator subscription functionality

* Add new Pubkeys struct to signature_sets

* Refactor with functional approach

* Update beacon chain

* Clean up validator <-> beacon node http types

* Add aggregator status to ValidatorDuty

* Impl Clone for manual slot clock

* Fix minor errors

* Further progress validator client subscription

* Initial subscription and aggregation handling

* Remove decompressed member from pubkey bytes

* Progress to modifying val client for attestation aggregation

* First draft of validator client upgrade for aggregate attestations

* Add hashmap for indices lookup

* Add state cache, remove store cache

* Only build the head committee cache

* Removes lock on a network channel

* Partially implement beacon node subscription http api

* Correct compilation issues

* Change `get_attesting_indices` to use Vec

* Fix failing test

* Partial implementation of timer

* Adds timer, removes exit_future, http api to op pool

* Partial multiple aggregate attestation handling

* Permits bulk messages accross gossipsub network channel

* Correct compile issues

* Improve gosispsub messaging and correct rest api helpers

* Added global gossipsub subscriptions

* Update validator subscriptions data structs

* Tidy

* Re-structure validator subscriptions

* Initial handling of subscriptions

* Re-structure network service

* Add pubkey cache persistence file

* Add more comments

* Integrate persistence file into builder

* Add pubkey cache tests

* Add HashSetDelay and introduce into attestation service

* Handles validator subscriptions

* Add data_dir to beacon chain builder

* Remove Option in pubkey cache persistence file

* Ensure consistency between datadir/data_dir

* Fix failing network test

* Peer subnet discovery gets queued for future subscriptions

* Reorganise attestation service functions

* Initial wiring of attestation service

* First draft of attestation service timing logic

* Correct minor typos

* Tidy

* Fix todos

* Improve tests

* Add PeerInfo to connected peers mapping

* Fix compile error

* Fix compile error from merge

* Split up block processing metrics

* Tidy

* Refactor get_pubkey_from_state

* Remove commented-out code

* Rename state_cache -> checkpoint_cache

* Rename Checkpoint -> Snapshot

* Tidy, add comments

* Tidy up find_head function

* Change some checkpoint -> snapshot

* Add tests

* Expose max_len

* Remove dead code

* Tidy

* Fix bug

* Add sync-speed metric

* Add first attempt at VerifiableBlock

* Start integrating into beacon chain

* Integrate VerifiableBlock

* Rename VerifableBlock -> PartialBlockVerification

* Add start of typed methods

* Add progress

* Add further progress

* Rename structs

* Add full block verification to block_processing.rs

* Further beacon chain integration

* Update checks for gossip

* Add todo

* Start adding segement verification

* Add passing chain segement test

* Initial integration with batch sync

* Minor changes

* Tidy, add more error checking

* Start adding chain_segment tests

* Finish invalid signature tests

* Include single and gossip verified blocks in tests

* Add gossip verification tests

* Start adding docs

* Finish adding comments to block_processing.rs

* Rename block_processing.rs -> block_verification

* Start removing old block processing code

* Fixes beacon_chain compilation

* Fix project-wide compile errors

* Remove old code

* Correct code to pass all tests

* Fix bug with beacon proposer index

* Fix shim for BlockProcessingError

* Only process one epoch at a time

* Fix loop in chain segment processing

* Correct tests from master merge

* Add caching for state.eth1_data_votes

* Add BeaconChain::validator_pubkey

* Revert "Add caching for state.eth1_data_votes"

This reverts commit cd73dcd6434fb8d8e6bf30c5356355598ea7b78e.

Co-authored-by: Grant Wuerker <gwuerker@gmail.com>
Co-authored-by: Michael Sproul <michael@sigmaprime.io>
Co-authored-by: Michael Sproul <micsproul@gmail.com>
Co-authored-by: pawan <pawandhananjay@gmail.com>
Co-authored-by: Paul Hauner <paul@paulhauner.com>
2020-03-17 17:24:44 +11:00

395 lines
15 KiB
Rust

use crate::metrics;
use crate::{error, NetworkConfig, NetworkGlobals, PeerInfo};
/// This manages the discovery and management of peers.
///
/// Currently using discv5 for peer discovery.
///
use futures::prelude::*;
use libp2p::core::{identity::Keypair, ConnectedPoint, Multiaddr, PeerId};
use libp2p::discv5::{Discv5, Discv5Event};
use libp2p::enr::{Enr, EnrBuilder, NodeId};
use libp2p::multiaddr::Protocol;
use libp2p::swarm::{NetworkBehaviour, NetworkBehaviourAction, PollParameters, ProtocolsHandler};
use slog::{debug, info, warn};
use std::collections::HashSet;
use std::fs::File;
use std::io::prelude::*;
use std::path::Path;
use std::str::FromStr;
use std::sync::Arc;
use std::time::{Duration, Instant};
use tokio::io::{AsyncRead, AsyncWrite};
use tokio::timer::Delay;
use types::EthSpec;
/// Maximum seconds before searching for extra peers.
const MAX_TIME_BETWEEN_PEER_SEARCHES: u64 = 120;
/// Initial delay between peer searches.
const INITIAL_SEARCH_DELAY: u64 = 5;
/// Local ENR storage filename.
const ENR_FILENAME: &str = "enr.dat";
/// Lighthouse discovery behaviour. This provides peer management and discovery using the Discv5
/// libp2p protocol.
pub struct Discovery<TSubstream, TSpec: EthSpec> {
/// The currently banned peers.
banned_peers: HashSet<PeerId>,
/// The target number of connected peers on the libp2p interface.
max_peers: usize,
/// The directory where the ENR is stored.
enr_dir: String,
/// The delay between peer discovery searches.
peer_discovery_delay: Delay,
/// Tracks the last discovery delay. The delay is doubled each round until the max
/// time is reached.
past_discovery_delay: u64,
/// The TCP port for libp2p. Used to convert an updated IP address to a multiaddr. Note: This
/// assumes that the external TCP port is the same as the internal TCP port if behind a NAT.
//TODO: Improve NAT handling limit the above restriction
tcp_port: u16,
/// The discovery behaviour used to discover new peers.
discovery: Discv5<TSubstream>,
/// A collection of network constants that can be read from other threads.
network_globals: Arc<NetworkGlobals<TSpec>>,
/// Logger for the discovery behaviour.
log: slog::Logger,
}
impl<TSubstream, TSpec: EthSpec> Discovery<TSubstream, TSpec> {
pub fn new(
local_key: &Keypair,
config: &NetworkConfig,
network_globals: Arc<NetworkGlobals<TSpec>>,
log: &slog::Logger,
) -> error::Result<Self> {
let log = log.clone();
// checks if current ENR matches that found on disk
let local_enr = load_enr(local_key, config, &log)?;
*network_globals.local_enr.write() = Some(local_enr.clone());
let enr_dir = match config.network_dir.to_str() {
Some(path) => String::from(path),
None => String::from(""),
};
info!(log, "ENR Initialised"; "enr" => local_enr.to_base64(), "seq" => local_enr.seq(), "id"=> format!("{}",local_enr.node_id()), "ip" => format!("{:?}", local_enr.ip()), "udp"=> local_enr.udp().unwrap_or_else(|| 0), "tcp" => local_enr.tcp().unwrap_or_else(|| 0));
// the last parameter enables IP limiting. 2 Nodes on the same /24 subnet per bucket and 10
// nodes on the same /24 subnet per table.
// TODO: IP filtering is currently disabled for the DHT. Enable for production
let mut discovery = Discv5::new(local_enr, local_key.clone(), config.listen_address, false)
.map_err(|e| format!("Discv5 service failed. Error: {:?}", e))?;
// Add bootnodes to routing table
for bootnode_enr in config.boot_nodes.clone() {
debug!(
log,
"Adding node to routing table";
"node_id" => format!("{}",
bootnode_enr.node_id())
);
discovery.add_enr(bootnode_enr);
}
Ok(Self {
banned_peers: HashSet::new(),
max_peers: config.max_peers,
peer_discovery_delay: Delay::new(Instant::now()),
past_discovery_delay: INITIAL_SEARCH_DELAY,
tcp_port: config.libp2p_port,
discovery,
network_globals,
log,
enr_dir,
})
}
/// Return the nodes local ENR.
pub fn local_enr(&self) -> &Enr {
self.discovery.local_enr()
}
/// Manually search for peers. This restarts the discovery round, sparking multiple rapid
/// queries.
pub fn discover_peers(&mut self) {
self.past_discovery_delay = INITIAL_SEARCH_DELAY;
self.find_peers();
}
/// Add an ENR to the routing table of the discovery mechanism.
pub fn add_enr(&mut self, enr: Enr) {
self.discovery.add_enr(enr);
}
/// The peer has been banned. Add this peer to the banned list to prevent any future
/// re-connections.
// TODO: Remove the peer from the DHT if present
pub fn peer_banned(&mut self, peer_id: PeerId) {
self.banned_peers.insert(peer_id);
}
pub fn peer_unbanned(&mut self, peer_id: &PeerId) {
self.banned_peers.remove(peer_id);
}
/// Returns an iterator over all enr entries in the DHT.
pub fn enr_entries(&mut self) -> impl Iterator<Item = &Enr> {
self.discovery.enr_entries()
}
/// Search for new peers using the underlying discovery mechanism.
fn find_peers(&mut self) {
// pick a random NodeId
let random_node = NodeId::random();
debug!(self.log, "Searching for peers");
self.discovery.find_node(random_node);
}
}
// Redirect all behaviour events to underlying discovery behaviour.
impl<TSubstream, TSpec: EthSpec> NetworkBehaviour for Discovery<TSubstream, TSpec>
where
TSubstream: AsyncRead + AsyncWrite,
{
type ProtocolsHandler = <Discv5<TSubstream> as NetworkBehaviour>::ProtocolsHandler;
type OutEvent = <Discv5<TSubstream> as NetworkBehaviour>::OutEvent;
fn new_handler(&mut self) -> Self::ProtocolsHandler {
NetworkBehaviour::new_handler(&mut self.discovery)
}
fn addresses_of_peer(&mut self, peer_id: &PeerId) -> Vec<Multiaddr> {
// Let discovery track possible known peers.
self.discovery.addresses_of_peer(peer_id)
}
fn inject_connected(&mut self, peer_id: PeerId, _endpoint: ConnectedPoint) {
// TODO: Search for a known ENR once discv5 is updated.
self.network_globals
.connected_peer_set
.write()
.insert(peer_id, PeerInfo::new());
// TODO: Drop peers if over max_peer limit
metrics::inc_counter(&metrics::PEER_CONNECT_EVENT_COUNT);
metrics::set_gauge(
&metrics::PEERS_CONNECTED,
self.network_globals.connected_peers() as i64,
);
}
fn inject_disconnected(&mut self, peer_id: &PeerId, _endpoint: ConnectedPoint) {
self.network_globals
.connected_peer_set
.write()
.remove(peer_id);
metrics::inc_counter(&metrics::PEER_DISCONNECT_EVENT_COUNT);
metrics::set_gauge(
&metrics::PEERS_CONNECTED,
self.network_globals.connected_peers() as i64,
);
}
fn inject_replaced(
&mut self,
_peer_id: PeerId,
_closed: ConnectedPoint,
_opened: ConnectedPoint,
) {
// discv5 doesn't implement
}
fn inject_node_event(
&mut self,
_peer_id: PeerId,
_event: <Self::ProtocolsHandler as ProtocolsHandler>::OutEvent,
) {
// discv5 doesn't implement
}
fn poll(
&mut self,
params: &mut impl PollParameters,
) -> Async<
NetworkBehaviourAction<
<Self::ProtocolsHandler as ProtocolsHandler>::InEvent,
Self::OutEvent,
>,
> {
// search for peers if it is time
loop {
match self.peer_discovery_delay.poll() {
Ok(Async::Ready(_)) => {
if self.network_globals.connected_peers() < self.max_peers {
self.find_peers();
}
// Set to maximum, and update to earlier, once we get our results back.
self.peer_discovery_delay.reset(
Instant::now() + Duration::from_secs(MAX_TIME_BETWEEN_PEER_SEARCHES),
);
}
Ok(Async::NotReady) => break,
Err(e) => {
warn!(self.log, "Discovery peer search failed"; "error" => format!("{:?}", e));
}
}
}
// Poll discovery
loop {
match self.discovery.poll(params) {
Async::Ready(NetworkBehaviourAction::GenerateEvent(event)) => {
match event {
Discv5Event::Discovered(_enr) => {
// not concerned about FINDNODE results, rather the result of an entire
// query.
}
Discv5Event::SocketUpdated(socket) => {
info!(self.log, "Address updated"; "ip" => format!("{}",socket.ip()), "udp_port" => format!("{}", socket.port()));
metrics::inc_counter(&metrics::ADDRESS_UPDATE_COUNT);
let mut address = Multiaddr::from(socket.ip());
address.push(Protocol::Tcp(self.tcp_port));
let enr = self.discovery.local_enr();
save_enr_to_disc(Path::new(&self.enr_dir), enr, &self.log);
return Async::Ready(NetworkBehaviourAction::ReportObservedAddr {
address,
});
}
Discv5Event::FindNodeResult { closer_peers, .. } => {
debug!(self.log, "Discovery query completed"; "peers_found" => closer_peers.len());
// update the time to the next query
if self.past_discovery_delay < MAX_TIME_BETWEEN_PEER_SEARCHES {
self.past_discovery_delay *= 2;
}
let delay = std::cmp::max(
self.past_discovery_delay,
MAX_TIME_BETWEEN_PEER_SEARCHES,
);
self.peer_discovery_delay
.reset(Instant::now() + Duration::from_secs(delay));
if closer_peers.is_empty() {
debug!(self.log, "Discovery random query found no peers");
}
for peer_id in closer_peers {
// if we need more peers, attempt a connection
if self.network_globals.connected_peers() < self.max_peers
&& self
.network_globals
.connected_peer_set
.read()
.get(&peer_id)
.is_none()
&& !self.banned_peers.contains(&peer_id)
{
debug!(self.log, "Peer discovered"; "peer_id"=> format!("{:?}", peer_id));
return Async::Ready(NetworkBehaviourAction::DialPeer {
peer_id,
});
}
}
}
_ => {}
}
}
// discv5 does not output any other NetworkBehaviourAction
Async::Ready(_) => {}
Async::NotReady => break,
}
}
Async::NotReady
}
}
/// Loads an ENR from file if it exists and matches the current NodeId and sequence number. If none
/// exists, generates a new one.
///
/// If an ENR exists, with the same NodeId and IP address, we use the disk-generated one as its
/// ENR sequence will be equal or higher than a newly generated one.
fn load_enr(
local_key: &Keypair,
config: &NetworkConfig,
log: &slog::Logger,
) -> Result<Enr, String> {
// Build the local ENR.
// Note: Discovery should update the ENR record's IP to the external IP as seen by the
// majority of our peers.
let mut local_enr = EnrBuilder::new("v4")
.ip(config
.discovery_address
.unwrap_or_else(|| "127.0.0.1".parse().expect("valid ip")))
.tcp(config.libp2p_port)
.udp(config.discovery_port)
.build(&local_key)
.map_err(|e| format!("Could not build Local ENR: {:?}", e))?;
let enr_f = config.network_dir.join(ENR_FILENAME);
if let Ok(mut enr_file) = File::open(enr_f.clone()) {
let mut enr_string = String::new();
match enr_file.read_to_string(&mut enr_string) {
Err(_) => debug!(log, "Could not read ENR from file"),
Ok(_) => {
match Enr::from_str(&enr_string) {
Ok(enr) => {
if enr.node_id() == local_enr.node_id() {
if (config.discovery_address.is_none()
|| enr.ip().map(Into::into) == config.discovery_address)
&& enr.tcp() == Some(config.libp2p_port)
&& enr.udp() == Some(config.discovery_port)
{
debug!(log, "ENR loaded from file"; "file" => format!("{:?}", enr_f));
// the stored ENR has the same configuration, use it
return Ok(enr);
}
// same node id, different configuration - update the sequence number
let new_seq_no = enr.seq().checked_add(1).ok_or_else(|| "ENR sequence number on file is too large. Remove it to generate a new NodeId")?;
local_enr.set_seq(new_seq_no, local_key).map_err(|e| {
format!("Could not update ENR sequence number: {:?}", e)
})?;
debug!(log, "ENR sequence number increased"; "seq" => new_seq_no);
}
}
Err(e) => {
warn!(log, "ENR from file could not be decoded"; "error" => format!("{:?}", e));
}
}
}
}
}
save_enr_to_disc(&config.network_dir, &local_enr, log);
Ok(local_enr)
}
fn save_enr_to_disc(dir: &Path, enr: &Enr, log: &slog::Logger) {
let _ = std::fs::create_dir_all(dir);
match File::create(dir.join(Path::new(ENR_FILENAME)))
.and_then(|mut f| f.write_all(&enr.to_base64().as_bytes()))
{
Ok(_) => {
debug!(log, "ENR written to disk");
}
Err(e) => {
warn!(
log,
"Could not write ENR to file"; "file" => format!("{:?}{:?}",dir, ENR_FILENAME), "error" => format!("{}", e)
);
}
}
}