This commit affects p2p/discv5 "topic discovery" by running it on
the same UDP port where the old discovery works. This is realized
by giving an "unhandled" packet channel to the old v4 discovery
packet handler where all invalid packets are sent. These packets
are then processed by v5. v5 packets are always invalid when
interpreted by v4 and vice versa. This is ensured by adding one
to the first byte of the packet hash in v5 packets.
DiscoveryV5Bootnodes is also changed to point to new bootnodes
that are implementing the changed packet format with modified
hash. Existing and new v5 bootnodes are both running on different
ports ATM.
The p2p packages can now be configured to restrict all communication to
a certain subset of IP networks. This feature is meant to be used for
private networks.
The discovery DHT contains a number of hosts with LAN and loopback IPs.
These get relayed because some implementations do not perform any checks
on the IP.
go-ethereum already prevented relay in most cases because it verifies
that the host actually exists before adding it to the local table. But
this verification causes other issues. We have received several reports
where people's VPSs got shut down by hosting providers because sending
packets to random LAN hosts is indistinguishable from a slow port scan.
The new check prevents sending random packets to LAN by discarding LAN
IPs sent by Internet hosts (and loopback IPs from LAN and Internet
hosts). The new check also blacklists almost all currently registered
special-purpose networks assigned by IANA to avoid inciting random
responses from services in the LAN.
As another precaution against abuse of the DHT, ports below 1024 are now
considered invalid.
On Windows, UDPConn.ReadFrom returns an error for packets larger
than the receive buffer. The error is not marked temporary, causing
our loop to exit when the first oversized packet arrived. The fix
is to treat this particular error as temporary.
Fixes: #1579, #2087
Updates: #2082
If the timeout fired (even just nanoseconds) before the deadline of the
next pending reply, the timer was not rescheduled. The timer would've
been rescheduled anyway once the next packet was sent, but there were
cases where no next packet could ever be sent due to the locking issue
fixed in the previous commit.
As timing-related bugs go, this issue had been present for a long time
and I could never reproduce it. The test added in this commit did
reproduce the issue on about one out of 15 runs.
Table.mutex was being held while waiting for a reply packet, which
effectively made many parts of the whole stack block on that packet,
including the net_peerCount RPC call.
The previous metric was pubkey1^pubkey2, as specified in the Kademlia
paper. We missed that EC public keys are not uniformly distributed.
Using the hash of the public keys addresses that. It also makes it
a bit harder to generate node IDs that are close to a particular node.
This commit changes the discovery protocol to use the new "v4" endpoint
format, which allows for separate UDP and TCP ports and makes it
possible to discover the UDP address after NAT.
This a fix for an attack vector where the discovery protocol could be
used to amplify traffic in a DDOS attack. A malicious actor would send a
findnode request with the IP address and UDP port of the target as the
source address. The recipient of the findnode packet would then send a
neighbors packet (which is 16x the size of findnode) to the victim.
Our solution is to require a 'bond' with the sender of findnode. If no
bond exists, the findnode packet is not processed. A bond between nodes
α and β is created when α replies to a ping from β.
This (initial) version of the bonding implementation might still be
vulnerable against replay attacks during the expiration time window.
We will add stricter source address validation later.