* p2p: add DialRatio for configuration of inbound vs. dialed connections
* p2p: add connection flags to PeerInfo
* p2p/netutil: add SameNet, DistinctNetSet
* p2p/discover: improve revalidation and seeding
This changes node revalidation to be periodic instead of on-demand. This
should prevent issues where dead nodes get stuck in closer buckets
because no other node will ever come along to replace them.
Every 5 seconds (on average), the last node in a random bucket is
checked and moved to the front of the bucket if it is still responding.
If revalidation fails, the last node is replaced by an entry of the
'replacement list' containing recently-seen nodes.
Most close buckets are removed because it's very unlikely we'll ever
encounter a node that would fall into any of those buckets.
Table seeding is also improved: we now require a few minutes of table
membership before considering a node as a potential seed node. This
should make it less likely to store short-lived nodes as potential
seeds.
* p2p/discover: fix nits in UDP transport
We would skip sending neighbors replies if there were fewer than
maxNeighbors results and CheckRelayIP returned an error for the last
one. While here, also resolve a TODO about pong reply tokens.
p2p/simulations: introduce dialBan
- Refactor simulations/network connection getters to support
avoiding simultaneous dials between two peers If two peers dial
simultaneously, the connection will be dropped to help avoid
that, we essentially lock the connection object with a
timestamp which serves as a ban on dialing for a period of time
(dialBanTimeout).
- The connection getter InitConn can be wrapped and passed to the
nodes via adapters.NodeConfig#Reachable field and then used by
the respective services when they initiate connections. This
massively stablise the emerging connectivity when running with
hundreds of nodes bootstrapping a network.
p2p: add Inbound public method to p2p.Peer
p2p/simulations: Add server id to logs to support debugging
in-memory network simulations when multiple peers are logging.
p2p: SetupConn now returns error. The dialer checks the error and
only calls resolve if the actual TCP dial fails.
This commit introduces a network simulation framework which
can be used to run simulated networks of devp2p nodes. The
intention is to use this for testing protocols, performing
benchmarks and visualising emergent network behaviour.
Using a Timer over Ticker seems to be a lot better, though I cannot fully
account for why that it behaves so (since Ticker should be more bursty, but not
necessarily more active over time, but that may depend on how long window it
uses to decide on when to tick next)
As of this commit, we no longer rely on the protocol handler to report
write errors in a timely fashion. When a write fails, shutdown is
initiated immediately and no new writes can start. This will also
prevent new writes from starting after Server.Stop has been called.
The most visible change is event-based dialing, which should be an
improvement over the timer-based system that we have at the moment.
The dialer gets a chance to compute new tasks whenever peers change or
dials complete. This is better than checking peers on a timer because
dials happen faster. The dialer can now make more precise decisions
about whom to dial based on the peer set and we can test those
decisions without actually opening any sockets.
Peer management is easier to test because the tests can inject
connections at checkpoints (after enc handshake, after protocol
handshake).
Most of the handshake stuff is now part of the RLPx code. It could be
exported or move to its own package because it is no longer entangled
with Server logic.
The previous limit was 10MB which is unacceptable for all kinds
of reasons, the most important one being that we don't want to
allow the remote side to make us allocate 10MB at handshake time.
The returned reason is currently not used except for the log
message. This change makes the log messages a bit more useful.
The handshake code also returns the remote reason.
Peer.readLoop will only terminate if the connection is closed. Fix the
hang by closing the connection before waiting for readLoop to terminate.
This also removes the british disconnect procedure where we're waiting
for the remote end to close the connection. I have confirmed with
@subtly that cpp-ethereum doesn't adhere to it either.
There were multiple synchronization issues in the disconnect handling,
all caused by the odd special-casing of Peer.readLoop errors. Remove the
special handling of read errors and make readLoop part of the Peer
WaitGroup.
Thanks to @Gustav-Simonsson for pointing at arrows in a diagram
and playing rubber-duck.
Message encoding functions have been renamed to catch any uses.
The switch to the new encoder can cause subtle incompatibilities.
If there are any users outside of our tree, they will at least be
alerted that there was a change.
NewMsg no longer exists. The replacements for EncodeMsg are called
Send and SendItems.
With RLPx frames, the message code is contained in the
frame and is no longer part of the encoded data.
EncodeMsg, Msg.Decode have been updated to match.
Code that decodes RLP directly from Msg.Payload will need
to change.
The diff is a bit bigger than expected because the protocol handshake
logic has moved out of Peer. This is necessary because the protocol
handshake will have custom framing in the final protocol.
There are now two deadlines, frameReadTimeout and payloadReadTimeout.
The frame timeout is longer and allows for connections that are idle.
The message timeout is still short and ensures that we don't get stuck
in the middle of a message.
Overview of changes:
- ClientIdentity has been removed, use discover.NodeID
- Server now requires a private key to be set (instead of public key)
- Server performs the encryption handshake before launching Peer
- Dial logic takes peers from discover table
- Encryption handshake code has been cleaned up a bit
- baseProtocol is gone because we don't exchange peers anymore
- Some parts of baseProtocol have moved into Peer instead
- add const length params for handshake messages
- add length check to fail early
- add debug logs to help interop testing (!ABSOLUTELY SHOULD BE DELETED LATER)
- wrap connection read/writes in error check
- add cryptoReady channel in peer to signal when secure session setup is finished
- wait for cryptoReady or timeout in TestPeersHandshake
- abstract the entire handshake logic in cryptoId.Run() taking session-relevant parameters
- changes in peer to accomodate how the encryption layer would be switched on
- modify arguments of handshake components
- fixed test getting the wrong pubkey but it till crashes on DH in newSession()