Commit Graph

72 Commits

Author SHA1 Message Date
Felix Lange
b4374436f3 p2p/discover: fix race involving the seed node iterator
nodeDB.querySeeds was not safe for concurrent use but could be called
concurrenty on multiple goroutines in the following case:

- the table was empty
- a timed refresh started
- a lookup was started and initiated refresh

These conditions are unlikely to coincide during normal use, but are
much more likely to occur all at once when the user's machine just woke
from sleep. The root cause of the issue is that querySeeds reused the
same leveldb iterator until it was exhausted.

This commit moves the refresh scheduling logic into its own goroutine
(so only one refresh is ever active) and changes querySeeds to not use
a persistent iterator. The seed node selection is now more random and
ignores nodes that have not been contacted in the last 5 days.
2015-09-30 16:23:03 +02:00
Jeffrey Wilcke
e2d44814a5 Merge pull request #1694 from obscuren/hide-fdtrack
fdtrack: hide message
2015-08-19 13:50:54 -07:00
Jeffrey Wilcke
269c5c7107 Revert "fdtrack: temporary hack for tracking file descriptor usage"
This reverts commit 5c949d3b3b.
2015-08-19 21:46:01 +02:00
Felix Lange
dd54fef898 p2p/discover: don't attempt to replace nodes that are being replaced
PR #1621 changed Table locking so the mutex is not held while a
contested node is being pinged. If multiple nodes ping the local node
during this time window, multiple ping packets will be sent to the
contested node. The changes in this commit prevent multiple packets by
tracking whether the node is being replaced.
2015-08-19 14:57:16 +02:00
Felix Lange
7d5ff770e2 p2p/discover: continue reading after temporary errors
Might solve #1579
2015-08-19 14:38:55 +02:00
Felix Lange
590c99a98f p2p/discover: fix UDP reply packet timeout handling
If the timeout fired (even just nanoseconds) before the deadline of the
next pending reply, the timer was not rescheduled. The timer would've
been rescheduled anyway once the next packet was sent, but there were
cases where no next packet could ever be sent due to the locking issue
fixed in the previous commit.

As timing-related bugs go, this issue had been present for a long time
and I could never reproduce it. The test added in this commit did
reproduce the issue on about one out of 15 runs.
2015-08-11 11:42:17 +02:00
Felix Lange
01ed3fa1a9 p2p/discover: unlock the table during ping replacement
Table.mutex was being held while waiting for a reply packet, which
effectively made many parts of the whole stack block on that packet,
including the net_peerCount RPC call.
2015-08-11 11:42:17 +02:00
Felix Lange
b23b4dbd79 p2p/discover: close Table during testing
Not closing the table used to be fine, but now the table has a database.
2015-08-06 12:27:59 +02:00
Felix Lange
5c949d3b3b fdtrack: temporary hack for tracking file descriptor usage
Package fdtrack logs statistics about open file descriptors.
This should help identify the source of #1549.
2015-08-04 03:10:27 +02:00
Felix Lange
bfbcfbe4a9 all: fix license headers one more time
I forgot to update one instance of "go-ethereum" in commit 3f047be5a.
2015-07-23 18:35:11 +02:00
Felix Lange
3f047be5aa all: update license headers to distiguish GPL/LGPL
All code outside of cmd/ is licensed as LGPL. The headers
now reflect this by calling the whole work "the go-ethereum library".
2015-07-22 18:51:45 +02:00
Felix Lange
ea54283b30 all: update license information 2015-07-07 14:12:44 +02:00
Felix Lange
a8e4cb6dfe p2p/discover: use separate rand.Source instances in tests
rand.Source isn't safe for concurrent use.
2015-06-10 15:18:01 +02:00
Felix Lange
261a8077c4 p2p/discover: deflake TestUDP_successfulPing 2015-06-10 13:08:21 +02:00
Péter Szilágyi
612f01400f p2p/discover: bond with seed nodes too (runs only if findnode failed) 2015-05-26 23:30:41 +02:00
Péter Szilágyi
3630432dfb p2p/discovery: fix a cornercase loop if no seeds or bootnodes are known 2015-05-26 23:30:40 +02:00
Péter Szilágyi
f539ed1e66 p2p/discover: force refresh if the table is empty 2015-05-26 23:30:40 +02:00
Péter Szilágyi
5076170f34 p2p/discover: permit temporary bond failures for previously known nodes 2015-05-26 23:30:40 +02:00
Péter Szilágyi
6078aa08eb p2p/discover: watch find failures, evacuate on too many, rebond if failed 2015-05-26 23:30:40 +02:00
Péter Szilágyi
64174f196f p2p/discover: add support for counting findnode failures 2015-05-26 23:30:40 +02:00
Felix Lange
9f38ef5d97 p2p/discover: add ReadRandomNodes 2015-05-25 01:17:14 +02:00
Péter Szilágyi
cbd3ae6906 p2p/discover: fix #838, evacuate self entries from the node db 2015-05-21 19:41:46 +03:00
Péter Szilágyi
af24c271c7 p2p/discover: fix database presistency test folder 2015-05-21 19:28:10 +03:00
Felix Lange
d2f119cf9b p2p/discover: limit open files for node database 2015-05-14 15:01:13 +02:00
Felix Lange
7fa2607bd1 p2p/discover: bump maxBondingPingPongs to 16
This should increase the speed a bit because all findnode
results (up to 16) can be verified at the same time.
2015-05-14 14:53:29 +02:00
Felix Lange
251846d65a p2p/discover: fix out-of-bounds slicing for chunked neighbors packets
The code assumed that Table.closest always returns at least 13 nodes.
This is not true for small tables (e.g. during bootstrap).
2015-05-13 21:49:04 +02:00
subtly
8eef2b765a fix test. 2015-05-13 20:15:01 +02:00
subtly
a32693770c Manual send of multiple neighbours packets. Test receiving multiple neighbours packets. 2015-05-13 20:03:17 +02:00
subtly
7473c93668 UDP Interop. Limit datagrams to 1280bytes.
We don't have a UDP which specifies any messages that will be 4KB. Aside from being implemented for months and a necessity for encryption and piggy-backing packets, 1280bytes is ideal, and, means this TODO can be completed!

Why 1280 bytes?
* It's less than the default MTU for most WAN/LAN networks. That means fewer fragmented datagrams (esp on well-connected networks).
* Fragmented datagrams and dropped packets suck and add latency while OS waits for a dropped fragment to never arrive (blocking readLoop())
* Most of our packets are < 1280 bytes.
* 1280 bytes is minimum datagram size and MTU for IPv6 -- on IPv6, a datagram < 1280bytes will *never* be fragmented.

UDP datagrams are dropped. A lot! And fragmented datagrams are worse. If a datagram has a 30% chance of being dropped, then a fragmented datagram has a 60% chance of being dropped. More importantly, we have signed packets and can't do anything with a packet unless we receive the entire datagram because the signature can't be verified. The same is true when we have encrypted packets.

So the solution here to picking an ideal buffer size for receiving datagrams is a number under 1400bytes. And the lower-bound value for IPv6 of 1280 bytes make's it a non-decision. On IPv4 most ISPs and 3g/4g/let networks have an MTU just over 1400 -- and *never* over 1500. Never -- that means packets over 1500 (in reality: ~1450) bytes are fragmented. And probably dropped a lot.

Just to prove the point, here are pings sending non-fragmented packets over wifi/ISP, and a second set of pings via cell-phone tethering. It's important to note that, if *any* router between my system and the EC2 node has a lower MTU, the message would not go through:

On wifi w/normal ISP:
localhost:Debug $ ping -D -s 1450 52.6.250.242
PING 52.6.250.242 (52.6.250.242): 1450 data bytes
1458 bytes from 52.6.250.242: icmp_seq=0 ttl=42 time=104.831 ms
1458 bytes from 52.6.250.242: icmp_seq=1 ttl=42 time=119.004 ms
^C
--- 52.6.250.242 ping statistics ---
2 packets transmitted, 2 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 104.831/111.918/119.004/7.087 ms
localhost:Debug $ ping -D -s 1480 52.6.250.242
PING 52.6.250.242 (52.6.250.242): 1480 data bytes
ping: sendto: Message too long
ping: sendto: Message too long
Request timeout for icmp_seq 0
ping: sendto: Message too long
Request timeout for icmp_seq 1


Tethering to O2:
localhost:Debug $ ping -D -s 1480 52.6.250.242
PING 52.6.250.242 (52.6.250.242): 1480 data bytes
ping: sendto: Message too long
ping: sendto: Message too long
Request timeout for icmp_seq 0
^C
--- 52.6.250.242 ping statistics ---
2 packets transmitted, 0 packets received, 100.0% packet loss
localhost:Debug $ ping -D -s 1450 52.6.250.242
PING 52.6.250.242 (52.6.250.242): 1450 data bytes
1458 bytes from 52.6.250.242: icmp_seq=0 ttl=42 time=107.844 ms
1458 bytes from 52.6.250.242: icmp_seq=1 ttl=42 time=105.127 ms
1458 bytes from 52.6.250.242: icmp_seq=2 ttl=42 time=120.483 ms
1458 bytes from 52.6.250.242: icmp_seq=3 ttl=42 time=102.136 ms
2015-05-13 19:03:00 +02:00
Bas van Kervel
95773b9673 removed redundant newlines in import block 2015-05-12 15:20:53 +02:00
Bas van Kervel
b79dd188d9 replaced several path.* with filepath.* which is platform independent 2015-05-12 14:24:11 +02:00
Felix Lange
bcfd788661 p2p/discover: bump packet timeouts to 500ms 2015-05-06 22:59:00 +02:00
Felix Lange
2adcc31bb4 p2p/discover: new distance metric based on sha3(id)
The previous metric was pubkey1^pubkey2, as specified in the Kademlia
paper. We missed that EC public keys are not uniformly distributed.
Using the hash of the public keys addresses that. It also makes it
a bit harder to generate node IDs that are close to a particular node.
2015-05-06 16:10:41 +02:00
Felix Lange
72ab6d3255 p2p/discover: track sha3(ID) in Node 2015-04-30 15:02:23 +02:00
Felix Lange
b34a8ef624 p2p, p2p/discover: protocol version 4 2015-04-30 14:57:34 +02:00
Felix Lange
fc747ef4a6 p2p/discover: new endpoint format
This commit changes the discovery protocol to use the new "v4" endpoint
format, which allows for separate UDP and TCP ports and makes it
possible to discover the UDP address after NAT.
2015-04-30 14:57:33 +02:00
Péter Szilágyi
b569550a39 p2p/discover: fix api issues caused by leveldb update 2015-04-28 13:57:57 +03:00
Péter Szilágyi
4992765032 p2p/discover: fix goroutine leak due to blocking on sync.Once 2015-04-28 10:28:04 +03:00
Péter Szilágyi
437cf4b3ac p2p/discover: add node expirer and related tests 2015-04-27 17:38:28 +03:00
Péter Szilágyi
a136e2bb22 p2p/discover: parametrize nodedb version, add persistency tests 2015-04-27 15:28:17 +03:00
Péter Szilágyi
75fd738dea p2p/discover: drop a superfluous warning 2015-04-27 15:06:31 +03:00
Péter Szilágyi
706da56f75 p2p/discover: wrap the pinger to update the node db too 2015-04-27 14:56:42 +03:00
Péter Szilágyi
85b4b44235 p2p/discover: use iterator based seeding, drop old protocol test 2015-04-27 14:45:35 +03:00
Péter Szilágyi
8de8f61d36 p2p/discover: write the basic tests, catch RLP bug 2015-04-27 12:33:06 +03:00
Péter Szilágyi
0201c04b95 p2p/discovery: fix issues raised in the nodeDb PR 2015-04-27 10:19:16 +03:00
Péter Szilágyi
8646365b42 cmd/bootnode, eth, p2p, p2p/discover: use a fancier db design 2015-04-24 18:04:41 +03:00
Péter Szilágyi
6def110c37 cmd/bootnode, eth, p2p, p2p/discover: clean up the seeder and mesh into eth. 2015-04-24 11:33:55 +03:00
Péter Szilágyi
971702e7a1 p2p/discovery: fix broken tests due to API update 2015-04-24 11:23:20 +03:00
Péter Szilágyi
af923c965f p2p/discovery: use the seed table for finding nodes, auto drop stale ones 2015-04-24 11:23:20 +03:00
Péter Szilágyi
5f735d6fce cmd, eth, p2p, p2p/discover: init and clean up the seed cache 2015-04-24 11:23:20 +03:00