Commit Graph

117 Commits

Author SHA1 Message Date
Janoš Guljaš
836c846812 swarm/network/master: protect SetNextBatch iterator after close (#19147) 2019-02-21 18:33:49 +01:00
Ferenc Szabo
e38b227ce6 Ci race detector handle failing tests (#19143)
* swarm/storage: increase mget timeout in common_test.go

 TestDbStoreCorrect_1k sometimes timed out with -race on Travis.

--- FAIL: TestDbStoreCorrect_1k (24.63s)
    common_test.go:194: testStore failed: timed out after 10s

* swarm: remove unused vars from TestSnapshotSyncWithServer

nodeCount and chunkCount is returned from setupSim and those values
we use.

* swarm: move race/norace helpers from stream to testutil

As we will need to use the flag in other packages, too.

* swarm: refactor TestSwarmNetwork case

Extract long running test cases for better visibility.

* swarm/network: skip TestSyncingViaGlobalSync with -race

As panics on Travis.

panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x7e351b]

* swarm: run TestSwarmNetwork with fewer nodes with -race

As otherwise we always get test failure with `network_test.go:374:
context deadline exceeded` even with raised `Timeout`.

* swarm/network: run TestDeliveryFromNodes with fewer nodes with -race

Test on Travis times out with 8 or more nodes if -race flag is present.

* swarm/network: smaller node count for discovery tests with -race

TestDiscoveryPersistenceSimulationSimAdapters failed on Travis with
`-race` flag present. The failure was due to extensive memory usage,
coming from the CGO runtime. Using a smaller node count resolves the
issue.

=== RUN   TestDiscoveryPersistenceSimulationSimAdapter
==7227==ERROR: ThreadSanitizer failed to allocate 0x80000 (524288) bytes of clock allocator (error code: 12)
FATAL: ThreadSanitizer CHECK failed: ./gotsan.cc:6976 "((0 && "unable to mmap")) != (0)" (0x0, 0x0)
FAIL    github.com/ethereum/go-ethereum/swarm/network/simulations/discovery     804.826s

* swarm/network: run TestFileRetrieval with fewer nodes with -race

Otherwise we get a failure due to extensive memory usage, as the CGO
runtime cannot allocate more bytes.

=== RUN   TestFileRetrieval
==7366==ERROR: ThreadSanitizer failed to allocate 0x80000 (524288) bytes of clock allocator (error code: 12)
FATAL: ThreadSanitizer CHECK failed: ./gotsan.cc:6976 "((0 && "unable to mmap")) != (0)" (0x0, 0x0)
FAIL	github.com/ethereum/go-ethereum/swarm/network/stream	155.165s

* swarm/network: run TestRetrieval with fewer nodes with -race

Otherwise we get a failure due to extensive memory usage, as the CGO
runtime cannot allocate more bytes ("ThreadSanitizer failed to
allocate").

* swarm/network: skip flaky TestGetSubscriptionsRPC on Travis w/ -race

Test fails a lot with something like:
 streamer_test.go:1332: Real subscriptions and expected amount don't match; real: 0, expected: 20

* swarm/storage: skip TestDB_SubscribePull* tests on Travis w/ -race

Travis just hangs...

ok  	github.com/ethereum/go-ethereum/swarm/storage/feed/lookup	1.307s
keepalive
keepalive
keepalive

or panics after a while.

Without these tests the race detector job is now stable. Let's
invetigate these tests in a separate issue:
https://github.com/ethersphere/go-ethereum/issues/1245
2019-02-20 22:57:42 +01:00
lash
d36e974ba3 swarm/network: Keep span across roundtrip (#19140)
* swarm/newtork: WIP Span request span until delivery and put

* swarm/storage: Introduce new trace across single fetcher lifespan

* swarm/network: Put span ids for sendpriority in context value

* swarm: Add global span store in tracing

* swarm/tracing: Add context key constants

* swarm/tracing: Add comments

* swarm/storage: Remove redundant fix for filestore

* swarm/tracing: Elaborate constants comments

* swarm/network, swarm/storage, swarm:tracing: Minor cleanup
2019-02-20 14:50:37 +01:00
lash
460d206f30 swarm/network: Use actual remote peer ip in underlay (#19137)
* swarm/network: Logline to see handshake addr

* swarm/network: Replace remote ip in handshake uaddr

* swarm/network: Add test for enode uaddr rewrite method

* swarm/network: Remove redundance pointer return from sanitize

* swarm/network: Obeying the linting machine

* swarm/network: Add panic comment

(travis trigger take 1)
2019-02-20 14:46:00 +01:00
Janoš Guljaš
ba2dfa5ce4 swarm/network/stream: fix a goroutine leak in Registry (#19139)
* swarm/network/stream: fix a goroutine leak in Registry

* swarm/network, swamr/network/stream: Kademlia close addr count and depth change chans

* swarm/network/stream: rename close channel to quit

* swarm/network/stream: fix sync between NewRegistry goroutine and Close method
2019-02-20 14:45:25 +01:00
Ferenc Szabo
50b872bf05 p2p, swarm: fix node up races by granular locking (#18976)
* swarm/network: DRY out repeated giga comment

I not necessarily agree with the way we wait for event propagation.
But I truly disagree with having duplicated giga comments.

* p2p/simulations: encapsulate Node.Up field so we avoid data races

The Node.Up field was accessed concurrently without "proper" locking.
There was a lock on Network and that was used sometimes to access
the  field. Other times the locking was missed and we had
a data race.

For example: https://github.com/ethereum/go-ethereum/pull/18464
The case above was solved, but there were still intermittent/hard to
reproduce races. So let's solve the issue permanently.

resolves: ethersphere/go-ethereum#1146

* p2p/simulations: fix unmarshal of simulations.Node

Making Node.Up field private in 13292ee897e345045fbfab3bda23a77589a271c1
broke TestHTTPNetwork and TestHTTPSnapshot. Because the default
UnmarshalJSON does not handle unexported fields.

Important: The fix is partial and not proper to my taste. But I cut
scope as I think the fix may require a change to the current
serialization format. New ticket:
https://github.com/ethersphere/go-ethereum/issues/1177

* p2p/simulations: Add a sanity test case for Node.Config UnmarshalJSON

* p2p/simulations: revert back to defer Unlock() pattern for Network

It's a good patten to call `defer Unlock()` right after `Lock()` so
(new) error cases won't miss to unlock. Let's get back to that pattern.

The patten was abandoned in 85a79b3ad3,
while fixing a data race. That data race does not exist anymore,
since the Node.Up field got hidden behind its own lock.

* p2p/simulations: consistent naming for test providers Node.UnmarshalJSON

* p2p/simulations: remove JSON annotation from private fields of Node

As unexported fields are not serialized.

* p2p/simulations: fix deadlock in Network.GetRandomDownNode()

Problem: GetRandomDownNode() locks -> getDownNodeIDs() ->
GetNodes() tries to lock -> deadlock

On Network type, unexported functions must assume that `net.lock`
is already acquired and should not call exported functions which
might try to lock again.

* p2p/simulations: ensure method conformity for Network

Connect* methods were moved to p2p/simulations.Network from
swarm/network/simulation. However these new methods did not follow
the pattern of Network methods, i.e., all exported method locks
the whole Network either for read or write.

* p2p/simulations: fix deadlock during network shutdown

`TestDiscoveryPersistenceSimulationSimAdapter` often got into deadlock.
The execution was stuck on two locks, i.e, `Kademlia.lock` and
`p2p/simulations.Network.lock`. Usually the test got stuck once in each
20 executions with high confidence.

`Kademlia` was stuck in `Kademlia.EachAddr()` and `Network` in
`Network.Stop()`.

Solution: in `Network.Stop()` `net.lock` must be released before
calling `node.Stop()` as stopping a node (somehow - I did not find
the exact code path) causes `Network.InitConn()` to be called from
`Kademlia.SuggestPeer()` and that blocks on `net.lock`.

Related ticket: https://github.com/ethersphere/go-ethereum/issues/1223

* swarm/state: simplify if statement in DBStore.Put()

* p2p/simulations: remove faulty godoc from private function

The comment started with the wrong method name.

The method is simple and self explanatory. Also, it's private.
=> Let's just remove the comment.
2019-02-18 07:38:14 +01:00
holisticode
2af24724dd swarm/network: Saturation check for healthy networks (#19071)
* swarm/network: new saturation for  implementation

* swarm/network: re-added saturation func in Kademlia as it is used elsewhere

* swarm/network: saturation with higher MinBinSize

* swarm/network: PeersPerBin with depth check

* swarm/network: edited tests to pass new saturated check

* swarm/network: minor fix saturated check

* swarm/network/simulations/discovery: fixed renamed RPC call

* swarm/network: renamed to isSaturated and returns bool

* swarm/network: early depth check
2019-02-14 19:01:50 +01:00
Elad
3ee09ba035 swarm/storage/netstore: add fetcher cancellation on shutdown (#19049)
swarm/network/stream: remove netstore internal wg
swarm/network/stream: run individual tests with t.Run
2019-02-14 07:51:57 +01:00
Janoš Guljaš
3fd6db2bf6 swarm: fix network/stream data races (#19051)
* swarm/network/stream: newStreamerTester cleanup only if err is nil

* swarm/network/stream: raise newStreamerTester waitForPeers timeout

* swarm/network/stream: fix data races in GetPeerSubscriptions

* swarm/storage: prevent data race on LDBStore.batchesC

https://github.com/ethersphere/go-ethereum/issues/1198#issuecomment-461775049

* swarm/network/stream: fix TestGetSubscriptionsRPC data race

https://github.com/ethersphere/go-ethereum/issues/1198#issuecomment-461768477

* swarm/network/stream: correctly use Simulation.Run callback

https://github.com/ethersphere/go-ethereum/issues/1198#issuecomment-461783804

* swarm/network: protect addrCountC in Kademlia.AddrCountC function

https://github.com/ethersphere/go-ethereum/issues/1198#issuecomment-462273444

* p2p/simulations: fix a deadlock calling getRandomNode with lock

https://github.com/ethersphere/go-ethereum/issues/1198#issuecomment-462317407

* swarm/network/stream: terminate disconnect goruotines in tests

* swarm/network/stream: reduce memory consumption when testing data races

* swarm/network/stream: add watchDisconnections helper function

* swarm/network/stream: add concurrent counter for tests

* swarm/network/stream: rename race/norace test files and use const

* swarm/network/stream: remove watchSim and its panic

* swarm/network/stream: pass context in watchDisconnections

* swarm/network/stream: add concurrent safe bool for watchDisconnections

* swarm/storage: fix LDBStore.batchesC data race by not closing it
2019-02-13 13:03:23 +01:00
Ferenc Szabo
27e3f96819 swarm: CI race detector test adjustments (#19017) 2019-02-08 17:07:11 +01:00
lash
0c10d37606 swarm/network, swarm/storage: Preserve opentracing contexts (#19022) 2019-02-08 16:57:48 +01:00
holisticode
41597c2856 swarm: Debug API and HasChunks() API endpoint (#18980) 2019-02-07 15:49:19 +01:00
Ferenc Szabo
1c3aa8d9b1 swarm/storage: fix test timeout with -race by increasing mget timeout 2019-02-05 14:34:34 +01:00
Anton Evangelatov
597597e8b2 swarm/network: refactor simulation tests bootstrap (#18975) 2019-02-01 09:58:46 +01:00
holisticode
43e1b7b124 swarm: GetPeerSubscriptions RPC (#18972) 2019-01-30 21:03:08 +01:00
Janoš Guljaš
592bf6a59c swarm: fix flaky delivery tests (#18971) 2019-01-30 14:03:11 +01:00
lash
f9401ae011 swarm/network: Remove extra random peer, connect test sanity, comments (#18964) 2019-01-30 09:49:58 +01:00
Elad
2abeb35d54 p2p/testing, swarm: remove unused testing.T in protocol tester (#18500) 2019-01-24 17:23:34 +01:00
gluk256
ad13d2d407 swarm/version: commit version added (#18510) 2019-01-24 12:35:10 +01:00
Anton Evangelatov
bbd120354a
swarm: bootnode-mode, new bootnodes and no p2p package discovery (#18498) 2019-01-24 12:02:18 +01:00
Viktor Trón
15b9b39e6c
swarm/network: unskip tests previously skipped due to suggestPeer issues (#18477) 2019-01-19 08:12:57 +01:00
Ferenc Szabo
19bfcbf911 swarm/network: fix data race in fetcher_test.go (#18469) 2019-01-17 16:45:36 +01:00
Ferenc Szabo
4f8ec44565 swarm/network: fix data race in stream.(*Peer).handleOfferedHashesMsg() (#18468)
* swarm/network: fix data race in stream.(*Peer).handleOfferedHashesMsg()

handleOfferedHashesMsg() contained a data race:
- read => in a goroutine, call to c.batchDone()
- write => in the main thread, write to c.sessionAt

c.batchDone() contained a call to c.AddInterval(). Client was a value
receiver for AddInterval. So on c.AddInterval() call the whole client
struct got copied (read) while one of its field was modified in
handleOfferedHashesMsg() (write).

fixes ethersphere/go-ethereum#1086

* swarm/network: simplify some trivial statements
2019-01-17 14:44:29 +01:00
Elad
81e26d5a48 swarm/network: fix data race warning on TestBzzHandshakeLightNode (#18459) 2019-01-17 11:38:23 +01:00
Viktor Trón
bcb2594151
swarm/network: rewrite of peer suggestion engine, fix skipped tests (#18404)
* swarm/network: fix skipped tests related to suggestPeer

* swarm/network: rename depth to radius

* swarm/network: uncomment assertHealth and improve comments

* swarm/network: remove commented code

* swarm/network: kademlia suggestPeer algo correction

* swarm/network: kademlia suggest peer

 * simplify suggest Peer code
 * improve peer suggestion algo
 * add comments
 * kademlia testing improvements
   * assertHealth -> checkHealth (test helper)
   * testSuggestPeer -> checkSuggestPeer (test helper)
   * remove testSuggestPeerBug and TestKademliaCase

* swarm/network: kademlia suggestPeer cleanup, improved comments

* swarm/network: minor comment, discovery test default arg
2019-01-17 07:29:34 +01:00
Elad
34f11e752f cmd/swarm/swarm-snapshot: swarm snapshot generator (#18453)
* cmd/swarm/swarm-snapshot: add binary to create network snapshots

* cmd/swarm/swarm-snapshot: refactor and extend tests

* p2p/simulations: remove unused triggerChecks func and fix linter

* internal/cmdtest: raise the timeout for killing TestCmd

* cmd/swarm/swarm-snapshot: add more comments and other minor adjustments

* cmd/swarm/swarm-snapshot: remove redundant check in createSnapshot

* cmd/swarm/swarm-snapshot: change comment wording

* p2p/simulations: revert Simulation.Run from master

https://github.com/ethersphere/go-ethereum/pull/1077/files#r247078904

* cmd/swarm/swarm-snapshot: address pr comments

* swarm/network/simulations/discovery: removed snapshot write to file

* cmd/swarm/swarm-snapshot, swarm/network/simulations: removed redundant connection event check, fixed lint error
2019-01-16 14:33:02 +01:00
Janoš Guljaš
96c7c18b18 swarm/network: fix data race in TestNetworkID test (#18460) 2019-01-16 12:56:34 +01:00
gluk256
4aeeecfded swarm/pot: each() functions refactored (#18452) 2019-01-15 11:51:33 +01:00
holisticode
88168ff5c5 Stream subscriptions (#18355)
* swarm/network: eachBin now starts at kaddepth for nn

* swarm/network: fix Kademlia.EachBin

* swarm/network: fix kademlia.EachBin

* swarm/network: correct EachBin implementation according to requirements

* swarm/network: less addresses simplified tests

* swarm: calc kad depth outside loop in EachBin test

* swarm/network: removed printResults

* swarm/network: cleanup imports

* swarm/network: remove kademlia.EachBin; fix RequestSubscriptions and add unit test

* swarm/network/stream: address PR comments

* swarm/network/stream: package-wide subscriptionFunc

* swarm/network/stream: refactor to kad.EachConn
2019-01-11 15:08:09 +01:00
Ferenc Szabo
2eb838ed97 p2p/simulations: eliminate concept of pivot (#18426) 2019-01-11 10:23:45 +01:00
lash
7240f4d800 swarm/network: Rename minproxbinsize, add as member of simulation (#18408)
* swarm/network: Rename minproxbinsize, add as member of simulation

* swarm/network: Deactivate WaitTillHealthy, unreliable pending suggestpeer
2019-01-10 12:33:51 +01:00
Viktor Trón
6df3e4eeb0
swarm/network: remove isproxbin bool from kad.Each* iterfunc (#18239)
* swarm/network, swarm/pss: remove isproxbin bool from kad.Each* iterfunc

* swarm/network: restore comment and unskip snapshot sync tests
2019-01-10 03:36:19 +01:00
Janoš Guljaš
d70c4faf20 swarm: Fix T.Fatal inside a goroutine in tests (#18409)
* swarm/storage: fix T.Fatal inside a goroutine

* swarm/network/simulation: fix T.Fatal inside a goroutine

* swarm/network/stream: fix T.Fatal inside a goroutine

* swarm/network/simulation: consistent failures in TestPeerEventsTimeout

* swarm/network/simulation: rename sendRunSignal to triggerSimulationRun
2019-01-09 07:05:55 +01:00
holisticode
ae857e74bf swarm, p2p/protocols: Stream accounting (#18337)
* swarm: completed 1st phase of swap accounting

* swarm, p2p/protocols: added stream pricing

* swarm/network/stream: gofmt simplify stream.go

* swarm: fixed review comments

* swarm: used snapshots for swap tests

* swarm: custom retrieve for swap (less cascaded requests at any one time)

* swarm: addressed PR comments

* swarm: log output formatting

* swarm: removed parallelism in swap tests

* swarm: swap tests simplification

* swarm: removed swap_test.go

* swarm/network/stream: added prefix space for comments

* swarm/network/stream: unit test for prices

* swarm/network/stream: don't hardcode price

* swarm/network/stream: fixed invalid price check
2019-01-08 00:59:00 +01:00
Ferenc Szabo
fe03b76ffe A few minor code inspection fixes (#18393)
* swarm/network: fix code inspection problems

- typos
- redundant import alias

* p2p/simulations: fix code inspection problems

- typos
- unused function parameters
- redundant import alias
- code style issue: snake case

* swarm/network: fix unused method parameters inspections
2019-01-06 11:58:57 +01:00
Dave McGregor
33d233d3e1
vendor, crypto, swarm: switch over to upstream sha3 package 2019-01-04 09:26:07 +02:00
Anton Evangelatov
9e9fc87e70
swarm: remove unused/dead code (#18351) 2018-12-23 17:31:32 +01:00
lash
5e4fd8e7db swarm/network: Revised depth and health for Kademlia (#18354)
* swarm/network: Revised depth calculation with tests

* swarm/network: WIP remove redundant "full" function

* swarm/network: WIP peerpot refactor

* swarm/network: Make test methods submethod of peerpot and embed kad

* swarm/network: Remove commented out code

* swarm/network: Rename health test functions

* swarm/network: Too many n's

* swarm/network: Change hive Healthy func to accept addresses

* swarm/network: Add Healthy proxy method for api in hive

* swarm/network: Skip failing test out of scope for PR

* swarm/network: Skip all tests dependent on SuggestPeers

* swarm/network: Remove commented code and useless kad Pof member

* swarm/network: Remove more unused code, add counter on depth test errors

* swarm/network: WIP Create Healthy assertion tests

* swarm/network: Roll back health related methods receiver change

* swarm/network: Hardwire network minproxbinsize in swarm sim

* swarm/network: Rework Health test to strict

Pending add test for saturation
And add test for as many as possible up to saturation

* swarm/network: Skip discovery tests (dependent on SuggestPeer)

* swarm/network: Remove useless minProxBinSize in stream

* swarm/network: Remove unnecessary testing.T param to assert health

* swarm/network: Implement t.Helper() in checkHealth

* swarm/network: Rename check back to assert now that we have helper magic

* swarm/network: Revert WaitTillHealthy change (deferred to nxt PR)

* swarm/network: Kademlia tests GotNN => ConnectNN

* swarm/network: Renames and comments

* swarm/network: Add comments
2018-12-22 06:53:30 +01:00
lash
e1edfe0689 p2p/simulation: Test snapshot correctness and minimal benchmark (#18287)
* p2p/simulation: WIP minimal snapshot test

* p2p/simulation: Add snapshot create, load and verify to snapshot test

* build: add test tag for tests

* p2p/simulations, build: Revert travis change, build test sym always

* p2p/simulations: Add comments, timeout check on additional events

* p2p/simulation: Add benchmark template for minimal peer protocol init

* p2p/simulations: Remove unused code

* p2p/simulation: Correct timer reset

* p2p/simulations: Put snapshot check events in buffer and call blocking

* p2p/simulations: TestSnapshot fail if Load function returns early

* p2p/simulations: TestSnapshot wait for all connections before returning

* p2p/simulation: Revert to before wait for snap load (5e75594)

* p2p/simulations: add "conns after load" subtest to TestSnapshot

and nudge
2018-12-21 06:22:11 +01:00
Javier Peletier
de4265fa02 swarm/network/simulation:commented out unreachable code-avoid vet errors (#18263) 2018-12-18 07:24:59 +01:00
holisticode
90ea542e9e Update visualized snapshot test (#18286)
* swarm/network/stream: fix visualized_snapshot_sync_sim_test

* swarm/network/stream: updated visualized snapshot-test;data in p2p event

* swarm/network/stream: cleanup visualized snapshot sync test

* swarm/network/stream: re-enable t.Skip for visualized test

* swarm/network/stream: addressed PR comments
2018-12-18 07:20:59 +01:00
Elad
472c23a801 p2p/simulation: move connection methods from swarm/network/simulation (#18323) 2018-12-17 12:19:01 +01:00
lash
dd98d1da94 swarm/network: Correct ambiguity in compared addresses (#18251) 2018-12-10 14:56:01 +02:00
Janoš Guljaš
661809714e swarm: snapshot load improvement (#18220)
* swarm/network: Hive - do not notify peer if discovery is disabled

* p2p/simulations: validate all connections on loading a snapshot

* p2p/simulations: track all connections in on snapshot loading

* p2p/simulations: add snapshotLoadTimeout variable

* p2p/simulations: ignore control events in snapshot load

* p2p/simulations: simplify event loop synchronization

* p2p/simulations: return already connected error from Load function

* p2p/simulations: log warning on snapshot loading disconnection
2018-12-07 06:51:40 +01:00
holisticode
b98d2e9a1c swarm/network/stream: Debug log instead of Warn for retrieval failure (#18246) 2018-12-04 18:29:51 +01:00
holisticode
695a5cce1e Increase bzz version (#18184)
* swarm/network/stream/: added stream protocol version match tests

* Increase BZZ version due to streamer version change; version tests

* swarm/network: increased hive and test protocol version
2018-11-26 19:34:40 +01:00
lash
1cd007ecae swarm/network: Correct neighborhood depth (#18066) 2018-11-26 17:13:59 +01:00
lash
197d609b9a swarm/pss: Message handler refactor (#18169) 2018-11-26 13:52:04 +01:00
Janoš Guljaš
93854bbad4 swarm/network/simulation: fix New function for-loop scope (#18161) 2018-11-26 12:39:38 +01:00
Janoš Guljaš
070caec4bd swarm/network/stream: use swarm/mock/mem as mock global store (#18157) 2018-11-21 20:49:13 +01:00