Commit Graph

16 Commits

Ferenc Szabo
e38b227ce6 CI race detector: handle failing tests (#19143)
* swarm/storage: increase mget timeout in common_test.go

 TestDbStoreCorrect_1k sometimes timed out with -race on Travis.

--- FAIL: TestDbStoreCorrect_1k (24.63s)
    common_test.go:194: testStore failed: timed out after 10s

* swarm: remove unused vars from TestSnapshotSyncWithServer

nodeCount and chunkCount are returned from setupSim, and we use those
values.

* swarm: move race/norace helpers from stream to testutil

As we will need to use the flag in other packages, too.
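
A minimal sketch of how such a race helper can be built with Go build
tags. The file and constant names below are assumptions for
illustration, not necessarily the ones used in swarm/testutil:

    // race.go -- only compiled when the binary is built with -race.
    //go:build race
    // +build race

    package testutil

    // RaceEnabled reports whether the race detector is active.
    const RaceEnabled = true

    // norace.go -- compiled for all other builds.
    //go:build !race
    // +build !race

    package testutil

    const RaceEnabled = false

Tests can then branch on testutil.RaceEnabled to skip or shrink cases
when the race detector is on.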

* swarm: refactor TestSwarmNetwork case

Extract long running test cases for better visibility.

* swarm/network: skip TestSyncingViaGlobalSync with -race

As it panics on Travis.

panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x7e351b]

* swarm: run TestSwarmNetwork with fewer nodes with -race

Otherwise we always get a test failure with `network_test.go:374:
context deadline exceeded`, even with a raised `Timeout`.

* swarm/network: run TestDeliveryFromNodes with fewer nodes with -race

Test on Travis times out with 8 or more nodes if -race flag is present.

* swarm/network: smaller node count for discovery tests with -race

TestDiscoveryPersistenceSimulationSimAdapters failed on Travis with
the `-race` flag present. The failure was due to excessive memory
usage coming from the CGO runtime. Using a smaller node count resolves
the issue.

=== RUN   TestDiscoveryPersistenceSimulationSimAdapter
==7227==ERROR: ThreadSanitizer failed to allocate 0x80000 (524288) bytes of clock allocator (error code: 12)
FATAL: ThreadSanitizer CHECK failed: ./gotsan.cc:6976 "((0 && "unable to mmap")) != (0)" (0x0, 0x0)
FAIL    github.com/ethereum/go-ethereum/swarm/network/simulations/discovery     804.826s
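
The node-count reductions above branch on a flag like the one sketched
earlier. A hedged illustration of the pattern; the test name, counts,
and import path are made up for this sketch:

    package discovery_test

    import (
        "testing"

        "github.com/ethereum/go-ethereum/swarm/testutil" // assumed home of RaceEnabled
    )

    func TestPersistenceSimulationSketch(t *testing.T) {
        nodeCount := 32
        if testutil.RaceEnabled {
            // The race detector multiplies memory and CPU use, so keep
            // the simulation small enough for the Travis workers.
            nodeCount = 8
        }
        t.Logf("running simulation with %d nodes", nodeCount)
        // ... set up and run the simulation with nodeCount nodes ...
    }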

* swarm/network: run TestFileRetrieval with fewer nodes with -race

Otherwise we get a failure due to excessive memory usage, as the CGO
runtime cannot allocate more bytes.

=== RUN   TestFileRetrieval
==7366==ERROR: ThreadSanitizer failed to allocate 0x80000 (524288) bytes of clock allocator (error code: 12)
FATAL: ThreadSanitizer CHECK failed: ./gotsan.cc:6976 "((0 && "unable to mmap")) != (0)" (0x0, 0x0)
FAIL	github.com/ethereum/go-ethereum/swarm/network/stream	155.165s

* swarm/network: run TestRetrieval with fewer nodes with -race

Otherwise we get a failure due to excessive memory usage, as the CGO
runtime cannot allocate more bytes ("ThreadSanitizer failed to
allocate").

* swarm/network: skip flaky TestGetSubscriptionsRPC on Travis w/ -race

The test fails a lot with something like:
 streamer_test.go:1332: Real subscriptions and expected amount don't match; real: 0, expected: 20

* swarm/storage: skip TestDB_SubscribePull* tests on Travis w/ -race

Travis just hangs...

ok  	github.com/ethereum/go-ethereum/swarm/storage/feed/lookup	1.307s
keepalive
keepalive
keepalive

or panics after a while.

Without these tests, the race detector job is now stable. Let's
investigate these tests in a separate issue:
https://github.com/ethersphere/go-ethereum/issues/1245
2019-02-20 22:57:42 +01:00
Ferenc Szabo
50b872bf05 p2p, swarm: fix node up races by granular locking (#18976)
* swarm/network: DRY out repeated giga comment

I don't necessarily agree with the way we wait for event propagation,
but I truly disagree with having duplicated giga comments.

* p2p/simulations: encapsulate Node.Up field so we avoid data races

The Node.Up field was accessed concurrently without "proper" locking.
There was a lock on Network, and it was sometimes used to access the
field; other times the locking was missed and we had a data race.

For example: https://github.com/ethereum/go-ethereum/pull/18464
The case above was solved, but there were still intermittent/hard to
reproduce races. So let's solve the issue permanently.

resolves: ethersphere/go-ethereum#1146
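
A minimal sketch of that encapsulation, assuming a stripped-down Node
type; the field and method names below are illustrative and may differ
from the actual p2p/simulations code:

    package simulations

    import "sync"

    // Node keeps its up/down state behind its own lock, so callers can
    // no longer read or write the flag without synchronization.
    type Node struct {
        // ... other fields (Config, etc.) elided ...

        upMu sync.RWMutex
        up   bool
    }

    // Up reports whether the node is running.
    func (n *Node) Up() bool {
        n.upMu.RLock()
        defer n.upMu.RUnlock()
        return n.up
    }

    // SetUp records whether the node is running.
    func (n *Node) SetUp(up bool) {
        n.upMu.Lock()
        defer n.upMu.Unlock()
        n.up = up
    }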

* p2p/simulations: fix unmarshal of simulations.Node

Making the Node.Up field private in 13292ee897e345045fbfab3bda23a77589a271c1
broke TestHTTPNetwork and TestHTTPSnapshot, because the default
UnmarshalJSON does not handle unexported fields.

Important: The fix is partial and not quite to my taste, but I cut
scope, as I think a proper fix may require a change to the current
serialization format. New ticket:
https://github.com/ethersphere/go-ethereum/issues/1177
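
A sketch of the kind of workaround this refers to: encoding/json skips
unexported fields, so the type needs a custom UnmarshalJSON that
decodes into a struct with exported mirrors of the serialized fields.
This continues the toy Node above; the field names are illustrative:

    import "encoding/json"

    // UnmarshalJSON restores the now-private up flag that the default
    // decoder would silently drop.
    func (n *Node) UnmarshalJSON(data []byte) error {
        var aux struct {
            Up bool `json:"up"`
            // ... exported mirrors of the other serialized fields ...
        }
        if err := json.Unmarshal(data, &aux); err != nil {
            return err
        }
        n.SetUp(aux.Up)
        return nil
    }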

* p2p/simulations: Add a sanity test case for Node.Config UnmarshalJSON

* p2p/simulations: revert back to defer Unlock() pattern for Network

It's a good pattern to call `defer Unlock()` right after `Lock()` so
that (new) error cases cannot forget to unlock. Let's get back to that
pattern.

The pattern was abandoned in 85a79b3ad3,
while fixing a data race. That data race does not exist anymore,
since the Node.Up field got hidden behind its own lock.
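
The pattern in question, sketched on the same toy package with a
hypothetical exported method; the Network fields are illustrative:

    import (
        "errors"
        "sync"
    )

    type Network struct {
        lock  sync.RWMutex
        nodes map[string]*Node
    }

    // SetNodeUp is hypothetical; it shows the shape of every exported
    // Network method: lock on entry, defer the unlock, so newly added
    // error returns cannot leave the Network locked.
    func (net *Network) SetNodeUp(id string) error {
        net.lock.Lock()
        defer net.lock.Unlock()

        node, ok := net.nodes[id]
        if !ok {
            return errors.New("node not found") // lock is still released
        }
        node.SetUp(true)
        return nil
    }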

* p2p/simulations: consistent naming for test providers Node.UnmarshalJSON

* p2p/simulations: remove JSON annotation from private fields of Node

As unexported fields are not serialized.

* p2p/simulations: fix deadlock in Network.GetRandomDownNode()

Problem: GetRandomDownNode() locks -> getDownNodeIDs() ->
GetNodes() tries to lock -> deadlock

On the Network type, unexported functions must assume that `net.lock`
is already acquired and should not call exported functions, which
might try to lock again.
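
Continuing the toy Network above, a sketch of that convention; the
method names mirror the ones mentioned, but the bodies are
illustrative:

    // GetRandomDownNode is exported, so it takes the lock itself ...
    func (net *Network) GetRandomDownNode() *Node {
        net.lock.RLock()
        defer net.lock.RUnlock()
        down := net.getDownNodes() // fine: the helper assumes the lock is held
        if len(down) == 0 {
            return nil
        }
        return down[0] // the real code picks a random element
    }

    // getDownNodes is unexported and assumes net.lock is already held.
    // It must not call exported methods such as GetNodes(), which would
    // try to acquire net.lock again and deadlock.
    func (net *Network) getDownNodes() []*Node {
        var down []*Node
        for _, n := range net.nodes {
            if !n.Up() {
                down = append(down, n)
            }
        }
        return down
    }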

* p2p/simulations: ensure method conformity for Network

Connect* methods were moved to p2p/simulations.Network from
swarm/network/simulation. However, these new methods did not follow
the pattern of Network methods, i.e., every exported method locks the
whole Network either for read or write.

* p2p/simulations: fix deadlock during network shutdown

`TestDiscoveryPersistenceSimulationSimAdapter` often got into deadlock.
The execution was stuck on two locks, i.e., `Kademlia.lock` and
`p2p/simulations.Network.lock`. The test reliably got stuck about once
in every 20 executions.

`Kademlia` was stuck in `Kademlia.EachAddr()` and `Network` in
`Network.Stop()`.

Solution: in `Network.Stop()` `net.lock` must be released before
calling `node.Stop()` as stopping a node (somehow - I did not find
the exact code path) causes `Network.InitConn()` to be called from
`Kademlia.SuggestPeer()` and that blocks on `net.lock`.

Related ticket: https://github.com/ethersphere/go-ethereum/issues/1223
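
A simplified sketch of that shutdown ordering on the same toy Network;
the real Network.Stop does considerably more:

    // Stop shuts down all nodes. The lock is released before the nodes
    // are stopped, because stopping a node can call back into the
    // Network (via Kademlia.SuggestPeer -> Network.InitConn) and would
    // otherwise block on net.lock forever.
    func (net *Network) Stop() {
        net.lock.Lock()
        nodes := make([]*Node, 0, len(net.nodes))
        for _, n := range net.nodes {
            nodes = append(nodes, n)
        }
        net.lock.Unlock() // released before any callback can fire

        for _, n := range nodes {
            n.SetUp(false) // stand-in for stopping the real node
        }
    }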

* swarm/state: simplify if statement in DBStore.Put()

* p2p/simulations: remove faulty godoc from private function

The comment started with the wrong method name.

The method is simple and self-explanatory. Also, it's private.
=> Let's just remove the comment.
2019-02-18 07:38:14 +01:00
holisticode
2af24724dd swarm/network: Saturation check for healthy networks (#19071)
* swarm/network: new saturation check implementation

* swarm/network: re-added saturation func in Kademlia as it is used elsewhere

* swarm/network: saturation with higher MinBinSize

* swarm/network: PeersPerBin with depth check

* swarm/network: edited tests to pass new saturated check

* swarm/network: minor fix saturated check

* swarm/network/simulations/discovery: fixed renamed RPC call

* swarm/network: renamed to isSaturated and returns bool

* swarm/network: early depth check
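
A stand-alone sketch of the shape of such a saturation check: every
bin shallower than the current depth should hold at least MinBinSize
peers. This only illustrates the idea; the actual Kademlia code works
on its own bin structures:

    package network

    // isSaturatedSketch is a toy version of the check: given the
    // number of connected peers per proximity order bin, the node
    // counts as saturated when every bin shallower than depth holds at
    // least minBinSize peers.
    func isSaturatedSketch(peersPerBin []int, depth, minBinSize int) bool {
        if depth > len(peersPerBin) {
            depth = len(peersPerBin)
        }
        for bin := 0; bin < depth; bin++ {
            if peersPerBin[bin] < minBinSize {
                return false
            }
        }
        return true
    }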
2019-02-14 19:01:50 +01:00
Ferenc Szabo
27e3f96819 swarm: CI race detector test adjustments (#19017) 2019-02-08 17:07:11 +01:00
Ferenc Szabo
1c3aa8d9b1 swarm/storage: fix test timeout with -race by increasing mget timeout 2019-02-05 14:34:34 +01:00
Viktor Trón
15b9b39e6c
swarm/network: unskip tests previously skipped due to suggestPeer issues (#18477) 2019-01-19 08:12:57 +01:00
Viktor Trón
bcb2594151
swarm/network: rewrite of peer suggestion engine, fix skipped tests (#18404)
* swarm/network: fix skipped tests related to suggestPeer

* swarm/network: rename depth to radius

* swarm/network: uncomment assertHealth and improve comments

* swarm/network: remove commented code

* swarm/network: kademlia suggestPeer algo correction

* swarm/network: kademlia suggest peer

 * simplify suggest Peer code
 * improve peer suggestion algo
 * add comments
 * kademlia testing improvements
   * assertHealth -> checkHealth (test helper)
   * testSuggestPeer -> checkSuggestPeer (test helper)
   * remove testSuggestPeerBug and TestKademliaCase

* swarm/network: kademlia suggestPeer cleanup, improved comments

* swarm/network: minor comment, discovery test default arg
2019-01-17 07:29:34 +01:00
Elad
34f11e752f cmd/swarm/swarm-snapshot: swarm snapshot generator (#18453)
* cmd/swarm/swarm-snapshot: add binary to create network snapshots

* cmd/swarm/swarm-snapshot: refactor and extend tests

* p2p/simulations: remove unused triggerChecks func and fix linter

* internal/cmdtest: raise the timeout for killing TestCmd

* cmd/swarm/swarm-snapshot: add more comments and other minor adjustments

* cmd/swarm/swarm-snapshot: remove redundant check in createSnapshot

* cmd/swarm/swarm-snapshot: change comment wording

* p2p/simulations: revert Simulation.Run from master

https://github.com/ethersphere/go-ethereum/pull/1077/files#r247078904

* cmd/swarm/swarm-snapshot: address pr comments

* swarm/network/simulations/discovery: removed snapshot write to file

* cmd/swarm/swarm-snapshot, swarm/network/simulations: removed redundant connection event check, fixed lint error
2019-01-16 14:33:02 +01:00
lash
7240f4d800 swarm/network: Rename minproxbinsize, add as member of simulation (#18408)
* swarm/network: Rename minproxbinsize, add as member of simulation

* swarm/network: Deactivate WaitTillHealthy, unreliable pending suggestpeer
2019-01-10 12:33:51 +01:00
lash
5e4fd8e7db swarm/network: Revised depth and health for Kademlia (#18354)
* swarm/network: Revised depth calculation with tests

* swarm/network: WIP remove redundant "full" function

* swarm/network: WIP peerpot refactor

* swarm/network: Make test methods submethod of peerpot and embed kad

* swarm/network: Remove commented out code

* swarm/network: Rename health test functions

* swarm/network: Too many n's

* swarm/network: Change hive Healthy func to accept addresses

* swarm/network: Add Healthy proxy method for api in hive

* swarm/network: Skip failing test out of scope for PR

* swarm/network: Skip all tests dependent on SuggestPeers

* swarm/network: Remove commented code and useless kad Pof member

* swarm/network: Remove more unused code, add counter on depth test errors

* swarm/network: WIP Create Healthy assertion tests

* swarm/network: Roll back health related methods receiver change

* swarm/network: Hardwire network minproxbinsize in swarm sim

* swarm/network: Rework Health test to strict

Pending: add a test for saturation, and a test for as many peers as
possible up to saturation.

* swarm/network: Skip discovery tests (dependent on SuggestPeer)

* swarm/network: Remove useless minProxBinSize in stream

* swarm/network: Remove unnecessary testing.T param to assert health

* swarm/network: Implement t.Helper() in checkHealth

* swarm/network: Rename check back to assert now that we have helper magic

* swarm/network: Revert WaitTillHealthy change (deferred to nxt PR)

* swarm/network: Kademlia tests GotNN => ConnectNN

* swarm/network: Renames and comments

* swarm/network: Add comments
2018-12-22 06:53:30 +01:00
Anton Evangelatov
4c181e4fb9
swarm/state: refactor InmemoryStore (#18143) 2018-11-21 14:36:56 +01:00
lash
201a0bf181 p2p/simulations, swarm/network: Custom services in snapshot (#17991)
* p2p/simulations: Add custom services to simnodes + remove sim down conn objs

* p2p/simulation, swarm/network: Add selective services to discovery sim

* p2p/simulations, swarm/network: Remove useless comments

* p2p/simulations, swarm/network: Clean up mess from rebase

* p2p/simulation: Add sleep to prevent connect flakiness in http test

* p2p/simulations: added concurrent goroutines to prevent sleeps on simulation connect/disconnect

* p2p/simulations, swarm/network/simulations: address pr comments

* reinstated dummy service

* fixed http snapshot test
2018-11-12 14:57:17 +01:00
Felix Lange
dcae0d348b
p2p/simulations: fix a deadlock and clean up adapters (#17891)
This fixes a rare deadlock with the inproc adapter:

- A node is stopped, which acquires Network.lock.
- The protocol code being simulated (swarm/network in my case)
  waits for its goroutines to shut down.
- One of those goroutines calls into the simulation to add a peer,
  which waits for Network.lock.

The fix for the deadlock is really simple, just release the lock
before stopping the simulation node.

Other changes in this PR clean up the exec adapter so it reports
node startup errors better and remove the docker adapter because
it just adds overhead.

In the exec adapter, node information is now posted to a one-shot
server. This avoids log parsing and allows reporting startup
errors to the simulation host.

A small change in package node was needed because simulation
nodes use port zero. Node.{HTTP,WS}Endpoint now return the live
endpoints after startup by checking the TCP listener.
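
A small sketch of the port-zero trick mentioned above: listen on port
0 and read the live address back from the TCP listener. The endpoint
format is illustrative:

    package main

    import (
        "fmt"
        "net"
    )

    func main() {
        // Port 0 asks the OS for any free port, which is what the
        // simulation nodes do so many of them can share one machine.
        l, err := net.Listen("tcp", "127.0.0.1:0")
        if err != nil {
            panic(err)
        }
        defer l.Close()

        // The listener knows which port was actually assigned, so a
        // working endpoint can be reported after startup.
        addr := l.Addr().(*net.TCPAddr)
        fmt.Printf("http://%s:%d\n", addr.IP, addr.Port)
    }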
2018-10-11 20:32:14 +02:00
Felix Lange
30cd5c1854
all: new p2p node representation (#17643)
Package p2p/enode provides a generalized representation of p2p nodes
which can contain arbitrary information in key/value pairs. It is also
the new home for the node database. The "v4" identity scheme is also
moved here from p2p/enr to remove the dependency on Ethereum crypto from
that package.

Record signature handling is changed significantly. The identity scheme
registry is removed and acceptable schemes must be passed to any method
that needs identity. This means records must now be validated explicitly
after decoding.

The enode API is designed to make signature handling easy and safe: most
APIs around the codebase work with enode.Node, which is a wrapper around
a valid record. Going from enr.Record to enode.Node requires a valid
signature.
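
A sketch of how that reads in code, assuming the p2p/enr and p2p/enode
APIs described above: build a record, sign it with the "v4" scheme,
then validate it by wrapping it in an enode.Node with an explicitly
supplied set of acceptable schemes:

    package main

    import (
        "fmt"
        "net"

        "github.com/ethereum/go-ethereum/crypto"
        "github.com/ethereum/go-ethereum/p2p/enode"
        "github.com/ethereum/go-ethereum/p2p/enr"
    )

    func main() {
        key, err := crypto.GenerateKey()
        if err != nil {
            panic(err)
        }

        // Build a record with a few key/value pairs and sign it with
        // the "v4" identity scheme.
        var r enr.Record
        r.Set(enr.IP(net.ParseIP("127.0.0.1")))
        r.Set(enr.TCP(30303))
        if err := enode.SignV4(&r, key); err != nil {
            panic(err)
        }

        // Going from enr.Record to enode.Node validates the signature
        // against the acceptable schemes passed in explicitly.
        n, err := enode.New(enode.ValidSchemes, &r)
        if err != nil {
            panic(err)
        }
        fmt.Println(n.ID(), n.IP(), n.TCP())
    }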

* p2p/discover: port to p2p/enode

This ports the discovery code to the new node representation in
p2p/enode. The wire protocol is unchanged, this can be considered a
refactoring change. The Kademlia table can now deal with nodes using an
arbitrary identity scheme. This requires a few incompatible API changes:

  - Table.Lookup is not available anymore. It used to take a public key
    as argument because v4 protocol requires one. Its replacement is
    LookupRandom.
  - Table.Resolve takes *enode.Node instead of NodeID. This is also for
    v4 protocol compatibility because nodes cannot be looked up by ID
    alone.
  - Types Node and NodeID are gone. Further commits in the series will be
    fixes all over the codebase to deal with those removals.

* p2p: port to p2p/enode and discovery changes

This adapts package p2p to the changes in p2p/discover. All uses of
discover.Node and discover.NodeID are replaced by their equivalents from
p2p/enode.

New API is added to retrieve the enode.Node instance of a peer. The
behavior of Server.Self with discovery disabled is improved. It now
tries much harder to report a working IP address, falling back to
127.0.0.1 if no suitable address can be determined through other means.
These changes were needed for tests of other packages later in the
series.

* p2p/simulations, p2p/testing: port to p2p/enode

No surprises here, mostly replacements of discover.Node, discover.NodeID
with their new equivalents. The 'interesting' API changes are:

 - testing.ProtocolSession tracks complete nodes, not just their IDs.
 - adapters.NodeConfig has a new method to create a complete node.

These changes were needed to make swarm tests work.

Note that the NodeID change makes the code incompatible with old
simulation snapshots.

* whisper/whisperv5, whisper/whisperv6: port to p2p/enode

This port was easy because whisper uses []byte for node IDs and
URL strings in the API.

* eth: port to p2p/enode

Again, easy to port because eth uses strings for node IDs and doesn't
care about node information in any way.

* les: port to p2p/enode

Apart from replacing discover.NodeID with enode.ID, most changes are in
the server pool code. It now deals with complete nodes instead
of (Pubkey, IP, Port) triples. The database format is unchanged for now,
but we should probably change it to use the node database later.

* node: port to p2p/enode

This change simply replaces discover.Node and discover.NodeID with their
new equivalents.

* swarm/network: port to p2p/enode

Swarm has its own node address representation, BzzAddr, containing both
an overlay address (the hash of a secp256k1 public key) and an underlay
address (enode:// URL).

There are no changes to the BzzAddr format in this commit, but certain
operations such as creating a BzzAddr from a node ID are now impossible
because node IDs aren't public keys anymore.

Most swarm-related changes in the series remove uses of
NewAddrFromNodeID, replacing it with NewAddr which takes a complete node
as argument. ToOverlayAddr is removed because we can just use the node
ID directly.
2018-09-25 00:59:00 +02:00
Viktor Trón
bfce00385f Kademlia refactor (#17641)
* swarm/network: simplify kademlia/hive; rid interfaces

* swarm, swarm/network/stream, swarm/network/simulations, swarm/pss: adapt to new Kad API

* swarm/network: minor changes re review; add missing lock to NeighbourhoodDepthC
2018-09-12 11:24:56 +02:00
ethersphere
e187711c65 swarm: network rewrite merge 2018-06-21 21:10:31 +02:00