Tests in CI are flakey #142

Open
opened 2024-01-23 17:45:45 +00:00 by telackey · 12 comments
Member

Here is an example failure for test-rpc:

=== RUN   TestEth_GetFilterChanges_NoTopics
    rpc_test.go:295: 
        	Error Trace:	/workspace/cerc-io/laconicd/tests/rpc/rpc_test.go:295
        	            				/workspace/cerc-io/laconicd/tests/rpc/rpc_test.go:367
        	Error:      	Expected value not to be nil.
        	Test:       	TestEth_GetFilterChanges_NoTopics
        	Messages:   	transaction failed
--- FAIL: TestEth_GetFilterChanges_NoTopics (16.62s)
=== RUN   TestEth_GetFilterChanges_Topics_AB
    rpc_test.go:454: 
        	Error Trace:	/workspace/cerc-io/laconicd/tests/rpc/rpc_test.go:454
        	Error:      	Not equal: 
        	            	expected: 1
        	            	actual  : 2
        	Test:       	TestEth_GetFilterChanges_Topics_AB
--- FAIL: TestEth_GetFilterChanges_Topics_AB (19.72s)

There is nothing new that is broken here, and re-running the test passed without issue.

The test seems to depend on performance or some other environmental factor, so it sometimes passes and sometimes fails depending on which task runner it is assigned to, what other actions are running on the machine, etc.

Author
Member

Another example:

=== RUN   TestEth_GetFilterChanges_Topics_AB
    rpc_test.go:420: 
        	Error Trace:	/workspace/cerc-io/laconicd/tests/rpc/rpc_test.go:420
        	            				/workspace/cerc-io/laconicd/tests/rpc/rpc_test.go:445
        	Error:      	Expected value not to be nil.
        	Test:       	TestEth_GetFilterChanges_Topics_AB
        	Messages:   	transaction failed
--- FAIL: TestEth_GetFilterChanges_Topics_AB (16.58s)
=== RUN   TestEth_GetFilterChanges_Topics_XB
    rpc_test.go:420: 
        	Error Trace:	/workspace/cerc-io/laconicd/tests/rpc/rpc_test.go:420
        	            				/workspace/cerc-io/laconicd/tests/rpc/rpc_test.go:475
        	Error:      	Expected value not to be nil.
        	Test:       	TestEth_GetFilterChanges_Topics_XB
        	Messages:   	transaction failed
--- FAIL: TestEth_GetFilterChanges_Topics_XB (16.35s)
Owner

Can we get the test time for the passed case?

Author
Member

SDK tests:

    github.com/tendermint/tendermint/mempool/v0.(*CListMempool).CheckTx
    	github.com/tendermint/tendermint@v0.34.24/mempool/v0/clist_mempool.go:254
    account sequence mismatch, expected 8, got 7: incorrect account sequence
Owner

Some parallelism thing?

Author
Member

> Can we get the test time for the passed case?

=== RUN   TestEth_GetFilterChanges_NoTopics
--- PASS: TestEth_GetFilterChanges_NoTopics (6.66s)
=== RUN   TestEth_GetFilterChanges_Topics_AB
--- PASS: TestEth_GetFilterChanges_Topics_AB (9.68s)
=== RUN   TestEth_GetFilterChanges_Topics_XB
--- PASS: TestEth_GetFilterChanges_Topics_XB (9.68s)

The overall runtime was 4m23s.

Owner

Definitely shorter times than the failure case. Is there some global timeout we can set?

Owner

Also: what's it doing for 20 seconds?

Owner

Another idea: could we just try a faster runner and see if that fixes the problem?

Author
Member

I'm all for trying that; we just need to know where to put it.

Owner

What does it entail? Could we run it temporarily on one of the big servers?

Author
Member

It is fairly simple to spin one up; we'd need to give it a unique tag and alter the workflows accordingly.

That would also have the side effect of queueing all the tasks on the same runner to run sequentially, which might improve the performance of individual tests compared with multiple jobs running on the same host machine at the same time.

I'm not sure how that works long term, because it is pretty inefficient, but it might be useful for diagnostic purposes.

Owner

My thinking was to just try it initially and see if it makes the tests reliable. Then we can think about next steps.

Reference: cerc-io/laconicd-deprecated#142