Tests in CI are flakey #142
Reference: cerc-io/laconicd-deprecated#142
Here is an example failure for `test-rpc`. There is nothing new broken here, and re-running the test passed without issue.
The test seems to be dependent on performance or some other environmental factor, meaning it sometimes passes and sometimes fails depending on which task runner it gets assigned to, what other actions are running on the machine, etc.
Another example:
Can we get the test time for the passed case?
SDK tests:
Some parallelism thing?
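If parallelism is the suspect, one cheap diagnostic is to run the suite one package at a time. This is a sketch, assuming the SDK tests are ordinary Go tests invoked from a workflow step (the step name is made up); `-p 1` is the standard `go test` flag that limits package-level parallelism:

```yaml
# Hypothetical workflow step: run the tests with package-level
# parallelism disabled, to rule out cross-package interference.
- name: Run SDK tests serially
  run: go test -p 1 ./...
```

If the failures disappear under `-p 1`, that points at contention between test packages rather than the runner hardware.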
With an overall runtime of 4m23s
Definitely shorter times than the failure case. Is there some global timeout we can set?
Also: what is it doing for 20 seconds??
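On the global-timeout question: GitHub-Actions-style workflows (which Gitea actions also follow) support a per-job `timeout-minutes`. A sketch, assuming a job named `test-rpc` (matching the failing check above) and an illustrative limit:

```yaml
jobs:
  test-rpc:
    runs-on: ubuntu-latest
    # Fail the job if it runs longer than 10 minutes. The passing runs
    # above finish in about 4m23s, so 10 leaves headroom; the exact
    # value here is illustrative.
    timeout-minutes: 10
    steps:
      # ...existing test steps...
```

This would at least turn a hung run into a fast, visible failure instead of a stalled check.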
Another idea: could we just try a faster runner and see if that fixes the problem?
I'm all for trying that, we just need to know where to put it.
What does it entail? Could we run it temporarily on one of the big servers?
It is fairly simple to spin one up, and we'd need to give it some sort of unique tag and alter the workflows accordingly.
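Targeting a tagged runner is a one-line change per job. A sketch, assuming we register the big server with a made-up label `big-server`:

```yaml
jobs:
  test-rpc:
    # `big-server` is a hypothetical label we'd assign when registering
    # the dedicated runner; only runners carrying this label will pick
    # up the job.
    runs-on: big-server
```

Reverting the experiment later is just changing `runs-on` back.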
That would also have the side effect of queueing all the tasks on the same runner to run sequentially, which might improve the performance of individual tests compared to having multiple jobs running on the same host machine at the same time.
I'm not sure how that works long term, because it is pretty inefficient, but it might be useful for diagnostic purposes.
My thinking was to just try it initially and see if it makes the tests reliable. Then we can think about next steps.