50 Commits

Author SHA1 Message Date
Łukasz Magiera
7117a8d80d fix lint 2022-05-27 16:15:52 +02:00
Łukasz Magiera
26a0b43116 Merge remote-tracking branch 'origin/master' into feat/worker-task-count-limits 2022-05-27 16:01:32 +02:00
Łukasz Magiera
083c7421ce feat: sched: Worker task count limits for all task types 2022-05-25 16:31:26 +02:00
Łukasz Magiera
b576008e87 sched: Strong preferrence in WorkerSelector 2022-05-23 23:28:55 +02:00
Łukasz Magiera
5ba8bd3b99 sched: Configurable assigners 2022-05-23 22:02:39 +02:00
Łukasz Magiera
588b8ecbca sched: Separate WindowSelector func 2022-05-23 22:02:39 +02:00
Łukasz Magiera
9ac19cb14b feat: sealing: Put scheduler assign logic behind an interface 2022-05-23 22:02:39 +02:00
unknown
c4cfb7a296 scheduling optimization 2022-05-20 17:04:51 +08:00
Łukasz Magiera
80133aaa79 feat: sched: Improve worker assigning logic 2022-04-06 18:24:14 -04:00
Łukasz Magiera
6de4e3d4cd feat: sched: Cache worker tasks 2022-04-06 18:24:14 -04:00
Łukasz Magiera
c9a2ff4007 cleanup worker resource overrides 2021-11-30 02:06:58 +01:00
Clint Armstrong
93e4656a27 Use a float to represent GPU utilization
Before this change workers can only be allocated one GPU task,
regardless of how much of the GPU resources that task uses, or how many
GPUs are in the system.

This makes GPUUtilization a float which can represent that a task needs
a portion, or multiple GPUs. GPUs are accounted for like RAM and CPUs so
that workers with more GPUs can be allocated more tasks.

A known issue is that PC2 cannot use multiple GPUs. And even if the
worker has multiple GPUs and is allocated multiple PC2 tasks, those
tasks will only run on the first GPU.

This could result in unexpected behavior when a worker with multiple
GPUs is assigned multiple PC2 tasks. But this should not surprise any
existing users who upgrade, as any existing users who run workers with
multiple GPUs should already know this and be running a worker per GPU
for PC2. But now those users have the freedom to customize the GPU
utilization of PC2 to be less than one and effectively run multiple PC2
processes in a single worker.

C2 is capable of utilizing multiple GPUs, and now workers can be
customized for C2 accordingly.
2021-11-30 02:06:58 +01:00
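
The accounting described in commit 93e4656a27 above could look roughly like the following. This is a minimal sketch, not the code from that commit: only the name GPUUtilization comes from the message, while the Resources and Worker types, their fields, and the CanHandle, Add, and Free methods are hypothetical names chosen for illustration. It shows GPU demand being summed and compared against the GPU count, the same way RAM or CPU demand would be.

```go
package main

import "fmt"

// Resources is a hypothetical per-task requirement. Only the field name
// GPUUtilization is taken from the commit message.
type Resources struct {
	// GPUUtilization is the share of GPU capacity a task needs:
	// 0.5 means half of one GPU, 2.0 means two whole GPUs.
	GPUUtilization float64
}

// Worker is a hypothetical view of one worker's GPU capacity.
type Worker struct {
	GPUs    int     // number of GPUs present on the worker
	gpuUsed float64 // sum of GPUUtilization over accepted tasks
}

// CanHandle reports whether the task fits into the remaining GPU capacity.
// Tasks that need no GPU always fit as far as GPU accounting is concerned.
func (w *Worker) CanHandle(need Resources) bool {
	if need.GPUUtilization == 0 {
		return true
	}
	return w.gpuUsed+need.GPUUtilization <= float64(w.GPUs)
}

// Add reserves GPU capacity for a task and reports whether it was accepted.
func (w *Worker) Add(need Resources) bool {
	if !w.CanHandle(need) {
		return false
	}
	w.gpuUsed += need.GPUUtilization
	return true
}

// Free releases the capacity reserved by a finished task.
func (w *Worker) Free(need Resources) {
	w.gpuUsed -= need.GPUUtilization
}

func main() {
	w := &Worker{GPUs: 2}
	half := Resources{GPUUtilization: 0.5} // e.g. a PC2 task tuned to half a GPU

	// Keep offering half-GPU tasks until the worker refuses one.
	accepted := 0
	for w.Add(half) {
		accepted++
	}
	fmt.Println("accepted tasks:", accepted)
}
```

With an integer or boolean flag the worker in this sketch could take at most one GPU task. With a float it takes as many half-GPU tasks as fit into two GPUs. The caveat from the commit message still applies: the sketch only counts capacity and says nothing about which physical GPU a task actually runs on.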
Łukasz Magiera
70589e4406 Block work in tracked worker before it is started 2021-10-18 18:44:56 +02:00
Łukasz Magiera
11d738eee0 Track prepared work 2021-10-18 18:44:56 +02:00
Łukasz Magiera
b87142ec8e wip improve scheduling of ready work 2021-10-03 10:38:08 +02:00
Łukasz Magiera
ef03314c6d storagemgr: Cleanup workerLk around worker resources 2021-09-15 16:35:19 +02:00
Raúl Kripalani
59eab2df25 move scheduling filtering logic down. 2021-06-21 20:49:16 +01:00
Raúl Kripalani
f3b6f8de1a add ability to ignore worker resources when scheduling. 2021-06-21 20:08:18 +01:00
Łukasz Magiera
a63ef1dcd5
Merge pull request #4984 from yaohcn/fix-log-warn
fix log format
2020-11-24 18:01:56 +01:00
yaohcn
7c0b6f41d8 fix log format 2020-11-24 19:09:48 +08:00
zgfzgf
b6893b0a3f solve merage problem 2020-11-22 16:15:30 +08:00
Łukasz Magiera
6bea9dd178 Making sealing logic work with multiple seal proof types 2020-11-16 19:03:30 +01:00
Łukasz Magiera
f90a387f96 sched: Print worker UUIDs in shed-diag correctly 2020-10-30 18:32:16 +01:00
Łukasz Magiera
96c5ff7e7f sched: use more letters for variables 2020-10-28 14:23:38 +01:00
Łukasz Magiera
84b567c790 sched: move worker funcs to a separate file 2020-10-28 13:39:28 +01:00
Łukasz Magiera
8d06cca073 sched: Handle workers using sessions instead of connections 2020-10-18 12:36:06 +02:00
Łukasz Magiera
79d2ddf24f Review 2020-09-30 21:18:12 +02:00
zgfzgf
1a7aea1906 modify error 2020-09-25 22:59:21 +08:00
zgfzgf
3207bc4704 optimize trySched 2020-09-25 22:41:29 +08:00
zgfzgf
60e950015c modify for unsafe 2020-09-25 22:13:27 +08:00
Łukasz Magiera
86c222ab58 sectorstorage: fix work tracking 2020-09-23 14:56:50 +02:00
Łukasz Magiera
d9d644b27f sectorstorage: handle restarting manager, test that 2020-09-17 00:35:09 +02:00
Aayush Rajasekaran
39755a294a Update to specs v0.9.6 2020-09-07 15:48:41 -04:00
Łukasz Magiera
5a2b439773 sched: Fix tests 2020-09-02 17:37:19 +02:00
Łukasz Magiera
7fe8580da5 sealing sched: Fix deadlock between worker.wndLk / workersLk 2020-09-02 17:06:48 +02:00
Łukasz Magiera
e14c80360d sealing sched: Factor worker queues into utilization calc 2020-08-31 13:41:34 +02:00
Łukasz Magiera
28ac2fce61 sched: Fix panic in workerCompactWindows 2020-08-29 06:41:19 +02:00
Łukasz Magiera
9d0c8ae3dd sectorstorage: update sched tests for new logic 2020-08-28 21:38:21 +02:00
Łukasz Magiera
4a75e1e4b4 sectorstorage: Don't require tasks within a window to run in order 2020-08-28 19:38:55 +02:00
Łukasz Magiera
11b11e416b sectorstorage: Compact assigned windows 2020-08-28 18:26:38 +02:00
Łukasz Magiera
6d1682a27e storagefsm: wire up RecoverDealIDs fully 2020-08-28 11:44:15 +02:00
Łukasz Magiera
1097d29213 sealing sched: Call trySched less when there are many tasks 2020-08-28 00:03:42 +02:00
Łukasz Magiera
59d2034cbb sealing sched: Wait a bit for tasks to come in on restart 2020-08-27 23:58:37 +02:00
Łukasz Magiera
f2bd680cc5 gofmt 2020-08-27 23:14:46 +02:00
Łukasz Magiera
59f554b658 sealing sched: Show waiting tasks assigned to workers in sealing jobs cli 2020-08-27 23:14:33 +02:00
Łukasz Magiera
d9796cd25c sectorstorage: Make trySched less very slow 2020-08-24 19:16:16 +02:00
whyrusleeping
54862be3ff check that worker referenced by task is actually still there. 2020-08-21 10:33:36 -07:00
Raúl Kripalani
efdc428d5d keep storage-fsm (renamed to storage-sealing) and sector-storage in extern. 2020-08-17 14:26:18 +01:00
Raúl Kripalani
3c17cd655e integrate extern/sector-storage into lotus proper. 2020-08-16 11:09:58 +01:00
Łukasz Magiera
0eaf44eb31 Merge sector-storage subtree 2020-08-10 17:25:46 +02:00