lotus

Author	SHA1	Message	Date
Łukasz Magiera	212f5ddb4f	wip FinalizeReplicaUpdate	2022-02-10 17:24:26 -05:00
Aayush Rajasekaran	a3c5fadcc0	feat: sealing: Add ReplicaUpdate work to Resource table	2022-01-25 13:01:05 -05:00
zenground0	d1480c36c0	RemoveData and Decode - Unsealing replica update with sector key works and tested - Sector key generation added and tested	2022-01-14 17:14:32 -05:00
zenground0	93656e65f8	WIP sector storage and integration test	2022-01-14 17:14:32 -05:00
Łukasz Magiera	71329f6c41	Address Scheduler enhancements (#7703) review	2021-11-30 20:50:40 +01:00
Łukasz Magiera	f25efecb74	worker: Test resource table overrides	2021-11-30 02:06:58 +01:00
Łukasz Magiera	6d52d8552b	Fix docsgen	2021-11-30 02:06:58 +01:00
Łukasz Magiera	c9a2ff4007	cleanup worker resource overrides	2021-11-30 02:06:58 +01:00
Clint Armstrong	4ef8543128	Permit workers to override resource table. In an environment with heterogeneous worker nodes, a universal resource table for all workers does not allow effective scheduling of tasks. Some workers may have different proof cache settings, changing the required memory for different tasks. Some workers may have a different count of CPUs per core-complex, changing the max parallelism of PC1. This change allows workers to customize these parameters with environment variables. For example, a worker could set the environment variable PC1_MIN_MEMORY to customize the minimum memory requirement for PC1 tasks. If no environment variables are specified, the resource table on the miner is used, except for PC1 parallelism. If PC1_MAX_PARALLELISM is not specified, and FIL_PROOFS_USE_MULTICORE_SDR is set, PC1_MAX_PARALLELISM will automatically be set to FIL_PROOFS_MULTICORE_SDR_PRODUCERS + 1.	2021-11-30 02:06:58 +01:00
Clint Armstrong	93e4656a27	Use a float to represent GPU utilization. Before this change, workers can only be allocated one GPU task, regardless of how much of the GPU resources that task uses, or how many GPUs are in the system. This makes GPUUtilization a float, which can represent that a task needs a portion of a GPU, or multiple GPUs. GPUs are accounted for like RAM and CPUs so that workers with more GPUs can be allocated more tasks. A known issue is that PC2 cannot use multiple GPUs: even if the worker has multiple GPUs and is allocated multiple PC2 tasks, those tasks will only run on the first GPU. This could result in unexpected behavior when a worker with multiple GPUs is assigned multiple PC2 tasks. But this should not surprise any existing users who upgrade, as any existing users who run workers with multiple GPUs should already know this and be running a worker per GPU for PC2. But now those users have the freedom to customize the GPU utilization of PC2 to be less than one and effectively run multiple PC2 processes in a single worker. C2 is capable of utilizing multiple GPUs, and now workers can be customized for C2 accordingly.	2021-11-30 02:06:58 +01:00
Clint Armstrong	c4f46171ae	Report memory used and swap used in worker res. Attempting to report "memory used by other processes" in the MemReserved field fails to take into account the fact that the system's memory used includes memory used by ongoing tasks. To properly account for this, the worker should report the memory and swap used; the scheduler, which is aware of the memory requirements for a task, can then determine if there is sufficient memory available for a task.	2021-11-30 02:06:58 +01:00
Łukasz Magiera	261238e157	Show prepared tasks in sealing jobs	2021-10-18 18:44:56 +02:00
Aarsh Shah	d7076778e2	integrate DAG store and CARv2 in deal-making (#6671). This commit removes badger from the deal-making processes, and moves to a new architecture with the dagstore as the central component on the miner-side, and CARv2s on the client-side. Every deal that has been handed off to the sealing subsystem becomes a shard in the dagstore. Shards are mounted via the LotusMount, which teaches the dagstore how to load the related piece when serving retrievals. When the miner starts Lotus for the first time with this patch, we will perform a one-time migration of all active deals into the dagstore. This is a lightweight process, and it consists simply of registering the shards in the dagstore. Shards are backed by the unsealed copy of the piece. This is currently a CARv1. However, the dagstore keeps CARv2 indices for all pieces, so when it's time to acquire a shard to serve a retrieval, the unsealed CARv1 is joined with its index (safeguarded by the dagstore), to form a read-only blockstore, thus taking the place of the monolithic badger. Data transfers have been adjusted to interface directly with CARv2 files. On inbound transfers (client retrievals, miner storage deals), we stream the received data into a CARv2 ReadWrite blockstore. On outbound transfers (client storage deals, miner retrievals), we serve the data off a CARv2 ReadOnly blockstore. Client-side imports are managed by the refactored imports.Manager component (when not using IPFS integration). Just like before, we use the go-filestore library to avoid duplicating the data from the original file in the resulting UnixFS DAG (concretely the leaves). However, the targets of those imports are what we call "ref-CARv2s": CARv2 files placed under the `$LOTUS_PATH/imports` directory, containing the intermediate nodes in full, and the leaves as positional references to the original file on disk. Client-side retrievals are placed into CARv2 files in the location: `$LOTUS_PATH/retrievals`. A new set of `Dagstore` JSON-RPC operations and `lotus-miner dagstore` subcommands have been introduced on the miner-side to inspect and manage the dagstore. Despite moving to a CARv2-backed system, the IPFS integration has been respected, and it continues to be possible to make storage deals with data held in an IPFS node, and to perform retrievals directly into an IPFS node. NOTE: because the "staging" and "client" Badger blockstores are no longer used, existing imports on the client will be rendered useless. On startup, Lotus will enumerate all imports and print WARN statements on the log for each import that needs to be reimported. These log lines contain these messages: - import lacks carv2 path; import will not work; please reimport - import has missing/broken carv2; please reimport At the end, we will print a "sanity check completed" message indicating the count of imports found, and how many were deemed broken. Co-authored-by: Aarsh Shah <aarshkshah1992@gmail.com> Co-authored-by: Dirk McCormick <dirkmdev@gmail.com> Co-authored-by: Raúl Kripalani <raul@protocol.ai>	2021-08-16 23:34:32 +01:00
Raúl Kripalani	f3b6f8de1a	add ability to ignore worker resources when scheduling.	2021-06-21 20:08:18 +01:00
aarshkshah1992	ad4b182bfe	remove read task type and run gen and docsgen	2021-06-07 15:03:06 +05:30
aarshkshah1992	670835fca0	bypass task scheduler for reading unsealed pieces	2021-06-07 15:02:04 +05:30
Anton Evangelatov	e07438417c	consider storiface.PathStorage when calculating storage requirements	2021-05-11 13:19:26 +02:00
Łukasz Magiera	26399dba70	Update markets, cbor-gen with soft map decoding	2021-02-19 20:11:43 +01:00
Łukasz Magiera	289ef910a0	fix imports, docsgen	2020-12-02 00:39:55 +01:00
Łukasz Magiera	95eaf13b5a	sectorstorage: Fix tests	2020-12-02 00:36:32 +01:00
Łukasz Magiera	b242d69805	Make storiface.CallError json-friendly	2020-11-17 16:28:41 +01:00
Łukasz Magiera	b8853aa4d5	Add error codes to worker return	2020-11-17 16:17:55 +01:00
Łukasz Magiera	6bea9dd178	Making sealing logic work with multiple seal proof types	2020-11-16 19:03:30 +01:00
Łukasz Magiera	5caa277261	storage: Track abandoned work more correctly	2020-11-09 23:38:20 +01:00
Łukasz Magiera	f819e71d12	storage: Separate returned jobs in jobs cli	2020-11-09 23:13:29 +01:00
Łukasz Magiera	27a9dd3bbb	storage: Track worker hostnames with work	2020-11-09 23:09:04 +01:00
Łukasz Magiera	660236b224	Merge remote-tracking branch 'origin/master' into feat/async-restartable-workers	2020-10-23 23:25:35 +02:00
Łukasz Magiera	8d06cca073	sched: Handle workers using sessions instead of connections	2020-10-18 12:36:06 +02:00
Łukasz Magiera	d817dceb05	Show lost calls in sealing jobs cli	2020-09-23 19:26:35 +02:00
Łukasz Magiera	5e09581256	sectorstorage: get new work tracker to run	2020-09-16 22:33:58 +02:00
Łukasz Magiera	1ebca8f732	more working code	2020-09-14 19:09:01 +02:00
Łukasz Magiera	5f08fe7ead	Merge remote-tracking branch 'origin/master' into feat/async-restartable-workers	2020-09-10 17:30:54 +02:00
Aayush Rajasekaran	39755a294a	Update to specs v0.9.6	2020-09-07 15:48:41 -04:00
Łukasz Magiera	9e6f974f3c	storage: Fix build	2020-09-07 16:12:55 +02:00
Łukasz Magiera	5d73943929	storage: Fix import cycle	2020-09-06 18:54:00 +02:00
Łukasz Magiera	159ce13f5e	Async worker API	2020-09-06 18:47:16 +02:00
Łukasz Magiera	11b11e416b	sectorstorage: Compact assigned windows	2020-08-28 18:26:38 +02:00
Łukasz Magiera	59f554b658	sealing sched: Show waiting tasks assigned to workers in sealing jobs cli	2020-08-27 23:14:33 +02:00
Raúl Kripalani	efdc428d5d	keep storage-fsm (renamed to storage-sealing) and sector-storage in extern.	2020-08-17 14:26:18 +01:00
Raúl Kripalani	3c17cd655e	integrate extern/sector-storage into lotus proper.	2020-08-16 11:09:58 +01:00
Łukasz Magiera	0eaf44eb31	Merge sector-storage subtree	2020-08-10 17:25:46 +02:00

41 Commits