Commit Graph

13 Commits

Author SHA1 Message Date
Hector Sanjuan
1bb698619c Ranged-export: Remove CachingBlockstore
The improvements in the range-export code lead to avoid reading most blocks
twice, as well as to allowing some blocks to be written to disk multiple times.

The cache hit-rate went down from being close to 50% to a maximum of 12% at
the very end of the export. The reason is that most CIDs are never read twice
since they are correctly tracked in the CID set.

These numbers do not support the maintenance of the CachingBlockstore
code. Additional testing shows that removing it has similar memory-usage
behaviour and about 5 minute-faster execution (around 10%).

Less code to maintain and less options to mess up with.
2023-02-14 21:08:10 +01:00
Hector Sanjuan
fa93c23813 Chain ranged export: rework and address current shortcomings
This commit moderately refactors the ranged export code. It addresses several
problems:
  * Code does not finish cleanly and things hang on ctrl-c
  * Same block is read multiple times in a row (artificially increasing cached
    blockstore metrics to 50%)
  * It is unclear whether there are additional races (a single worker quits
    when reaching height 0)
  * CARs produced have duplicated blocks (~400k for an 80M-blocks CAR or
    so). Some blocks appear up to 5 times.
  * Using pointers for tasks where it is not necessary.

The changes:

  * Use a FIFO instead of stack: simpler implementation as its own type. This
has not proven to be much more memory-friendly, but it has not made things
worse either.
  * We avoid a probably not small amount of allocations by not using
    unnecessary pointers.
  * Fix duplicated blocks by atomically checking+adding to CID set.
  * Context-termination now works correctly. Worker lifetime is correctly tracked and all channels
  are closed, avoiding any memory leaks and deadlocks.
  * We ensure all work is finished before finishing, something that might have
  been broken in some edge cases previously. In practice, we would not have
  seen this except perhaps in very early snapshots close to genesis.

Initial testing shows the code is currently about 5% faster. Resulting
snapshots do not have duplicates so they are a bit smaller. We have manually
verified that no CID is lost versus previous results, with both old and recent
snapshots.
2023-02-14 21:08:10 +01:00
frrist
21efd481d8 First efficient ranged-export implementation by @frisst
This first commit contains the first and second implementation stabs (after
primary review by @hsanjuan), using a stack for task buffering.

Known issues: ctrl-c (context cancellation) results in the export code getting
deadlocked. Duplicate blocks in exports. Duplicate block reads from store.

Original commit messages:

works

works against mainnet and calibnet

feat: add internal export api method

- will hopfully make things faster by not streaming the export over the json rpc api

polish: better file nameing

fix: potential race in marking cids as seen

chore: improve logging

feat: front export with cache

fix: give hector a good channel buffer on this shit

docsgen
2023-02-14 15:41:10 +01:00
Jorropo
f572852d06 chore: all: bump go-libipfs to replace go-block-format
Includes changes from:
- https://github.com/ipfs/go-block-format/pull/37
- https://github.com/ipfs/go-libipfs/pull/58
2023-01-26 17:03:18 +01:00
Geoff Stuart
50f26e9721 Fix testground build 2023-01-20 13:27:04 -05:00
Shrenuj Bansal
ee54a7f3f5
feat: snapshot: Store tipset key cids in chain store during snapshot import ()
* Store tipset key cids in chain store during snapshot import

* make gen

* fix circle ci config

* fix lint

* address comments
2023-01-18 11:22:05 -05:00
Łukasz Magiera
ac8ab3ef9e feat: chain: Faster snapshot imports, zstd imports 2022-11-29 14:10:15 +01:00
Łukasz Magiera
e65fae28de chore: fix imports 2022-06-14 17:00:51 +02:00
Steven Allen
f491e39f22 fix: vm: support raw blocks in chain export
We need this for NV16 to include code in chain snapshots.

NOTE: I've also checked the splitstore, and we appear to be doing the
right thing there already.
2022-05-20 18:56:30 -04:00
Łukasz Magiera
84dbb229b6 shed: blockstore/vlog to car export cmds 2022-03-09 10:21:36 +01:00
vyzo
dd327f0b22 plumb more contexts 2021-12-17 11:42:09 +02:00
Aayush Rajasekaran
dfb65ed89f Plumb contexts through 2021-12-11 17:04:00 -05:00
Łukasz Magiera
6ba42e3e6b chainstore: refactor store.go into more subfiles 2021-07-27 14:41:36 +02:00