vyzo
1f2b604c07
RIP tracking store
2021-07-04 18:38:28 +03:00
vyzo
d476a3db2c
BlockstoreIterator trait with implementation for badger
2021-07-04 18:38:28 +03:00
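The commit above introduces key iteration for the badger-backed blockstore. A minimal sketch of what such an interface and implementation could look like follows; the names BlockstoreIterator and ForEachKey are illustrative assumptions, not necessarily the actual lotus API.

    package blockstore

    import (
        badger "github.com/dgraph-io/badger/v2"
    )

    // BlockstoreIterator enumerates every key in a blockstore without
    // loading the associated values.
    type BlockstoreIterator interface {
        ForEachKey(f func(key []byte) error) error
    }

    type badgerIterator struct {
        db *badger.DB
    }

    // ForEachKey walks the badger key space in a read-only transaction,
    // invoking f for each key; values are never prefetched.
    func (b *badgerIterator) ForEachKey(f func(key []byte) error) error {
        return b.db.View(func(txn *badger.Txn) error {
            opts := badger.DefaultIteratorOptions
            opts.PrefetchValues = false
            it := txn.NewIterator(opts)
            defer it.Close()

            for it.Rewind(); it.Valid(); it.Next() {
                // KeyCopy: the key is only valid until Next is called.
                if err := f(it.Item().KeyCopy(nil)); err != nil {
                    return err
                }
            }
            return nil
        })
    }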
vyzo
68a83500bc
fix bug that turned candidate filtering to dead code
2021-07-04 18:38:28 +03:00
vyzo
00fcf6dd72
add staging cache to bolt tracking store
2021-07-04 18:38:28 +03:00
vyzo
642f0e4740
deal with memory pressure, don't walk under the boundary
2021-07-04 18:38:28 +03:00
vyzo
c5cf8e226b
remove unnecessary code
2021-07-04 18:38:28 +03:00
vyzo
d79e4da7aa
more accurate stats about mark set updates
2021-07-04 18:38:28 +03:00
vyzo
6f58fdcb22
remove vm copy context detection hack
...
stack tracing is slow.
2021-07-04 18:38:28 +03:00
vyzo
2b03316cd9
fix log message
2021-07-04 18:38:28 +03:00
vyzo
184d3802b6
remove dead code
2021-07-04 18:38:28 +03:00
vyzo
228a435ba7
rework tracking logic; do it lazily and far more efficiently
2021-07-04 18:38:28 +03:00
vyzo
9d6cabd18a
if it's not a dag, it's not a block
2021-07-04 18:38:28 +03:00
vyzo
8157f889ce
short-circuit marking walks when encountering a block and more efficient walking
2021-07-04 18:38:28 +03:00
vyzo
736d6a3c19
only treat Has as an implicit write within vm.Copy context
2021-07-04 18:38:28 +03:00
vyzo
39723bbe60
use a single map for tracking pending writes, properly track implicits
2021-07-04 18:38:28 +03:00
vyzo
5834231e58
create the transactional protect filter before walking
2021-07-04 18:38:28 +03:00
vyzo
e4bb4be855
fix some residual purge races
2021-07-04 18:38:28 +03:00
vyzo
68bc5d2291
skip moving cold blocks when running with a noop coldstore
...
it is a noop but it still takes (a lot of) time because it has to read all the cold blocks.
2021-07-04 18:38:28 +03:00
vyzo
b87295db93
bubble up dependent txn ref errors
...
This causes Has to return false if it fails to traverse/protect all links, which would cause
the vm to recompute.
2021-07-04 18:38:28 +03:00
vyzo
637fbf6c5b
fix faulty if/else logic for implicit txn protection
2021-07-04 18:38:28 +03:00
vyzo
9d6bcd7705
avoid clown shoes: only walk links for tracking in implicit writes/refs
2021-07-04 18:38:28 +03:00
vyzo
484dfaebce
reuse cidset across all walks when flushing pending writes
2021-07-04 18:38:28 +03:00
vyzo
1d41e1544a
optimize transitive write tracking a bit
2021-07-04 18:38:28 +03:00
vyzo
da00fc66ee
downgrade a couple of logs to warnings
2021-07-04 18:38:28 +03:00
vyzo
4071488ef2
first write, then track
2021-07-04 18:38:28 +03:00
vyzo
bd92c230da
refactor txn reference tracking, do deep marking of DAGs
2021-07-04 18:38:28 +03:00
vyzo
a98a062347
do the dag walk for deep write tracking during flush
...
avoid grinding everything to a halt
2021-07-04 18:38:28 +03:00
vyzo
13a674330f
add pending write check before tracking the object in Has
2021-07-04 18:38:28 +03:00
vyzo
982867317e
transitively track dags from implicit writes in Has
2021-07-04 18:38:28 +03:00
vyzo
4de0cd9fcb
move write log back to flush so that we don't crawl to a halt
2021-07-04 18:38:28 +03:00
vyzo
b3ddaa5f02
fix panic at startup
...
genesis is written (!) before starting the splitstore, so curTs is nil
2021-07-04 18:38:28 +03:00
vyzo
2faa4aa993
debug log writes at track so that we get correct stack traces
2021-07-04 18:38:28 +03:00
vyzo
aeaa59d4b5
move comments about tracking perf issues into a more pertinent place
2021-07-04 18:38:28 +03:00
vyzo
3e8e9273ca
track all writes using async batching, not just implicit ones
2021-07-04 18:38:28 +03:00
vyzo
d0bfe421b5
flush implicit writes at the right time before starting compaction to avoid races
2021-07-04 18:38:28 +03:00
vyzo
7f473f56eb
flush implicit writes before starting compaction
2021-07-04 18:38:28 +03:00
vyzo
a29947d47c
flush implicit writes in all paths in updateWriteEpoch
2021-07-04 18:38:28 +03:00
vyzo
be6cc2c3e6
batch implicit write tracking
...
bolt performance leaves something to be desired; doing a single Put takes 10ms, about the same time
as batching thousands of them.
2021-07-04 18:38:28 +03:00
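Given that a single bolt Put costs roughly as much as a batched write of thousands of entries, tracking is funneled through an asynchronous batcher. A minimal sketch of the pattern, assuming a hypothetical TrackingStore with a PutBatch method rather than the actual lotus types:

    package splitstore

    import "github.com/ipfs/go-cid"

    // TrackingStore is a stand-in for the write-epoch tracking store;
    // PutBatch is assumed to persist many cids at one epoch in a single
    // transaction.
    type TrackingStore interface {
        PutBatch(cids []cid.Cid, epoch int64) error
    }

    type batchTracker struct {
        ch chan cid.Cid
    }

    // newBatchTracker starts a background goroutine that accumulates cids
    // and flushes them to the tracking store in batches of batchSize.
    func newBatchTracker(ts TrackingStore, epoch func() int64, batchSize int) *batchTracker {
        bt := &batchTracker{ch: make(chan cid.Cid, batchSize)}
        go func() {
            batch := make([]cid.Cid, 0, batchSize)
            flush := func() {
                if len(batch) == 0 {
                    return
                }
                _ = ts.PutBatch(batch, epoch()) // a real implementation would log errors
                batch = batch[:0]
            }
            for c := range bt.ch {
                batch = append(batch, c)
                if len(batch) == batchSize {
                    flush()
                }
            }
            flush() // drain the tail once the channel is closed
        }()
        return bt
    }

    // Track queues a cid for asynchronous batched tracking.
    func (bt *batchTracker) Track(c cid.Cid) { bt.ch <- c }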
vyzo
e472cacb3e
add missing return
2021-07-04 18:38:28 +03:00
vyzo
6a3cbea790
treat Has as an implicit Write
...
Rationale: the VM uses the Has check to avoid issuing a duplicate Write in the blockstore.
This means that live objects that would otherwise be written are not actually written, so the
first write epoch remains the object's tracked write epoch.
2021-07-04 18:38:28 +03:00
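Because the VM uses a positive Has to skip a duplicate Put, the splitstore refreshes tracking on Has as if the object had just been written. A condensed sketch of the idea with stand-in types; trackWrite is a hypothetical helper, not the actual lotus code:

    package splitstore

    import "github.com/ipfs/go-cid"

    // hotStore is a stand-in for the hot blockstore's Has check.
    type hotStore interface {
        Has(c cid.Cid) (bool, error)
    }

    type splitStore struct {
        hot        hotStore
        trackWrite func(c cid.Cid) // records c at the current write epoch
    }

    // Has treats a positive answer as an implicit write: since the VM
    // elides the duplicate Put, the object would otherwise keep its stale
    // first write epoch.
    func (s *splitStore) Has(c cid.Cid) (bool, error) {
        has, err := s.hot.Has(c)
        if err != nil {
            return false, err
        }
        if has {
            s.trackWrite(c)
        }
        return has, nil
    }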
vyzo
f97535d87e
store the hash in map markset
2021-07-04 18:38:28 +03:00
vyzo
90dc274113
better logging for chain walk
2021-07-04 18:38:28 +03:00
vyzo
40f42db7fa
walk tweaks
2021-07-04 18:38:28 +03:00
vyzo
09efed50fd
check for lookback references to block headers in walk
2021-07-04 18:38:28 +03:00
vyzo
7de0771883
count txn live objects explicitly for logging
2021-07-04 18:38:28 +03:00
vyzo
e29b64c5de
check both markset and txn liveset before declaring an object cold
2021-07-04 18:38:28 +03:00
vyzo
4bed3161f0
fix broken purge count log
2021-07-04 18:38:28 +03:00
vyzo
7307eb54dc
cache stack repr computation
2021-07-04 18:38:28 +03:00
vyzo
57e25ae1cd
use succinct timestamps in debug logs
2021-07-04 18:38:28 +03:00
vyzo
b2b13bbe89
fix debug panic
2021-07-04 18:38:28 +03:00
vyzo
0b315e97c8
fix index out of range
2021-07-04 18:38:28 +03:00
vyzo
dec61fa333
deduplicate stack logs and optionally trace write stacks
2021-07-04 18:38:28 +03:00
vyzo
7ebef6d838
better log message
2021-07-04 18:38:28 +03:00
vyzo
40ff5bf164
log put errors in splitstore log
2021-07-04 18:38:28 +03:00
vyzo
9fda61abec
fix error check for unreachable cids
2021-07-04 18:38:28 +03:00
vyzo
4a71c68e06
move code around for better readability
2021-07-04 18:38:28 +03:00
vyzo
31497f4bd3
use internal get during walk to avoid blowing the compaction txn
...
otherwise the walk itself precludes purge... duh!
2021-07-04 18:38:28 +03:00
vyzo
6af3a23dd4
use a map for txn protection mark set
2021-07-04 18:38:28 +03:00
vyzo
65ccc99e79
minor tweaks in purge
...
- allocate once
- log purge count
2021-07-04 18:38:28 +03:00
vyzo
cb665d07e0
fix transactional race during compaction
...
It is possible for an object to be written or recreated (and checked with Has)
after the mark completes and during the purge; if this happens we will purge
a live block.
2021-07-04 18:38:28 +03:00
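The race is closed by a compaction-scoped protect set: anything written or Has-checked after marking completes joins it, and the purge refuses to delete anything it contains. A rough sketch with hypothetical names:

    package splitstore

    import (
        "sync"

        "github.com/ipfs/go-cid"
    )

    // txnProtect is the transactional protect set kept for the duration of
    // a compaction: objects written or Has-checked after the mark phase are
    // recorded here so the purge cannot delete them.
    type txnProtect struct {
        mu   sync.Mutex
        live map[cid.Cid]struct{}
    }

    func newTxnProtect() *txnProtect {
        return &txnProtect{live: make(map[cid.Cid]struct{})}
    }

    // Protect records c as live for the remainder of the compaction.
    func (t *txnProtect) Protect(c cid.Cid) {
        t.mu.Lock()
        t.live[c] = struct{}{}
        t.mu.Unlock()
    }

    // IsLive reports whether c was protected during the compaction.
    func (t *txnProtect) IsLive(c cid.Cid) bool {
        t.mu.Lock()
        _, ok := t.live[c]
        t.mu.Unlock()
        return ok
    }

    // coldCandidate declares c cold only if neither the mark set nor the
    // protect set consider it live ("check both markset and txn liveset
    // before declaring an object cold").
    func coldCandidate(marked func(cid.Cid) bool, txn *txnProtect, c cid.Cid) bool {
        return !marked(c) && !txn.IsLive(c)
    }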
vyzo
50ebaf25aa
don't log read misses before warmup
2021-07-04 18:38:28 +03:00
vyzo
375a1790e7
reset counters after flush
2021-07-04 18:38:28 +03:00
vyzo
b187b5c301
fix lint
2021-07-04 18:38:28 +03:00
vyzo
a53c4e1597
implement debug log
2021-07-04 18:38:28 +03:00
vyzo
fce7b8dc9b
flush move log when cold collection is done
2021-07-04 18:38:28 +03:00
vyzo
fc247e4223
add debug log skeleton
2021-07-04 18:38:28 +03:00
vyzo
0390285c4e
always do full walks, not only when there is a sync gap
2021-07-04 18:38:28 +03:00
vyzo
30dbe4978b
adjust compaction range
2021-07-04 18:38:28 +03:00
vyzo
a21f55919b
CompactionThreshold should be 4 finalities
...
otherwise we'll wear clown shoes with the slack and end up in continuous compaction.
2021-07-04 18:38:28 +03:00
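The arithmetic behind the threshold: compaction keeps roughly two finalities hot, so triggering again as soon as the boundary moves would compact continuously; waiting four finalities past the base leaves two finalities of slack. A sketch with illustrative constants (the real values live in the chain parameters):

    package splitstore

    // Illustrative constants; the actual chain parameters may differ.
    const (
        Finality            = 900          // epochs per finality
        CompactionBoundary  = 2 * Finality // span kept hot after compaction
        CompactionThreshold = 4 * Finality // slack of 2 finalities over the boundary
    )

    // shouldCompact triggers a compaction only once the head has advanced
    // CompactionThreshold epochs past the last compaction base, avoiding
    // back-to-back compactions.
    func shouldCompact(curEpoch, baseEpoch int64) bool {
        return curEpoch-baseEpoch > CompactionThreshold
    }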
vyzo
79d21489df
fix test
2021-07-04 18:38:28 +03:00
vyzo
a25ac80777
reintroduce compaction slack
2021-07-04 18:38:28 +03:00
vyzo
a178c1fb93
fix test
2021-07-04 18:38:28 +03:00
vyzo
c4d95de987
coalesce back-to-back compactions
...
get rid of the CompactionCold construct, run a single compaction on catch up
2021-07-04 18:38:28 +03:00
vyzo
b7897595eb
augment current epoch by +1
...
to account for off by one conditions
2021-07-04 18:38:28 +03:00
vyzo
933c786421
update write epoch in the background every second
2021-07-04 18:38:28 +03:00
vyzo
66f1630f14
fix lint issue
2021-07-04 18:38:28 +03:00
vyzo
bb17608ae0
track writeEpoch relative to current wall clock time
...
The issue: head change notifications are not emitted until after catching up,
which results in all writes during a catch up period being tracked at the base epoch.
2021-07-04 18:38:28 +03:00
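A sketch of deriving the write epoch from wall-clock time instead of head-change notifications, assuming the 30-second Filecoin block time and hypothetical parameter names; the +1 mirrors the off-by-one adjustment noted a few commits above.

    package splitstore

    import "time"

    const blockTime = 30 * time.Second // Filecoin epoch duration

    // writeEpoch estimates the current epoch from the last known tipset's
    // epoch and timestamp, so writes made while catching up are not all
    // attributed to the stale base epoch.
    func writeEpoch(baseEpoch int64, baseTimestamp, now time.Time) int64 {
        elapsed := now.Sub(baseTimestamp)
        if elapsed < 0 {
            elapsed = 0
        }
        // +1 to account for off-by-one conditions at the epoch edge.
        return baseEpoch + int64(elapsed/blockTime) + 1
    }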
vyzo
421f05eab9
save the warm up epoch only if successful in warming up
2021-07-04 18:38:28 +03:00
vyzo
9b6448518c
refactor warmup to trigger at startup and not wait for sync
2021-07-04 18:38:28 +03:00
vyzo
3fe4261f12
don't attempt compaction while still syncing
2021-07-04 18:38:28 +03:00
vyzo
7b02673620
don't try to visit genesis parent blocks
2021-07-04 18:38:28 +03:00
vyzo
997f2c098b
keep headers hot when running with a noop splitstore
2021-07-04 18:38:28 +03:00
vyzo
7c814cd2e3
refactor genesis state loading code into its own method
2021-07-04 18:38:28 +03:00
vyzo
41573f1fb2
also walk parent message receipts when including messages in the walk
2021-07-04 18:38:28 +03:00
vyzo
fa6481401d
reduce SyncGapTime to 1 minute
...
for maximal safety.
2021-07-04 18:38:28 +03:00
vyzo
fda291b876
fix test
2021-07-04 18:38:28 +03:00
vyzo
d33a44e67f
first visit the cid, then short-circuit non dagcbor objects
2021-07-04 18:38:28 +03:00
vyzo
bdb97d6186
more robust handling of sync gap walks
2021-07-04 18:38:28 +03:00
vyzo
7cf75e667d
keep genesis-linked state hot
2021-07-04 18:38:28 +03:00
vyzo
e9f531b4aa
don't open bolt tracking store with NoSync, it might get corrupted
2021-07-04 18:38:28 +03:00
Raúl Kripalani
b2b7eb2ded
metrics: increment misses in View().
2021-07-04 18:38:28 +03:00
vyzo
3a9b7c592d
mark from current epoch to boundary epoch when necessary
...
this is necessary to avoid wearing clown shoes when the node stays
offline for an extended period of time (more than 1 finality).
Basically it gets quite slow if we do the full 2 finality walk, so we
try to avoid it unless necessary.
A full walk is only necessary if there is a
sync gap (most likely because the node was offline), during which write
tracking is inaccurate because we have not yet delivered the
HeadChange notification. In this case, actually hot blocks may be
tracked before the boundary and fail to be marked accordingly.
So when we detect a sync gap, we do the full walk;
if there is no sync gap, we can just use the much faster boundary
epoch walk.
2021-07-04 18:38:28 +03:00
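A condensed sketch of that decision with hypothetical names: without a sync gap, writes above the boundary are already tracked, so walking the chain at the boundary epoch suffices; with a sync gap, the full span from the current epoch down to the boundary must be walked.

    package splitstore

    import "time"

    const SyncGapTime = time.Minute // a write gap longer than this implies a sync gap

    // markWalkRange picks which epochs the marking walk covers: just the
    // boundary epoch in the common case, or the full (slow) span from the
    // current epoch down to the boundary when write tracking cannot be
    // trusted because the node was offline.
    func markWalkRange(curEpoch, boundaryEpoch int64, lastWrite, now time.Time) (from, to int64) {
        if now.Sub(lastWrite) > SyncGapTime {
            return curEpoch, boundaryEpoch // full walk
        }
        return boundaryEpoch, boundaryEpoch // fast boundary-only walk
    }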
vyzo
d7ceef104e
decrease CompactionThreshold to 3 finalities
2021-07-04 18:38:28 +03:00
vyzo
e3cbeec6ee
implement chain walking
2021-07-04 18:38:28 +03:00
vyzo
04f2e102a1
kill full splitstore compaction, simplify splitstore configuration
2021-07-04 18:38:28 +03:00
vyzo
4d3c73f4ca
noop blockstore
2021-07-04 18:38:28 +03:00
Steven Allen
f353a794cb
fix: spelling
...
Co-authored-by: Aayush Rajasekaran <arajasek94@gmail.com>
2021-04-28 22:08:37 -07:00
Steven Allen
63db9e1633
fix(splitstore): fix a panic on revert-only head changes
...
Calling, e.g., `lotus chain sethead` on an ancestor tipset won't apply
any new blocks, it'll just revert a bunch. This will lead to HeadChange
calls with no new blocks to apply.
fixes #6125
2021-04-28 20:35:30 -07:00
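A minimal sketch of the guard, with stand-in types: a head change that only reverts tipsets carries an empty apply slice, and indexing it unconditionally is the kind of access that panicked.

    package splitstore

    // TipSet is a stand-in for the chain tipset type.
    type TipSet struct {
        Height int64
    }

    type splitStoreState struct {
        curTs *TipSet
    }

    // HeadChange handles a head-change notification. `lotus chain sethead`
    // on an ancestor tipset reverts blocks without applying any, so apply
    // may be empty and must be checked before indexing.
    func (s *splitStoreState) HeadChange(revert, apply []*TipSet) error {
        if len(apply) == 0 {
            return nil // revert-only change: keep the current head
        }
        s.curTs = apply[len(apply)-1]
        // ... compaction trigger checks would follow here
        return nil
    }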
Peter Rabbitson
a5fd552a0c
CachedBlockstore is not referenced as of 3795cc2bd2
2021-04-06 13:24:49 +02:00
Łukasz Magiera
5f80869fe0
Merge pull request #5794 from filecoin-project/fix/atomic-first
...
fix: make sure atomic 64bit fields are 64bit aligned
2021-03-25 13:29:59 +01:00
Steven Allen
25725110e7
Merge pull request #5792 from filecoin-project/fix/timed-cache-locking
...
fix: avoid holding a lock while calling the View callback
2021-03-12 09:25:00 -08:00
Łukasz Magiera
c69b26cfc6
Merge pull request #5778 from filecoin-project/feat/splitstore-compact-hotstore
...
splitstore: compact hotstore prior to garbage collection
2021-03-12 16:30:05 +01:00
Steven Allen
bba71da401
fix: return buffers after canceling badger operation
...
In theory, Delete/Put could fail. If it does, we'll return the buffers
to the pool before we're really done with them.
In practice, this is almost certainly not an issue as badger shouldn't
_use_ the buffer unless we flush. But I feel slightly safer this way.
2021-03-11 20:30:43 -08:00
Steven Allen
1c490f3fda
fix: make sure atomic 64bit fields are 64bit aligned
...
Otherwise, this won't work on 32bit ARM.
2021-03-11 20:10:39 -08:00
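Go only guarantees 64-bit alignment for the first word in an allocated struct on 32-bit platforms, so fields touched with sync/atomic have to be laid out first (or otherwise kept 8-byte aligned). A small illustration:

    package splitstore

    import "sync/atomic"

    // counters keeps the atomically accessed 64-bit fields at the start of
    // the struct so they stay 8-byte aligned on 32-bit platforms; placing
    // them after the bool could misalign them and make atomic.AddUint64
    // fault on 32-bit ARM.
    type counters struct {
        misses uint64 // accessed atomically
        hits   uint64 // accessed atomically

        warm bool // non-atomic state can follow
    }

    func (c *counters) recordMiss() uint64 {
        return atomic.AddUint64(&c.misses, 1)
    }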
Steven Allen
a888ea0d1f
fix: avoid holding a lock while calling the View callback
...
Interleaved puts/views could get really slow and there's no real reason
to use view under the covers here because the underlying blockstore is
always a "memstore".
2021-03-11 20:03:38 -08:00
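The pattern is to copy the lookup out from under the lock and run the callback only after releasing it, so slow callbacks cannot stall concurrent puts. A generic sketch over a simple in-memory map:

    package splitstore

    import (
        "errors"
        "sync"

        "github.com/ipfs/go-cid"
    )

    var errNotFound = errors.New("block not found")

    type memStore struct {
        mu     sync.RWMutex
        blocks map[cid.Cid][]byte
    }

    // View fetches the data under RLock, releases the lock, and only then
    // invokes the callback, so interleaved Puts are never blocked by a slow
    // or re-entrant callback.
    func (m *memStore) View(c cid.Cid, f func([]byte) error) error {
        m.mu.RLock()
        data, ok := m.blocks[c]
        m.mu.RUnlock()

        if !ok {
            return errNotFound
        }
        return f(data)
    }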
vyzo
1b1d3606cd
make linter happy
2021-03-11 13:10:44 +02:00
vyzo
353bb1881f
compact hotstore if it provides the method
2021-03-11 11:45:19 +02:00
vyzo
01ce9b5c44
add Compact to badger blockstore
2021-03-11 11:45:05 +02:00
vyzo
ae6410d02f
use compacting atomic to make the test deterministic
2021-03-09 09:05:36 +02:00
Steven Allen
6d2e8d721d
test: attempt to make the splitstore test deterministic
...
At a minimum, make it thread-safe.
2021-03-08 16:36:25 -08:00
vyzo
90741da019
tune badger gc to repeatedly gc the value log until there is no rewrite
2021-03-08 21:46:44 +02:00
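Badger's RunValueLogGC rewrites at most one value log file per call and returns badger.ErrNoRewrite once a pass reclaims nothing, so it is simply looped until that sentinel appears. A sketch assuming a *badger.DB handle; the discard ratio is illustrative:

    package splitstore

    import badger "github.com/dgraph-io/badger/v2"

    // CollectGarbage repeatedly GCs the badger value log until a pass
    // rewrites nothing; ErrNoRewrite is the normal terminating condition,
    // not a failure.
    func CollectGarbage(db *badger.DB) error {
        for {
            err := db.RunValueLogGC(0.125)
            if err == badger.ErrNoRewrite {
                return nil
            }
            if err != nil {
                return err
            }
        }
    }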
vyzo
3bd77701d8
deduplicate code
2021-03-08 19:46:21 +02:00
vyzo
3d1b855f20
rename GC to CollectGarbage, ignore badger.ErrNoRewrite
2021-03-08 19:42:38 +02:00
vyzo
52de95d344
also gc in compactFull, not just compactSimple
2021-03-08 18:30:09 +02:00
vyzo
8562a9bb82
garbage collect hotstore after compaction
2021-03-08 18:12:09 +02:00
vyzo
e85391b46c
quiet stupid linter
2021-03-05 20:05:32 +02:00
vyzo
09f5ba177a
add splitstore unit test
2021-03-05 19:55:32 +02:00
vyzo
0a2f2cf00d
use the right condition for triggering the miss metric
2021-03-05 14:48:59 +02:00
vyzo
2b32c2e597
add some metrics
2021-03-05 14:48:57 +02:00
vyzo
99d21573da
remove DEBUG log spam
2021-03-05 14:46:18 +02:00
vyzo
c58df3f079
don't panic on compaction errors
2021-03-05 14:46:18 +02:00
vyzo
9bd009d795
use atomics to demarcate critical section and limit close delay
2021-03-05 14:46:18 +02:00
vyzo
17be7d3919
save markSetSize
2021-03-05 14:46:18 +02:00
vyzo
aff0f1ed4c
deduplicate code for batch deletion
2021-03-05 14:46:18 +02:00
vyzo
5fb6a907cb
fix loop condition in batch deletion
2021-03-05 14:46:18 +02:00
vyzo
98a7b884fe
implement DeleteMany in union blockstore
2021-03-05 14:46:18 +02:00
vyzo
fdd877534f
walk at boundary epoch, 2 finalities from current epoch, to find live objects
...
objects written after that are retained anyway.
2021-03-05 14:46:18 +02:00
vyzo
508fcb9d26
properly close snoop at shutdown
2021-03-05 14:46:18 +02:00
vyzo
47d8c87486
fix log
2021-03-05 14:46:18 +02:00
vyzo
11b2f41804
overestimate markSetSize a bit
2021-03-05 14:46:18 +02:00
vyzo
6b680d112b
do tracker purge in smaller batches
2021-03-05 14:46:18 +02:00
vyzo
d2d0980532
don't delete in one giant batch, use smaller chunks of batchSize
2021-03-05 14:46:18 +02:00
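Deleting the cold set in bounded chunks keeps each underlying delete transaction small. A generic sketch with a hypothetical batch-delete callback:

    package splitstore

    import "github.com/ipfs/go-cid"

    // deleteChunked removes cids in chunks of batchSize rather than one
    // giant batch, bounding the size of each delete transaction.
    func deleteChunked(deleteBatch func([]cid.Cid) error, cids []cid.Cid, batchSize int) error {
        for len(cids) > 0 {
            n := batchSize
            if n > len(cids) {
                n = len(cids)
            }
            if err := deleteBatch(cids[:n]); err != nil {
                return err
            }
            cids = cids[n:]
        }
        return nil
    }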
vyzo
70ebb2ad8d
improve startup log
2021-03-05 14:46:18 +02:00
vyzo
006c55a7c9
add startup log
2021-03-05 14:46:18 +02:00
vyzo
06d8ea10b1
batch delete during the cold purge
2021-03-05 14:46:18 +02:00
vyzo
4c05ec28ba
fix FromDatastore to not do double adapting
2021-03-05 14:46:18 +02:00
vyzo
ab52e34e6a
add comment
...
Co-authored-by: raulk <raul@protocol.ai>
2021-03-05 14:46:18 +02:00
vyzo
86fdad2e31
fix typo
...
Co-authored-by: raulk <raul@protocol.ai>
2021-03-05 14:46:18 +02:00
vyzo
2ff5aec80e
satisfy linter, use Prefix for common path of non inline CIDs
2021-03-05 14:46:18 +02:00
vyzo
8a55b73146
fix the situation with WrapIDStore
2021-03-05 14:46:18 +02:00
vyzo
86b73d651e
add DeleteMany to Blockstore interface
2021-03-05 14:46:18 +02:00
vyzo
c762536dcb
deduplicate code
2021-03-05 14:46:18 +02:00
vyzo
5184bc5c40
log consistency for full compaction
2021-03-05 14:46:18 +02:00
vyzo
68213a92cb
use ioutil.TempDir for test directories
2021-03-05 14:46:18 +02:00
vyzo
35d466d847
use sha256 for bloom key rehashing
2021-03-05 14:46:18 +02:00
vyzo
f651f43c5e
improve comment accuracy
2021-03-05 14:46:18 +02:00
Raúl Kripalani
4b1e1f4b52
rename liveset => markset; rename snoop => tracking store; docs.
2021-03-05 14:46:18 +02:00
vyzo
48f253328d
increase batch size to 16K
2021-03-05 14:46:18 +02:00
vyzo
ce68b9b229
batch writes during warm up
2021-03-05 14:46:18 +02:00
Raúl Kripalani
8cfba5b092
renames and polish.
2021-03-05 14:46:18 +02:00
Raúl Kripalani
b1b452bc0f
remove dependency from blockstore/splitstore => chain/store.
2021-03-05 14:46:18 +02:00
vyzo
b9400c590f
use crypto/rand for bloom salt
2021-03-05 14:46:18 +02:00
vyzo
e612fff1fe
also estimate liveset size during warm up
2021-03-05 14:46:18 +02:00
vyzo
748dd962d8
snake current tipset from head change notification
2021-03-05 14:46:18 +02:00
vyzo
cb36d5b6a4
warm up splitstore at first head change notification
2021-03-05 14:46:18 +02:00
Raúl Kripalani
1a804fbdec
move splitstore into blockstore package.
2021-03-05 14:46:18 +02:00
Raúl Kripalani
68b8e8e9cb
implement unionBlockstore#HashOnRead.
2021-03-02 21:22:24 +00:00
Raúl Kripalani
2047a74958
implement blockstore.Union, a union blockstore.
...
The union blockstore takes a list of blockstores. It returns the first
satisfying read, and broadcasts writes to all stores.
It can be used for operations that require reading from any two blockstores,
for example WalkSnapshot.
2021-03-02 17:03:11 +00:00
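A condensed sketch of those semantics over a minimal stand-in interface rather than the full Blockstore API: reads return the first hit, writes are broadcast to every member.

    package blockstore

    import (
        "errors"

        blocks "github.com/ipfs/go-block-format"
        "github.com/ipfs/go-cid"
    )

    var errBlockNotFound = errors.New("block not found in any member store")

    // member is a minimal stand-in for the real Blockstore interface.
    type member interface {
        Get(c cid.Cid) (blocks.Block, error)
        Put(b blocks.Block) error
    }

    // unionStore returns the first satisfying read and broadcasts writes to
    // all member stores.
    type unionStore []member

    func (u unionStore) Get(c cid.Cid) (blocks.Block, error) {
        for _, s := range u {
            if b, err := s.Get(c); err == nil {
                return b, nil
            }
        }
        return nil, errBlockNotFound
    }

    func (u unionStore) Put(b blocks.Block) error {
        for _, s := range u {
            if err := s.Put(b); err != nil {
                return err
            }
        }
        return nil
    }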
Raúl Kripalani
b34b4e0374
fix test compilation error.
2021-02-28 23:10:01 +00:00
Raúl Kripalani
3795cc2bd2
segregate chain and state blockstores.
...
This paves the way for better object lifetime management.
Concretely, it makes it possible to:
- have different stores backing chain and state data.
- having the same datastore library, but using different parameters.
- attach different caching layers/policies to each class of data, e.g.
sizing caches differently.
- specifying different retention policies for chain and state data.
This separation is important because:
- access patterns/frequency of chain and state data are different.
- state is derivable from chain, so one could never expunge the chain
store, and only retain state objects reachable from the last finality
in the state store.
2021-02-28 22:49:44 +00:00
Raúl Kripalani
853de3daf7
fix TimedCacheBlockstore#View.
2021-02-28 22:39:00 +00:00
Raúl Kripalani
45a650c012
remove unnecessary View casting.
2021-02-28 22:20:29 +00:00
Raúl Kripalani
7f0f7d0b36
Merge branch 'master' into refactor/lib/blockstore
2021-02-28 19:55:23 +00:00
Raúl Kripalani
8601e5da3a
address review comments.
2021-02-28 19:44:02 +00:00
Raúl Kripalani
35d1e3d1e0
refine docs.
2021-01-29 23:24:44 +00:00
Raúl Kripalani
d1104fec4c
rename blockstores for consistency.
2021-01-29 23:17:25 +00:00
Raúl Kripalani
b0cbc932bd
consolidate all blockstores in blockstore package.
2021-01-29 20:01:00 +00:00