Commit Graph

540 Commits

Author SHA1 Message Date
vyzo
f75d982c76 remove early occurs check from trackTxnRef
this happens inline, and it might block when using the badger markset
2021-07-23 12:47:18 +03:00
vyzo
2891a31c99 make badger markset concurrent close safe 2021-07-23 12:47:18 +03:00
vyzo
f2b7c3e6f2 reduce scope of exclusive lock in badger markset 2021-07-23 12:47:18 +03:00
vyzo
12c3432b8d document the "badger" markset type option 2021-07-23 12:47:18 +03:00
vyzo
be9530b66e finetune badger options 2021-07-23 12:47:18 +03:00
vyzo
2c26abc551 add badger markset test 2021-07-23 12:47:18 +03:00
vyzo
54a6968327 add badger-backed markset implementation 2021-07-23 12:47:18 +03:00
vyzo
5266b240b8 coalesce message and message receipt retention 2021-07-22 22:02:29 +03:00
vyzo
2a68ae8dad fix warmup by decoupling state from message receipt walk 2021-07-22 20:49:36 +03:00
Łukasz Magiera
da5aeda197
Merge branch 'master' into feat/splitstore-hot-messages 2021-07-22 12:58:06 +02:00
vyzo
c47fce8d2d test protector support 2021-07-20 09:02:45 +03:00
vyzo
ebbaf23af8 support out-of-chain reference protection 2021-07-20 09:02:40 +03:00
vyzo
006050ed27 implement hotstore message retention policy 2021-07-17 08:59:43 +03:00
vyzo
1b77361301 add option for hotstore message retention 2021-07-17 08:35:35 +03:00
vyzo
e003203bea implement exposed splitstore 2021-07-15 13:12:10 +03:00
vyzo
5a23f64b3b code reorg: break splitstore.go into smaller logical units 2021-07-14 13:11:15 -07:00
vyzo
3f3a12b75c remove BlockstoreMover interface
we decided it's premature
2021-07-14 22:59:53 +03:00
vyzo
023146803d use Broadcast for view barrier 2021-07-14 22:59:53 +03:00
vyzo
3d77ae1f4d make trackTxnRefMany consistent with trackTxnRef 2021-07-14 22:59:53 +03:00
vyzo
6f126c80bf remove redundant log, more descriptive error message for closing condition 2021-07-14 22:59:53 +03:00
vyzo
ff093fae00 use a missing compactionIndex as an indicator for warmup
so that splitstore v0 nodes upgrading will get a fresh warmup.
2021-07-14 22:59:53 +03:00
vyzo
669b47cfc9 do moving gc for hotstore every 20 compactions
that's about once a week
2021-07-14 22:59:53 +03:00
vyzo
818b8de182 keep track of the compaction serial (index)
it is useful so that:
- we only do slow (but very effective) moving gc every 10 compactions
- we can detect a splitstore v0 upgrade and re-warm up
2021-07-14 22:59:53 +03:00
vyzo
c93328b036 use the new traits for hotstore gc 2021-07-14 22:59:52 +03:00
vyzo
35180b4761 merge Compact and CollectGarbage in badger 2021-07-14 22:59:52 +03:00
vyzo
dc81c0e6a2 add blockstore traits related to gc 2021-07-14 22:59:52 +03:00
vyzo
af399529ec finetune view waiting 2021-07-13 09:06:40 +03:00
vyzo
257423e917 fix view waiting issues with the WaitGroup
We can add after Wait is called, which is problematic with WaitGroups.
This instead uses a mx/cond combo and waits while the count is > 0.
The only downside is that we might needlessly wait for (a bunch) of views
that started while the txn is active, but we can live with that.
2021-07-13 09:01:50 +03:00
Steven Allen
04abd190ab nit: remove useless goto
Because stebalien has allergies.
2021-07-12 21:46:50 -07:00
vyzo
60212c86cb put a mutex around HeadChange 2021-07-13 03:14:13 +03:00
vyzo
759594d01c always return the waitgroup in protectView
so that we preclude the following scenario:
    Start compaction.
    Start view.
    Finish compaction.
    Start compaction.

which would not wait for the view to complete.
2021-07-13 03:11:40 +03:00
vyzo
df9670c58d fix lint 2021-07-10 16:38:40 +03:00
vyzo
0c5e336ff1 address review comments 2021-07-10 16:30:27 +03:00
vyzo
870a47f55d handle id cids in internal versions of view/get 2021-07-09 20:07:17 +03:00
vyzo
f5ae10e3d1 refactor debug log code to eliminate duplication 2021-07-09 19:53:51 +03:00
vyzo
41290383e2 fix test 2021-07-09 19:24:44 +03:00
vyzo
b9a5ea8f7b update wording around discard store 2021-07-09 19:23:55 +03:00
vyzo
c0a1cfffa1 rename noopstore to discardstore 2021-07-09 19:19:37 +03:00
vyzo
18161fee38 remove unused lookback constructs 2021-07-09 19:12:58 +03:00
vyzo
095d7427ba make view protection optimistic again, as there is a race window 2021-07-09 15:41:10 +03:00
vyzo
da0feb3fa4 don't mark references inline; instead rely on the main compaction thread to do concurrent marking
The problem is that it is possible that an inline marking might take minutes for some objects
(infrequent, but still possible for state roots and prohibitive if that's a block validation).
So we simply track references continuously and rely on the main compaction thread to trigger
concurrent marking for all references at opportune moments.

Assumption: we can mark references faster than they are created during purge or else we'll
never purge anything.
2021-07-09 15:10:02 +03:00
vyzo
acc4c374ef properly handle protecting long-running views 2021-07-09 13:20:18 +03:00
vyzo
565faff754 fix test 2021-07-09 11:38:09 +03:00
vyzo
4f89d260b0 kill isOldBlockHeader; it's dangerous. 2021-07-09 11:35:10 +03:00
vyzo
de5e21bf1a correctly handle identity cids 2021-07-09 11:31:04 +03:00
vyzo
909f7039d4 make badger Close-safe 2021-07-09 09:54:12 +03:00
vyzo
abdf4a161a explicitly switch marksets for concurrent marking
this has very noticeable impact in initial marking time; it also allows us
to get rid of the confusing ts monikers.
2021-07-09 04:26:36 +03:00
vyzo
b6611125b6 add environment variables to turn on the debug log without recompiling 2021-07-08 21:30:39 +03:00
vyzo
60dd97c7fc fix potential deadlock in View
As pointed out by magik, it is possible to deadlock if the view callback performs
a blockstore operation while a Lock is pending.
This fixes the issue by optimistically tracking the reference before actually calling
the underlying View and limiting the scope of the lock.
2021-07-08 21:18:59 +03:00
vyzo
c0537848b3
fix typo
Co-authored-by: Łukasz Magiera <magik6k@users.noreply.github.com>
2021-07-08 17:54:16 +03:00
vyzo
fa30ac8c5d
fix typo
Co-authored-by: Łukasz Magiera <magik6k@users.noreply.github.com>
2021-07-08 17:53:59 +03:00
vyzo
00d7772f57 move check for closure in walkChain
so that we don't do it too often and also cover warmup.
2021-07-08 13:12:19 +03:00
vyzo
5cf1e09e81 README: add instructions for how to enable 2021-07-08 13:00:31 +03:00
vyzo
9aa4f3b3b2 add README for documentation 2021-07-08 12:32:41 +03:00
vyzo
e6eacbdd56 use RW mutexes in marksets 2021-07-08 10:20:29 +03:00
vyzo
48f13a43b7 intelligently close marksets and signal errors in concurrent operations 2021-07-08 10:18:43 +03:00
vyzo
f5c45bd517 check the closing state variable often
so that we have a reasonably quick graceful shutdown
2021-07-08 10:13:44 +03:00
vyzo
4f808367f8 fix lint 2021-07-07 21:32:58 +03:00
vyzo
fee50b13a2 check the closing state on each batch during the purge. 2021-07-07 21:32:05 +03:00
vyzo
c6421f8a75 don't nil the mark sets on close, it's dangerous.
a concurrent marking can panic.
2021-07-07 21:27:36 +03:00
vyzo
aec2ba2c82 nil map/bf on markset close 2021-07-07 16:46:14 +03:00
vyzo
451ddf50ab RIP bbolt-backed markset 2021-07-07 16:39:37 +03:00
vyzo
9dbb2e0abd don't leak tracking errors through the API 2021-07-07 16:34:02 +03:00
vyzo
83c30dc4c0 protect assignment of warmup epoch with the mutex 2021-07-07 11:31:27 +03:00
vyzo
6cc2112749 remove the curTs state variable; we don't need it 2021-07-07 09:55:25 +03:00
vyzo
05dbbe9681 rename some Txn methods for better readability 2021-07-07 09:52:31 +03:00
vyzo
90da6227b3 transactional protect incoming tipsets 2021-07-07 02:11:37 +03:00
vyzo
0e2af11f6a prepare the transaction before launching the compaction goroutine 2021-07-07 01:39:58 +03:00
vyzo
f2f4af669d clean up: simplify debug log, get rid of ugly debug log 2021-07-06 17:13:38 +03:00
vyzo
c1c25868cc improve comments 2021-07-06 15:09:04 +03:00
vyzo
fdff1bebc9 move map markset implementation to its own file 2021-07-06 14:44:40 +03:00
vyzo
5c514504f7 remove unused GetGenesis method from ChainAccessor interface 2021-07-06 14:41:41 +03:00
vyzo
dc8139a1d2 add some comments for debug only code 2021-07-06 13:23:12 +03:00
vyzo
c4ae3e0c3d minor tweak 2021-07-06 09:17:35 +03:00
vyzo
169ab262f5 really optimize computing object weights
sort is still taking a long time, this should be as fast as it gets.
2021-07-06 09:02:44 +03:00
vyzo
55a9e0ccd1 short-circuit block headers on sort weight computation 2021-07-06 08:22:43 +03:00
vyzo
bf7aeb3167 optimize sort a tad
it's taking a long time to compute weights...
2021-07-06 08:10:57 +03:00
vyzo
0659235e21 cache cid strings in sort
so as to avoid making a gazillion of strings
2021-07-06 07:26:13 +03:00
vyzo
525a2c71dd use hashes as keys in weight map to avoid duplicate work
otherwise the root object will be raw, but internal references will be dag; duplicate work.
2021-07-06 01:27:56 +03:00
vyzo
c6ad8fdaed use walkObjectRaw for computing object weights
cids that come out of the hotstore with ForEach are raw.
2021-07-06 01:08:44 +03:00
vyzo
2cbd3faf5a make sure to nil everything in txnEndProtect 2021-07-05 23:56:31 +03:00
vyzo
51ab891d5c quiet linter
it's a false positive, function doesn't escape.
2021-07-05 23:53:45 +03:00
vyzo
bd436ab9de make endTxnProtect idempotent 2021-07-05 23:51:10 +03:00
vyzo
e859942fa4 code cleanup: refactor txn state code into their own functions 2021-07-05 23:31:37 +03:00
vyzo
3477d265c6 unify the two marksets
really, it's concurrent marking and there is no reason to have two different marksets
2021-07-05 20:10:47 +03:00
vyzo
73d07999bf don't needlessly wait 1 min in first retry for missing refs 2021-07-05 18:24:48 +03:00
vyzo
af8cf712be handle all missing refs together
so that we wait 6min at most, not 12.
2021-07-05 18:16:54 +03:00
vyzo
5a099b7d05 more commentary on the missing refs situation 2021-07-05 16:12:17 +03:00
vyzo
59639a0788 reinstate some better code for handling missing references. 2021-07-05 16:08:08 +03:00
vyzo
fa195bede2 get rid of ugly missing reference handling code
those missing objects don't seem to ever get there, are they from an abandoned fork?
2021-07-05 14:29:55 +03:00
vyzo
59936ef468 fix log 2021-07-05 13:30:31 +03:00
vyzo
0b7153be86 use internal version of has for occurs checks 2021-07-05 12:41:11 +03:00
vyzo
d8b8d75e0f readd minute delay before trying for missing objects 2021-07-05 12:38:09 +03:00
vyzo
d7709deb2b reduce memory pressure from marksets when the size is decreased 2021-07-05 11:51:22 +03:00
vyzo
3ec834b2e3 improve logs and error messages 2021-07-05 11:41:09 +03:00
vyzo
918a7ec749 a bit more fil commitment short-circuiting 2021-07-05 11:38:53 +03:00
vyzo
2ea2abc07d short-circuit fil commitments
they don't make it to the blockstore anyway
2021-07-05 11:32:52 +03:00
vyzo
839f7bd2b5 only occur check for DAGs 2021-07-05 11:11:08 +03:00
vyzo
c81ae5fc20 add some comments about the missing business and another log 2021-07-05 10:42:14 +03:00
vyzo
4c41f52828 add warning for missing objects for marking for debug purposes 2021-07-05 10:35:04 +03:00
vyzo
3597192d58 remove the sleeps and busy loop more times when waiting for missing objects 2021-07-05 10:31:47 +03:00
vyzo
1726eb993c deal with incomplete objects that need to be marked and protected
seems that something is writing DAGs before its constituents, which causes problems.
2021-07-05 10:22:52 +03:00
vyzo
db53859e7a reduce CompactionThreshold to 5 finalities
so that we run compaction every finality, once we've first compacted
2021-07-04 22:12:51 +03:00
vyzo
b08e0b7102 fix lint 2021-07-04 21:24:15 +03:00
vyzo
94efae419e reduce length of critical section
Just the purge; the rest is not critical -- e.g. it's ok if we do some duplicate copies
to the coldstore, we'll have gc soon.
2021-07-04 21:21:53 +03:00
vyzo
f33d4e79aa simplify transactional protection logic
Now that we delete objects heaviest first, we don't have to do deep walk and rescan gymnastics.
2021-07-04 20:49:39 +03:00
vyzo
40c271cda1 sort cold objects before deleting
so that we can't shoot ourselves in the foot by deleting the constituents of a DAG while it is
still in the hotstore.
2021-07-04 20:17:07 +03:00
vyzo
13d612f72f smarter trackTxnRefMany 2021-07-04 19:33:49 +03:00
vyzo
f124389b66 recursively protect all references 2021-07-04 19:21:00 +03:00
vyzo
4d286da593 fix error message 2021-07-04 18:58:39 +03:00
vyzo
680af8eb09 use deep object walking for more robust handling of transactional references 2021-07-04 18:38:28 +03:00
vyzo
1f02428225 fix lint 2021-07-04 18:38:28 +03:00
vyzo
2c7a89a1db short-circuit rescanning on block headers 2021-07-04 18:38:28 +03:00
vyzo
028a5c4942 make test do something useful again 2021-07-04 18:38:28 +03:00
vyzo
8e56fffb33 walkChain should visit the genesis state root 2021-07-04 18:38:28 +03:00
vyzo
95c3aaec9a fix test 2021-07-04 18:38:28 +03:00
vyzo
190cb18ab0 housekeeping
- remove defunct tracking store implementations
- update splitstore node config
- use mark set type config option (defaulting to mapts); a memory constrained node
  may want to use an on-disk one
2021-07-04 18:38:28 +03:00
vyzo
19d1b1f532 deal with partially written objects 2021-07-04 18:38:28 +03:00
vyzo
0a1d7b3732 fix log 2021-07-04 18:38:28 +03:00
vyzo
08cad30be2 reuse key buffer in badger ForEachKey
cid copies the bytes so it's safe
2021-07-04 18:38:28 +03:00
vyzo
eafffc1634 more efficient trackTxnRefMany 2021-07-04 18:38:28 +03:00
vyzo
36f93649ef fix panic from concurrent map writes in txnRefs 2021-07-04 18:38:28 +03:00
vyzo
6fa2cd232d simplify compaction model 2021-07-04 18:38:28 +03:00
vyzo
1f2b604c07 RIP tracking store 2021-07-04 18:38:28 +03:00
vyzo
d476a3db2c BlockstoreIterator trait with implementation for badger 2021-07-04 18:38:28 +03:00
vyzo
68a83500bc fix bug that turned candidate filtering to dead code 2021-07-04 18:38:28 +03:00
vyzo
00fcf6dd72 add staging cache to bolt tracking store 2021-07-04 18:38:28 +03:00
vyzo
642f0e4740 deal with memory pressure, don't walk under the boundary 2021-07-04 18:38:28 +03:00
vyzo
c5cf8e226b remove unnecessary code 2021-07-04 18:38:28 +03:00
vyzo
d79e4da7aa more accurate stats about mark set updates 2021-07-04 18:38:28 +03:00
vyzo
6f58fdcb22 remove vm copy context detection hack
stack tracing is slow.
2021-07-04 18:38:28 +03:00
vyzo
2b03316cd9 fix log message 2021-07-04 18:38:28 +03:00
vyzo
184d3802b6 remove dead code 2021-07-04 18:38:28 +03:00
vyzo
228a435ba7 rework tracking logic; do it lazily and far more efficiently 2021-07-04 18:38:28 +03:00
vyzo
9d6cabd18a if it's not a dag, it's not a block 2021-07-04 18:38:28 +03:00
vyzo
8157f889ce short-circuit marking walks when encountering a block and more efficient walking 2021-07-04 18:38:28 +03:00
vyzo
736d6a3c19 only treat Has as an implicit write within vm.Copy context 2021-07-04 18:38:28 +03:00
vyzo
39723bbe60 use a single map for tracking pending writes, properly track implicits 2021-07-04 18:38:28 +03:00
vyzo
5834231e58 create the transactional protect filter before walking 2021-07-04 18:38:28 +03:00
vyzo
e4bb4be855 fix some residual purge races 2021-07-04 18:38:28 +03:00
vyzo
68bc5d2291 skip moving cold blocks when running with a noop coldstore
it is a noop but it still takes (a lot of) time because it has to read all the cold blocks.
2021-07-04 18:38:28 +03:00
vyzo
b87295db93 bubble up dependent txn ref errors
This causes Has to return false if it fails to traverse/protect all links, which would cause
the vm to recompute.
2021-07-04 18:38:28 +03:00
vyzo
637fbf6c5b fix faulty if/else logic for implicit txn protection 2021-07-04 18:38:28 +03:00
vyzo
9d6bcd7705 avoid clown shoes: only walk links for tracking in implicit writes/refs 2021-07-04 18:38:28 +03:00
vyzo
484dfaebce reuse cidset across all walks when flushing pending writes 2021-07-04 18:38:28 +03:00
vyzo
1d41e1544a optimize transitive write tracking a bit 2021-07-04 18:38:28 +03:00
vyzo
da00fc66ee downgrade a couple of logs to warnings 2021-07-04 18:38:28 +03:00
vyzo
4071488ef2 first write, then track 2021-07-04 18:38:28 +03:00
vyzo
bd92c230da refactor txn reference tracking, do deep marking of DAGs 2021-07-04 18:38:28 +03:00
vyzo
a98a062347 do the dag walk for deep write tracking during flush
avoid crawling everything to a halt
2021-07-04 18:38:28 +03:00
vyzo
13a674330f add pending write check before tracking the object in Has 2021-07-04 18:38:28 +03:00
vyzo
982867317e transitively track dags from implicit writes in Has 2021-07-04 18:38:28 +03:00
vyzo
4de0cd9fcb move write log back to flush so that we don't crawl to a halt 2021-07-04 18:38:28 +03:00
vyzo
b3ddaa5f02 fix panic at startup
genesis is written (!) before starting the splitstore, so curTs is nil
2021-07-04 18:38:28 +03:00
vyzo
2faa4aa993 debug log writes at track so that we get correct stack traces 2021-07-04 18:38:28 +03:00
vyzo
aeaa59d4b5 move comments about tracking perf issues into a more pertinent place 2021-07-04 18:38:28 +03:00
vyzo
3e8e9273ca track all writes using async batching, not just implicit ones 2021-07-04 18:38:28 +03:00
vyzo
d0bfe421b5 flush implicit writes at the right time before starting compaction to avoid races 2021-07-04 18:38:28 +03:00
vyzo
7f473f56eb flush implicit writes before starting compaction 2021-07-04 18:38:28 +03:00
vyzo
a29947d47c flush implicit writes in all paths in updateWriteEpoch 2021-07-04 18:38:28 +03:00
vyzo
be6cc2c3e6 batch implicit write tracking
bolt performance leaves something to be desired; doing a single Put takes 10ms, about the same time
as batching thousands of them.
2021-07-04 18:38:28 +03:00
vyzo
e472cacb3e add missing return 2021-07-04 18:38:28 +03:00
vyzo
6a3cbea790 treat Has as an implicit Write
Rationale: the VM uses the Has check to avoid issuing a duplicate Write in the blockstore.
This means that live objects that would be otherwise written are not actually written, resulting
in the first write epoch being considered the write epoch.
2021-07-04 18:38:28 +03:00
vyzo
f97535d87e store the hash in map markset 2021-07-04 18:38:28 +03:00
vyzo
90dc274113 better logging for chain walk 2021-07-04 18:38:28 +03:00
vyzo
40f42db7fa walk tweaks 2021-07-04 18:38:28 +03:00
vyzo
09efed50fd check for lookback references to block headers in walk 2021-07-04 18:38:28 +03:00
vyzo
7de0771883 count txn live objects explicitly for logging 2021-07-04 18:38:28 +03:00
vyzo
e29b64c5de check both markset and txn liveset before declaring an object cold 2021-07-04 18:38:28 +03:00
vyzo
4bed3161f0 fix broken purge count log 2021-07-04 18:38:28 +03:00
vyzo
7307eb54dc cache stack repr computation 2021-07-04 18:38:28 +03:00
vyzo
57e25ae1cd use succinct timestamp in debug logs 2021-07-04 18:38:28 +03:00
vyzo
b2b13bbe89 fix debug panic 2021-07-04 18:38:28 +03:00
vyzo
0b315e97c8 fix index out of range 2021-07-04 18:38:28 +03:00
vyzo
dec61fa333 deduplicate stack logs and optionally trace write stacks 2021-07-04 18:38:28 +03:00
vyzo
7ebef6d838 better log message 2021-07-04 18:38:28 +03:00
vyzo
40ff5bf164 log put errors in splitstore log 2021-07-04 18:38:28 +03:00
vyzo
9fda61abec fix error check for unreachable cids 2021-07-04 18:38:28 +03:00
vyzo
4a71c68e06 move code around for better readability 2021-07-04 18:38:28 +03:00
vyzo
31497f4bd3 use internal get during walk to avoid blowing the compaction txn
otherwise the walk itself precludes purge... duh!
2021-07-04 18:38:28 +03:00
vyzo
6af3a23dd4 use a map for txn protection mark set 2021-07-04 18:38:28 +03:00
vyzo
65ccc99e79 minor tweaks in purge
- allocate once
- log purge count
2021-07-04 18:38:28 +03:00
vyzo
cb665d07e0 fix transactional race during compaction
It is possible for an object to be written or recreated (and checked with Has)
after the mark completes and during the purge; if this happens we will purge
a live block.
2021-07-04 18:38:28 +03:00
vyzo
50ebaf25aa don't log read misses before warmup 2021-07-04 18:38:28 +03:00
vyzo
375a1790e7 reset counters after flush 2021-07-04 18:38:28 +03:00
vyzo
b187b5c301 fix lint 2021-07-04 18:38:28 +03:00
vyzo
a53c4e1597 implement debug log 2021-07-04 18:38:28 +03:00
vyzo
fce7b8dc9b flush move log when cold collection is done 2021-07-04 18:38:28 +03:00
vyzo
fc247e4223 add debug log skeleton 2021-07-04 18:38:28 +03:00
vyzo
0390285c4e always do full walks, not only when there is a sync gap 2021-07-04 18:38:28 +03:00
vyzo
30dbe4978b adjust compaction range 2021-07-04 18:38:28 +03:00
vyzo
a21f55919b CompactionThreshold should be 4 finalities
otherwise we'll wear clown shoes with the slack and end up in continuous compaction.
2021-07-04 18:38:28 +03:00
vyzo
79d21489df fix test 2021-07-04 18:38:28 +03:00
vyzo
a25ac80777 reintroduce compaction slack 2021-07-04 18:38:28 +03:00
vyzo
a178c1fb93 fix test 2021-07-04 18:38:28 +03:00
vyzo
c4d95de987 coalesce back-to-back compactions
get rid of the CompactionCold construct, run a single compaction on catch up
2021-07-04 18:38:28 +03:00
vyzo
b7897595eb augment current epoch by +1
to account for off by one conditions
2021-07-04 18:38:28 +03:00
vyzo
933c786421 update write epoch in the background every second 2021-07-04 18:38:28 +03:00
vyzo
66f1630f14 fix lint issue 2021-07-04 18:38:28 +03:00
vyzo
bb17608ae0 track writeEpoch relative to current wall clock time
The issue: head change notifications are not emitted until after catching up,
which results in all writes during a catch up period being tracked at the base epoch.
2021-07-04 18:38:28 +03:00
vyzo
421f05eab9 save the warm up epoch only if successful in warming up 2021-07-04 18:38:28 +03:00
vyzo
9b6448518c refactor warmup to trigger at startup and not wait for sync 2021-07-04 18:38:28 +03:00
vyzo
3fe4261f12 don't attempt compaction while still syncing 2021-07-04 18:38:28 +03:00
vyzo
7b02673620 don't try to visit genesis parent blocks 2021-07-04 18:38:28 +03:00
vyzo
997f2c098b keep headers hot when running with a noop splitstore 2021-07-04 18:38:28 +03:00
vyzo
7c814cd2e3 refactor genesis state loading code into its own method 2021-07-04 18:38:28 +03:00
vyzo
41573f1fb2 also walk parent message receipts when including messages in the walk 2021-07-04 18:38:28 +03:00
vyzo
fa6481401d reduce SyncGapTime to 1 minute
for maximal safety.
2021-07-04 18:38:28 +03:00
vyzo
fda291b876 fix test 2021-07-04 18:38:28 +03:00
vyzo
d33a44e67f first visit the cid, then short-circuit non dagcbor objects 2021-07-04 18:38:28 +03:00
vyzo
bdb97d6186 more robust handling of sync gap walks 2021-07-04 18:38:28 +03:00
vyzo
7cf75e667d keep genesis-linked state hot 2021-07-04 18:38:28 +03:00
vyzo
e9f531b4aa don't open bolt tracking store with NoSync, it might get corrupted 2021-07-04 18:38:28 +03:00
Raúl Kripalani
b2b7eb2ded metrics: increment misses in View(). 2021-07-04 18:38:28 +03:00
vyzo
3a9b7c592d mark from current epoch to boundary epoch when necessary
this is necessary to avoid wearing clown shoes when the node stays
offline for an extended period of time (more than 1 finality).

Basically it gets quite slow if we do the full 2 finality walk, so we
try to avoid it unless necessary.
The conditions under which a full walk is necessary is if there is a
sync gap (most likely because the node was offline) during which the
tracking of writes is inaccurate because we have not yet delivered the
HeadChange notification.  In this case, it is possible to have
actually hot blocks to be tracked before the boundary and fail to mark
them accordingly.  So when we detect a sync gap, we do the full walk;
if there is no sync gap, we can just use the much faster boundary
epoch walk.
2021-07-04 18:38:28 +03:00
vyzo
d7ceef104e decrease CompactionThreshold to 3 finalities 2021-07-04 18:38:28 +03:00
vyzo
e3cbeec6ee implement chain walking 2021-07-04 18:38:28 +03:00
vyzo
04f2e102a1 kill full splitstore compaction, simplify splitstore configuration 2021-07-04 18:38:28 +03:00
vyzo
4d3c73f4ca noop blockstore 2021-07-04 18:38:28 +03:00
Steven Allen
f353a794cb
fix: spelling
Co-authored-by: Aayush Rajasekaran <arajasek94@gmail.com>
2021-04-28 22:08:37 -07:00
Steven Allen
63db9e1633 fix(splitstore): fix a panic on revert-only head changes
Calling, e.g., `lotus chain sethead` on an ancestor tipset won't apply
any new blocks, it'll just revert a bunch. This will lead to HeadChange
calls with no new blocks to apply.

fixes #6125
2021-04-28 20:35:30 -07:00
Peter Rabbitson
a5fd552a0c CachedBlockstore is not referenced as of 3795cc2bd2 2021-04-06 13:24:49 +02:00
Łukasz Magiera
5f80869fe0
Merge pull request #5794 from filecoin-project/fix/atomic-first
fix: make sure atomic 64bit fields are 64bit aligned
2021-03-25 13:29:59 +01:00
Steven Allen
25725110e7
Merge pull request #5792 from filecoin-project/fix/timed-cache-locking
fix: avoid holding a lock while calling the View callback
2021-03-12 09:25:00 -08:00
Łukasz Magiera
c69b26cfc6
Merge pull request #5778 from filecoin-project/feat/splitstore-compact-hotstore
splitstore: compact hotstore prior to garbage collection
2021-03-12 16:30:05 +01:00
Steven Allen
bba71da401 fix: return buffers after canceling badger operation
In theory, Delete/Put could fail. If it does, we'll return the buffers
to the pool before we're really done with them.

In practice, this is almost certainly not an issue as badger shouldn't
_use_ the buffer unless we flush. But I feel slightly safer this way.
2021-03-11 20:30:43 -08:00
Steven Allen
1c490f3fda fix: make sure atomic 64bit fields are 64bit aligned
Otherwise, this won't work on 32bit ARM.
2021-03-11 20:10:39 -08:00
Steven Allen
a888ea0d1f fix: avoid holding a lock while calling the View callback
Interleaved puts/views could get really slow and there's no real reason
to use view under the covers here because the underlying blockstore is
always a "memstore".
2021-03-11 20:03:38 -08:00
vyzo
1b1d3606cd make linter happy 2021-03-11 13:10:44 +02:00
vyzo
353bb1881f compact hotstore if it provides the method 2021-03-11 11:45:19 +02:00
vyzo
01ce9b5c44 add Compact to badger blockstore 2021-03-11 11:45:05 +02:00
vyzo
ae6410d02f use compacting atomic to make the test deterministic 2021-03-09 09:05:36 +02:00
Steven Allen
6d2e8d721d test: attempt to make the splitstore test deterministic
At a minimum, make it thread-safe.
2021-03-08 16:36:25 -08:00
vyzo
90741da019 tune badger gc to repeatedly gc the value log until there is no rewrite 2021-03-08 21:46:44 +02:00
vyzo
3bd77701d8 deduplicate code 2021-03-08 19:46:21 +02:00
vyzo
3d1b855f20 rename GC to CollectGarbage, ignore badger.ErrNoRewrite 2021-03-08 19:42:38 +02:00
vyzo
52de95d344 also gc in compactFull, not just compactSimple 2021-03-08 18:30:09 +02:00
vyzo
8562a9bb82 garbage collect hotstore after compaction 2021-03-08 18:12:09 +02:00
vyzo
e85391b46c quiet stupid linter 2021-03-05 20:05:32 +02:00
vyzo
09f5ba177a add splitstore unit test 2021-03-05 19:55:32 +02:00
vyzo
0a2f2cf00d use the right condition for triggering the miss metric 2021-03-05 14:48:59 +02:00
vyzo
2b32c2e597 add some metrics 2021-03-05 14:48:57 +02:00
vyzo
99d21573da remove DEBUG log spam 2021-03-05 14:46:18 +02:00
vyzo
c58df3f079 don't panic on compaction errors 2021-03-05 14:46:18 +02:00
vyzo
9bd009d795 use atomics to demarcate critical section and limit close delay 2021-03-05 14:46:18 +02:00
vyzo
17be7d3919 save markSetSize 2021-03-05 14:46:18 +02:00
vyzo
aff0f1ed4c deduplicate code for batch deletion 2021-03-05 14:46:18 +02:00
vyzo
5fb6a907cb fix loop condition in batch deletion 2021-03-05 14:46:18 +02:00
vyzo
98a7b884fe implement DeleteMany in union blockstore 2021-03-05 14:46:18 +02:00
vyzo
fdd877534f walk at boundary epoch, 2 finalities from current epoch, to find live objects
objects written after that are retained anyway.
2021-03-05 14:46:18 +02:00
vyzo
508fcb9d26 properly close snoop at shutdown 2021-03-05 14:46:18 +02:00
vyzo
47d8c87486 fix log 2021-03-05 14:46:18 +02:00
vyzo
11b2f41804 overestimate markSetSize a bit 2021-03-05 14:46:18 +02:00
vyzo
6b680d112b do tracker purge in smaller batches 2021-03-05 14:46:18 +02:00
vyzo
d2d0980532 don't delete in one giant batch, use smaller chunks of batchSize 2021-03-05 14:46:18 +02:00
vyzo
70ebb2ad8d improve startup log 2021-03-05 14:46:18 +02:00
vyzo
006c55a7c9 add startup log 2021-03-05 14:46:18 +02:00
vyzo
06d8ea10b1 batch delete during the cold purge 2021-03-05 14:46:18 +02:00
vyzo
4c05ec28ba fix FromDatastore to not do double adapting 2021-03-05 14:46:18 +02:00
vyzo
ab52e34e6a add comment
Co-authored-by: raulk <raul@protocol.ai>
2021-03-05 14:46:18 +02:00
vyzo
86fdad2e31 fix typo
Co-authored-by: raulk <raul@protocol.ai>
2021-03-05 14:46:18 +02:00
vyzo
2ff5aec80e satisfy linter, use Prefix for common path of non inline CIDs 2021-03-05 14:46:18 +02:00
vyzo
8a55b73146 fix the situation with WrapIDStore 2021-03-05 14:46:18 +02:00
vyzo
86b73d651e add DeleteMany to Blockstore interface 2021-03-05 14:46:18 +02:00
vyzo
c762536dcb deduplicate code 2021-03-05 14:46:18 +02:00
vyzo
5184bc5c40 log consistency for full compaction 2021-03-05 14:46:18 +02:00
vyzo
68213a92cb use ioutil.TempDir for test directories 2021-03-05 14:46:18 +02:00
vyzo
35d466d847 use sha256 for bloom key rehashing 2021-03-05 14:46:18 +02:00
vyzo
f651f43c5e improve comment accuracy 2021-03-05 14:46:18 +02:00
Raúl Kripalani
4b1e1f4b52 rename liveset => markset; rename snoop => tracking store; docs. 2021-03-05 14:46:18 +02:00
vyzo
48f253328d increase batch size to 16K 2021-03-05 14:46:18 +02:00
vyzo
ce68b9b229 batch writes during warm up 2021-03-05 14:46:18 +02:00
Raúl Kripalani
8cfba5b092 renames and polish. 2021-03-05 14:46:18 +02:00
Raúl Kripalani
b1b452bc0f remove dependency from blockstore/splitstore => chain/store. 2021-03-05 14:46:18 +02:00
vyzo
b9400c590f use crypto/rand for bloom salt 2021-03-05 14:46:18 +02:00
vyzo
e612fff1fe also estimate liveset size during warm up 2021-03-05 14:46:18 +02:00
vyzo
748dd962d8 snake current tipset from head change notification 2021-03-05 14:46:18 +02:00
vyzo
cb36d5b6a4 warm up splitstore at first head change notification 2021-03-05 14:46:18 +02:00
Raúl Kripalani
1a804fbdec move splitstore into blockstore package. 2021-03-05 14:46:18 +02:00
Raúl Kripalani
68b8e8e9cb implement unionBlockstore#HashOnRead. 2021-03-02 21:22:24 +00:00
Raúl Kripalani
2047a74958 implement blockstore.Union, a union blockstore.
The union blockstore takes a list of blockstores. It returns the first
satisfying read, and broadcasts writes to all stores.

It can be used for operations that require reading from any two blockstores,
for example WalkSnapshot.
2021-03-02 17:03:11 +00:00
Raúl Kripalani
b34b4e0374 fix test compilation error. 2021-02-28 23:10:01 +00:00
Raúl Kripalani
3795cc2bd2 segregate chain and state blockstores.
This paves the way for better object lifetime management.

Concretely, it makes it possible to:
- have different stores backing chain and state data.
- having the same datastore library, but using different parameters.
- attach different caching layers/policies to each class of data, e.g.
  sizing caches differently.
- specifying different retention policies for chain and state data.

This separation is important because:
- access patterns/frequency of chain and state data are different.
- state is derivable from chain, so one could never expunge the chain
  store, and only retain state objects reachable from the last finality
  in the state store.
2021-02-28 22:49:44 +00:00
Raúl Kripalani
853de3daf7 fix TimedCacheBlockstore#View. 2021-02-28 22:39:00 +00:00
Raúl Kripalani
45a650c012 remove unnecessary View casting. 2021-02-28 22:20:29 +00:00
Raúl Kripalani
7f0f7d0b36 Merge branch 'master' into refactor/lib/blockstore 2021-02-28 19:55:23 +00:00
Raúl Kripalani
8601e5da3a address review comments. 2021-02-28 19:44:02 +00:00
Raúl Kripalani
35d1e3d1e0 refine docs. 2021-01-29 23:24:44 +00:00
Raúl Kripalani
d1104fec4c rename blockstores for consistency. 2021-01-29 23:17:25 +00:00
Raúl Kripalani
b0cbc932bd consolidate all blockstores in blockstore package. 2021-01-29 20:01:00 +00:00