
246 Commits

Author SHA1 Message Date
vyzo
759594d01c always return the waitgroup in protectView
so that we preclude the following scenario:
    Start compaction.
    Start view.
    Finish compaction.
    Start compaction.

which would not wait for the view to complete.
2021-07-13 03:11:40 +03:00
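The scenario in this commit message can be illustrated with a minimal Go sketch. It is not the splitstore code: the `store` type, its fields, and the `view`, `purge`, and `demo` helpers are hypothetical names chosen for illustration. The point it shows is that if `protectView` hands back the WaitGroup unconditionally, a compaction that starts after the view has begun still waits for that view before purging.

```go
package main

import (
	"fmt"
	"sync"
)

// store is a hypothetical stand-in for the splitstore's transaction state.
type store struct {
	mx       sync.RWMutex
	txnViews sync.WaitGroup
}

// protectView registers the view and always returns the WaitGroup,
// regardless of whether a compaction is currently running.
func (s *store) protectView() *sync.WaitGroup {
	s.mx.RLock()
	defer s.mx.RUnlock()
	s.txnViews.Add(1)
	return &s.txnViews
}

// view runs fn under view protection and releases it when fn returns.
func (s *store) view(fn func()) {
	wg := s.protectView()
	defer wg.Done()
	fn()
}

// purge stands in for the critical part of compaction. Holding the write
// lock keeps new views from registering while it waits for in-flight ones.
func (s *store) purge(fn func()) {
	s.mx.Lock()
	defer s.mx.Unlock()
	s.txnViews.Wait()
	fn()
}

// demo starts a view, then a purge, and records the completion order.
func demo() []string {
	var s store
	var events []string
	started := make(chan struct{})
	release := make(chan struct{})
	purged := make(chan struct{})

	go s.view(func() {
		close(started) // the view is registered at this point
		<-release
		events = append(events, "view done")
	})
	<-started

	go func() {
		s.purge(func() { events = append(events, "purged") })
		close(purged)
	}()

	close(release)
	<-purged
	return events
}

func main() {
	fmt.Println(demo())
}
```

In this sketch the purge can only be recorded after the view finishes, because `Wait` does not return until the view's deferred `Done` has run.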
vyzo
df9670c58d fix lint 2021-07-10 16:38:40 +03:00
vyzo
0c5e336ff1 address review comments 2021-07-10 16:30:27 +03:00
vyzo
870a47f55d handle id cids in internal versions of view/get 2021-07-09 20:07:17 +03:00
vyzo
18161fee38 remove unused lookback constructs 2021-07-09 19:12:58 +03:00
vyzo
095d7427ba make view protection optimistic again, as there is a race window 2021-07-09 15:41:10 +03:00
vyzo
da0feb3fa4 don't mark references inline; instead rely on the main compaction thread to do concurrent marking
The problem is that it is possible that an inline marking might take minutes for some objects
(infrequent, but still possible for state roots and prohibitive if that's a block validation).
So we simply track references continuously and rely on the main compaction thread to trigger
concurrent marking for all references at opportune moments.

Assumption: we can mark references faster than they are created during purge or else we'll
never purge anything.
2021-07-09 15:10:02 +03:00
vyzo
acc4c374ef properly handle protecting long-running views 2021-07-09 13:20:18 +03:00
vyzo
4f89d260b0 kill isOldBlockHeader; it's dangerous. 2021-07-09 11:35:10 +03:00
vyzo
de5e21bf1a correctly handle identity cids 2021-07-09 11:31:04 +03:00
vyzo
abdf4a161a explicitly switch marksets for concurrent marking
this has very noticeable impact in initial marking time; it also allows us
to get rid of the confusing ts monikers.
2021-07-09 04:26:36 +03:00
vyzo
b6611125b6 add environment variables to turn on the debug log without recompiling 2021-07-08 21:30:39 +03:00
vyzo
60dd97c7fc fix potential deadlock in View
As pointed out by magik, it is possible to deadlock if the view callback performs
a blockstore operation while a Lock is pending.
This fixes the issue by optimistically tracking the reference before actually calling
the underlying View and limiting the scope of the lock.
2021-07-08 21:18:59 +03:00
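The fix described in this commit message can be sketched roughly as follows. This is an illustration under assumptions, not the actual patch: the `store` type, `txnLk`, `refLk`, `trackRef`, and `tracked` are hypothetical names, and the block map is treated as read-only. The read lock is held only long enough to record the reference, so the callback is free to perform further blockstore operations even if a writer is waiting on `Lock`.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

var errNotFound = errors.New("block not found")

// store is a hypothetical blockstore wrapper; blocks is read-only here.
type store struct {
	txnLk  sync.RWMutex
	refLk  sync.Mutex
	refs   map[string]struct{}
	blocks map[string][]byte
}

func newStore(blocks map[string][]byte) *store {
	return &store{refs: make(map[string]struct{}), blocks: blocks}
}

// trackRef records a reference so that compaction will not purge it.
func (s *store) trackRef(key string) {
	s.refLk.Lock()
	s.refs[key] = struct{}{}
	s.refLk.Unlock()
}

// tracked reports whether a reference has been recorded for key.
func (s *store) tracked(key string) bool {
	s.refLk.Lock()
	defer s.refLk.Unlock()
	_, ok := s.refs[key]
	return ok
}

// Has takes the read lock, like any other blockstore operation.
func (s *store) Has(key string) bool {
	s.txnLk.RLock()
	defer s.txnLk.RUnlock()
	_, ok := s.blocks[key]
	return ok
}

// View tracks the reference optimistically and releases the lock before
// invoking the callback, so the callback never runs with the lock held.
func (s *store) View(key string, cb func([]byte) error) error {
	s.txnLk.RLock()
	s.trackRef(key)
	s.txnLk.RUnlock()

	data, ok := s.blocks[key]
	if !ok {
		return errNotFound
	}
	return cb(data)
}

func main() {
	s := newStore(map[string][]byte{"k": []byte("v")})
	err := s.View("k", func(b []byte) error {
		fmt.Println("callback can call Has:", s.Has("k"))
		return nil
	})
	fmt.Println("err:", err, "tracked:", s.tracked("k"))
}
```

Had `View` kept the read lock across the callback, a nested `RLock` in `Has` could block behind a pending `Lock`, which is the recursive read-locking hazard documented for Go's `sync.RWMutex`.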
vyzo
c0537848b3 fix typo
Co-authored-by: Łukasz Magiera <magik6k@users.noreply.github.com>
2021-07-08 17:54:16 +03:00
vyzo
fa30ac8c5d fix typo
Co-authored-by: Łukasz Magiera <magik6k@users.noreply.github.com>
2021-07-08 17:53:59 +03:00
vyzo
00d7772f57 move check for closure in walkChain
so that we don't do it too often and also cover warmup.
2021-07-08 13:12:19 +03:00
vyzo
f5c45bd517 check the closing state variable often
so that we have a reasonably quick graceful shutdown
2021-07-08 10:13:44 +03:00
vyzo
fee50b13a2 check the closing state on each batch during the purge. 2021-07-07 21:32:05 +03:00
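A per-batch closing check of the kind this commit describes might look like the sketch below. The `store` type, the `closing` flag, `errClosing`, and the `deleteBatch` callback are all hypothetical names used for illustration, and `batchSize` is assumed to be positive.

```go
package main

import (
	"errors"
	"fmt"
	"sync/atomic"
)

var errClosing = errors.New("splitstore is closing")

// store is a hypothetical stand-in holding only the closing flag.
type store struct {
	closing int32
}

// close marks the store as shutting down.
func (s *store) close() { atomic.StoreInt32(&s.closing, 1) }

// checkClosing returns errClosing once shutdown has been requested.
func (s *store) checkClosing() error {
	if atomic.LoadInt32(&s.closing) == 1 {
		return errClosing
	}
	return nil
}

// purge deletes keys in batches and checks the closing flag before each
// batch, so shutdown waits for at most one batch to finish.
func (s *store) purge(keys []string, batchSize int, deleteBatch func([]string)) error {
	for len(keys) > 0 {
		if err := s.checkClosing(); err != nil {
			return err
		}
		n := batchSize
		if n > len(keys) {
			n = len(keys)
		}
		deleteBatch(keys[:n])
		keys = keys[n:]
	}
	return nil
}

func main() {
	s := &store{}
	err := s.purge([]string{"a", "b", "c", "d"}, 2, func(batch []string) {
		fmt.Println("deleting", batch)
		s.close() // simulate a shutdown request arriving mid-purge
	})
	fmt.Println("purge result:", err)
}
```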
vyzo
451ddf50ab RIP bbolt-backed markset 2021-07-07 16:39:37 +03:00
vyzo
9dbb2e0abd don't leak tracking errors through the API 2021-07-07 16:34:02 +03:00
vyzo
83c30dc4c0 protect assignment of warmup epoch with the mutex 2021-07-07 11:31:27 +03:00
vyzo
6cc2112749 remove the curTs state variable; we don't need it 2021-07-07 09:55:25 +03:00
vyzo
05dbbe9681 rename some Txn methods for better readability 2021-07-07 09:52:31 +03:00
vyzo
90da6227b3 transactional protect incoming tipsets 2021-07-07 02:11:37 +03:00
vyzo
0e2af11f6a prepare the transaction before launching the compaction goroutine 2021-07-07 01:39:58 +03:00
vyzo
f2f4af669d clean up: simplify debug log, get rid of ugly debug log 2021-07-06 17:13:38 +03:00
vyzo
c1c25868cc improve comments 2021-07-06 15:09:04 +03:00
vyzo
5c514504f7 remove unused GetGenesis method from ChainAccessor interface 2021-07-06 14:41:41 +03:00
vyzo
dc8139a1d2 add some comments for debug only code 2021-07-06 13:23:12 +03:00
vyzo
c4ae3e0c3d minor tweak 2021-07-06 09:17:35 +03:00
vyzo
169ab262f5 really optimize computing object weights
sort is still taking a long time, this should be as fast as it gets.
2021-07-06 09:02:44 +03:00
vyzo
55a9e0ccd1 short-circuit block headers on sort weight computation 2021-07-06 08:22:43 +03:00
vyzo
bf7aeb3167 optimize sort a tad
it's taking a long time to compute weights...
2021-07-06 08:10:57 +03:00
vyzo
0659235e21 cache cid strings in sort
so as to avoid making a gazillion of strings
2021-07-06 07:26:13 +03:00
vyzo
525a2c71dd use hashes as keys in weight map to avoid duplicate work
otherwise the root object will be raw, but internal references will be dag; duplicate work.
2021-07-06 01:27:56 +03:00
vyzo
c6ad8fdaed use walkObjectRaw for computing object weights
cids that come out of the hotstore with ForEach are raw.
2021-07-06 01:08:44 +03:00
vyzo
2cbd3faf5a make sure to nil everything in txnEndProtect 2021-07-05 23:56:31 +03:00
vyzo
51ab891d5c quiet linter
it's a false positive, function doesn't escape.
2021-07-05 23:53:45 +03:00
vyzo
bd436ab9de make endTxnProtect idempotent 2021-07-05 23:51:10 +03:00
vyzo
e859942fa4 code cleanup: refactor txn state code into their own functions 2021-07-05 23:31:37 +03:00
vyzo
3477d265c6 unify the two marksets
really, it's concurrent marking and there is no reason to have two different marksets
2021-07-05 20:10:47 +03:00
vyzo
73d07999bf don't needlessly wait 1 min in first retry for missing refs 2021-07-05 18:24:48 +03:00
vyzo
af8cf712be handle all missing refs together
so that we wait 6min at most, not 12.
2021-07-05 18:16:54 +03:00
vyzo
5a099b7d05 more commentary on the missing refs situation 2021-07-05 16:12:17 +03:00
vyzo
59639a0788 reinstate some better code for handling missing references. 2021-07-05 16:08:08 +03:00
vyzo
fa195bede2 get rid of ugly missing reference handling code
those missing objects don't seem to ever get there, are they from an abandoned fork?
2021-07-05 14:29:55 +03:00
vyzo
59936ef468 fix log 2021-07-05 13:30:31 +03:00
vyzo
0b7153be86 use internal version of has for occurs checks 2021-07-05 12:41:11 +03:00
vyzo
d8b8d75e0f re-add minute delay before trying for missing objects 2021-07-05 12:38:09 +03:00
vyzo
d7709deb2b reduce memory pressure from marksets when the size is decreased 2021-07-05 11:51:22 +03:00
vyzo
3ec834b2e3 improve logs and error messages 2021-07-05 11:41:09 +03:00
vyzo
918a7ec749 a bit more fil commitment short-circuiting 2021-07-05 11:38:53 +03:00
vyzo
2ea2abc07d short-circuit fil commitments
they don't make it to the blockstore anyway
2021-07-05 11:32:52 +03:00
vyzo
839f7bd2b5 only occur check for DAGs 2021-07-05 11:11:08 +03:00
vyzo
c81ae5fc20 add some comments about the missing business and another log 2021-07-05 10:42:14 +03:00
vyzo
4c41f52828 add warning for missing objects for marking for debug purposes 2021-07-05 10:35:04 +03:00
vyzo
3597192d58 remove the sleeps and busy loop more times when waiting for missing objects 2021-07-05 10:31:47 +03:00
vyzo
1726eb993c deal with incomplete objects that need to be marked and protected
seems that something is writing DAGs before their constituents, which causes problems.
2021-07-05 10:22:52 +03:00
vyzo
db53859e7a reduce CompactionThreshold to 5 finalities
so that we run compaction every finality, once we've first compacted
2021-07-04 22:12:51 +03:00
vyzo
b08e0b7102 fix lint 2021-07-04 21:24:15 +03:00
vyzo
94efae419e reduce length of critical section
Just the purge; the rest is not critical -- e.g. it's ok if we do some duplicate copies
to the coldstore, we'll have gc soon.
2021-07-04 21:21:53 +03:00
vyzo
f33d4e79aa simplify transactional protection logic
Now that we delete objects heaviest first, we don't have to do deep walk and rescan gymnastics.
2021-07-04 20:49:39 +03:00
vyzo
40c271cda1 sort cold objects before deleting
so that we can't shoot ourselves in the foot by deleting the constituents of a DAG while it is
still in the hotstore.
2021-07-04 20:17:07 +03:00
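The ordering idea in this commit can be sketched as below. It is a simplified illustration, not the splitstore implementation: the `links` type and the `weight` and `deletionOrder` functions are hypothetical, and the weight used here (one plus the sum of the children's weights) is just one definition that guarantees a parent is always heavier than any object it references. The sketch assumes the links are acyclic, as content-addressed DAGs are.

```go
package main

import (
	"fmt"
	"sort"
)

// links is a hypothetical in-memory DAG: object key -> keys it references.
type links map[string][]string

// weight returns 1 plus the summed weights of all referenced objects,
// memoized per key. A parent is therefore strictly heavier than each child.
func weight(dag links, key string, memo map[string]int) int {
	if w, ok := memo[key]; ok {
		return w
	}
	w := 1
	for _, child := range dag[key] {
		w += weight(dag, child, memo)
	}
	memo[key] = w
	return w
}

// deletionOrder returns a copy of cold sorted heaviest first, so a DAG root
// is always deleted before the constituents it references.
func deletionOrder(dag links, cold []string) []string {
	memo := make(map[string]int)
	out := append([]string(nil), cold...)
	sort.SliceStable(out, func(i, j int) bool {
		return weight(dag, out[i], memo) > weight(dag, out[j], memo)
	})
	return out
}

func main() {
	dag := links{
		"root": {"a", "b"},
		"a":    {"leaf"},
		"b":    nil,
		"leaf": nil,
	}
	fmt.Println(deletionOrder(dag, []string{"leaf", "b", "a", "root"}))
}
```

With this order, an interrupted purge can leave constituents without their root, but never a root whose constituents are already gone.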
vyzo
13d612f72f smarter trackTxnRefMany 2021-07-04 19:33:49 +03:00
vyzo
f124389b66 recursively protect all references 2021-07-04 19:21:00 +03:00
vyzo
4d286da593 fix error message 2021-07-04 18:58:39 +03:00
vyzo
680af8eb09 use deep object walking for more robust handling of transactional references 2021-07-04 18:38:28 +03:00
vyzo
1f02428225 fix lint 2021-07-04 18:38:28 +03:00
vyzo
2c7a89a1db short-circuit rescanning on block headers 2021-07-04 18:38:28 +03:00
vyzo
8e56fffb33 walkChain should visit the genesis state root 2021-07-04 18:38:28 +03:00
vyzo
190cb18ab0 housekeeping
- remove defunct tracking store implementations
- update splitstore node config
- use mark set type config option (defaulting to mapts); a memory constrained node
  may want to use an on-disk one
2021-07-04 18:38:28 +03:00
vyzo
19d1b1f532 deal with partially written objects 2021-07-04 18:38:28 +03:00
vyzo
0a1d7b3732 fix log 2021-07-04 18:38:28 +03:00
vyzo
eafffc1634 more efficient trackTxnRefMany 2021-07-04 18:38:28 +03:00
vyzo
36f93649ef fix panic from concurrent map writes in txnRefs 2021-07-04 18:38:28 +03:00
vyzo
6fa2cd232d simplify compaction model 2021-07-04 18:38:28 +03:00
vyzo
1f2b604c07 RIP tracking store 2021-07-04 18:38:28 +03:00
vyzo
68a83500bc fix bug that turned candidate filtering to dead code 2021-07-04 18:38:28 +03:00
vyzo
642f0e4740 deal with memory pressure, don't walk under the boundary 2021-07-04 18:38:28 +03:00
vyzo
c5cf8e226b remove unnecessary code 2021-07-04 18:38:28 +03:00
vyzo
d79e4da7aa more accurate stats about mark set updates 2021-07-04 18:38:28 +03:00
vyzo
6f58fdcb22 remove vm copy context detection hack
stack tracing is slow.
2021-07-04 18:38:28 +03:00
vyzo
2b03316cd9 fix log message 2021-07-04 18:38:28 +03:00
vyzo
184d3802b6 remove dead code 2021-07-04 18:38:28 +03:00
vyzo
228a435ba7 rework tracking logic; do it lazily and far more efficiently 2021-07-04 18:38:28 +03:00
vyzo
9d6cabd18a if it's not a dag, it's not a block 2021-07-04 18:38:28 +03:00
vyzo
8157f889ce short-circuit marking walks when encountering a block and more efficient walking 2021-07-04 18:38:28 +03:00
vyzo
736d6a3c19 only treat Has as an implicit write within vm.Copy context 2021-07-04 18:38:28 +03:00
vyzo
39723bbe60 use a single map for tracking pending writes, properly track implicits 2021-07-04 18:38:28 +03:00
vyzo
5834231e58 create the transactional protect filter before walking 2021-07-04 18:38:28 +03:00
vyzo
e4bb4be855 fix some residual purge races 2021-07-04 18:38:28 +03:00
vyzo
68bc5d2291 skip moving cold blocks when running with a noop coldstore
it is a noop but it still takes (a lot of) time because it has to read all the cold blocks.
2021-07-04 18:38:28 +03:00
vyzo
b87295db93 bubble up dependent txn ref errors
This causes Has to return false if it fails to traverse/protect all links, which would cause
the vm to recompute.
2021-07-04 18:38:28 +03:00
vyzo
637fbf6c5b fix faulty if/else logic for implicit txn protection 2021-07-04 18:38:28 +03:00
vyzo
9d6bcd7705 avoid clown shoes: only walk links for tracking in implicit writes/refs 2021-07-04 18:38:28 +03:00
vyzo
484dfaebce reused cidset across all walks when flushing pending writes 2021-07-04 18:38:28 +03:00
vyzo
1d41e1544a optimize transitive write tracking a bit 2021-07-04 18:38:28 +03:00
vyzo
da00fc66ee downgrade a couple of logs to warnings 2021-07-04 18:38:28 +03:00
vyzo
4071488ef2 first write, then track 2021-07-04 18:38:28 +03:00
vyzo
bd92c230da refactor txn reference tracking, do deep marking of DAGs 2021-07-04 18:38:28 +03:00