This PR introduces as sharded mutex within the ChainIndex#GetTipsetByHeight.
It also replaces a go map with xsync.Map which doesn't require locking.
The lock is taken when it appears that ChainIndex filling work should be
started. After claiming the lock, the status of the cache is rechecked,
if the entry is still missing, the fillCache is started.
Thanks to @snissn and @arajasek for debugging and taking initial stabs at this.
Supersedes #10866 and 10885
Signed-off-by: Jakub Sztandera <kubuxu@protocol.ai>
fix: types: error out on decoding BlockMsg with extraneous data
Fixes OSS-fuzz issue 48208: lotus:fuzz_block_msg
Signed-off-by: Yolan Romailler <anomalroil@users.noreply.github.com>
* release the read lock earlier as it is not needed for chaincomputebasefee
* chain/messagepool/selection.go change to read lock in SelectMessages
* tighten up locks in chain/messagepool/repub.go and two questions on whether curTsLks are needed as comments
* include suggestion from @Jorropo to preallocate our msgs array so that we only need to make a single allocation
* mp.pending should not be accessed directly but through the getter
* from @arajasek: just check whether the sender is a robust address (anything except an ID address is robust) here, and return if so. That will:
be faster
reduce the size of this cache by half, because we can drop mp.keyCache.Add(ka, ka) on line 491.
* do not need curTslk and clean up code comments
This reverts commit 8b2208fd9a, reversing
changes made to 2db6b12b78.
Unfortunately, this is rather tricky code. We've found several issues so
far and, while we've fixed a few, there are outstanding issues that
would require complex fixes we don't have time to tackle right now.
Luckily, this code isn't actually needed by the main Filecoin chain
which relies on consensus fault reporting to handle equivocation. So we
can just try again later.
* fix: sync: fail sync instead of logging if we sync the wrong chain
* fix: sync: write headers in the correct order
Just in case. This shouldn't be necessary, but we might as well.
* fix: minus minus
* fix: do put the tipset
Put != Persist
And fix the message to account for the fact that we now reject _old_
blocks along with new ones.
We frequently receive "out of date" blocks in hello messages from
syncing and/or out of sync nodes. This isn't an error.
This will reject blocks in pubsub validation if they're either:
1. Too far into the future (5 blocks beyond the expected head).
2. Too far into the past (before finality with respect to our current
head).
Specifically:
1. We were previously rejecting future blocks in the sync logic, but not
in pubsub itself.
2. We never used to check if a block was too _old_.
Motivation: Blocks that are too new/too old can cause us to perform
quite a bit of unnecessary work.
We have to save raw blocks to the snapshot, but we should not be scanning them
for additional links as if they were CBOR blocks.
This cleans the logic a bit (we were checking that the parent was a CBOR block
before queueing up the children, but then scanning the children... it was weird).
Additionally, more verbose logging is added for the next time ScanForLinks
fails (currently very little info was given).
Our ScanForLinks callback should only enqueue CBOR for further processing.
We have observed that EthGetTransactionCount is one of the hotspots
on Glif production notes, and we are seeing regular 10-20 second
latencies when calling this rpc method.
I tracked the high latency spikes and they were correlated when
we were running ExecuteTipSet while following the chain.
To address this, we should not rely on tipset computation to get
nounce and instead look at the parent tipset and then count the
messages sent from the 'addr'.
The function/parameter were poorly named and should never have been
exposed. "GC" confidence should always be the same, this parameter
doesn't let us actually set the _confidence_, just the point before
which we no longer support reverts.
fixes#10706
Technically, the block validator caught this panic. But it's pointless
because we have a _real_ mechanism to return the validation reason,
which we should have been using.
In general, panicing like this is a very bad idea because it's
non-obvious and, in this case, completely undocumented.
* have gas estimate call callInternal with applyTsMessages = false and other calls with applyTsMessages=true for gas caclulation optimization
* set applyTsMessages = true in CallWithGas call in shed
* update test with new callwithgas api optimization for eth call
* Update chain/stmgr/call.go
Co-authored-by: Łukasz Magiera <magik6k@users.noreply.github.com>
* env flag LOTUS_SKIP_APPLY_TS_MESSAGE_CALL_WITH_GAS must be 1 in order to have applyTsMessages change
* env flag LOTUS_SKIP_APPLY_TS_MESSAGE_CALL_WITH_GAS must be 1 in order to have applyTsMessages change
* make sure that even if we arent apply ts messages we grab ts messages from the particular user who is requesting gas estimation
---------
Co-authored-by: Jiaying Wang <42981373+jennijuju@users.noreply.github.com>
Co-authored-by: Łukasz Magiera <magik6k@users.noreply.github.com>
Co-authored-by: Ubuntu <ubuntu@ip-10-0-4-29.us-east-2.compute.internal>
We want to make the execution trace cache size configurable as SPs
may want to disable it while exchanges may want to crank it up.
We were also are going with intuition for this value, so having
ability to change it without a new build would help.
Fixes: https://github.com/filecoin-project/lotus/issues/10584
This PR adds a small cache to calls to ExecutionTrace which helps
improve performance for node operators like exchanges and block
explorers.
If items is in cache calls to this function will be 2-3x faster.
Fixes: https://github.com/filecoin-project/lotus/issues/10504