lotus/storage/pipeline/states_failed.go
Phi-rjan 6f7498b622
chore: Merge nv22 into master (#11699)
* [WIP] feat: Add nv22 skeleton

Addition of Network Version 22 skeleton

* update FFI

* feat: drand: refactor round verification

* feat: sealing: Support nv22 DDO features in the sealing pipeline (#11226)

* Initial work supporting DDO pieces in lotus-miner

* sealing: Update pipeline input to operate on UniversalPiece

* sealing: Update pipeline checks/sealing states to operate on UniversalPiece

* sealing: Make pipeline build with UniversalPiece

* move PieceDealInfo out of api

* make gen

* make sealing pipeline unit tests pass

* fix itest ensemble build

* don't panic in SectorsStatus with deals

* stop linter from complaining about checkPieces

* fix sector import tests

* mod tidy

* sealing: Add logic for (pre)committing DDO sectors

* sealing: state-types with method defs

* DDO non-snap pipeline works(?), DDO Itests

* DDO support in snapdeals pipeline

* make gen

* update actor bundles

* update the gst market fix

* fix: chain: use PreCommitSectorsBatch2 when setting up genesis

* some bug fixes

* integration working changes

* update actor bundles

* Make TestOnboardRawPieceSnap pass

* Appease the linter

* Make deadlines test pass with v12 actors

* Update go-state-types, abstract market DealState

* make gen

* mod tidy, lint fixes

* Fix some more tests

* Bump version in master

Bump version in master

* Make gen

Make gen

* fix sender

* fix: lotus-provider: Fix winning PoSt

* fix: sql Scan cannot write to an object

* Actually show miner-addrs in info-log

Actually show miner-addrs in lotus-provider info-log

* [WIP] feat: Add nv22 skeleton

Addition of Network Version 22 skeleton

* update FFI

* ddo is now nv22

* make gen

* temp actor bundle with ddo

* use working go-state-types

* gst with v13 market migration

* update bundle, builtin.MethodsMiner.ProveCommitSectors2 -> 3

* actually working v13 migration, v13 migration itest

* Address review

* sealing: Correct DDO snap pledge math

* itests: Mixed ddo itest

* pipeline: Fix sectorWeight

* sealing: convert market deals into PAMs in mixed sectors

* sealing: make market to ddo conversion work

* fix lint

* update gst

* Update actors and GST to lastest integ branch

* commit batcher: Update ProveCommitSectors3Params builder logic

* make gen

* use builtin-actors master

* ddo: address review

* itests: Add commd assertions to ddo tests

* make gen

* gst with fixed types

* config knobs for RequireActivationSuccess

* storage: Drop obsolete flaky tasts

---------

Co-authored-by: Jennifer Wang <jiayingw703@gmail.com>
Co-authored-by: Aayush <arajasek94@gmail.com>
Co-authored-by: Shrenuj Bansal <shrenuj.bansal@protocol.ai>
Co-authored-by: Phi <orjan.roren@gmail.com>
Co-authored-by: Andrew Jackson (Ajax) <snadrus@gmail.com>
Co-authored-by: TippyFlits <james.bluett@protocol.ai>

* feat: implement FIP-0063

* chore: deps: update to go-multiaddr v0.12.2 (#11602)

* feat: fvm: update the FVM/FFI to v4.1 (#11608) (#11612)

This:

1. Adds nv22 support.
2. Updates the message tracing format.

Co-authored-by: Steven Allen <steven@stebalien.com>

* AggregateProofType nil when doing batch updates

Use latest nv22 go-state-types version with matching update

* Update to v13.0.0-rc.2 bundle

* chore: Upgrade heights and codename

Update upgrade heights

Co-Authored-By: Steven Allen <steven@stebalien.com>

* Update epoch after nv22 DRAND switch

Update epoch after nv22 DRAND switch

* Update Mango codename to Phoneix

Make the codename for the Drand-change inline with Dragon style.

* Add UpgradePhoenixHeight to API params

* set UpgradePhoenixHeight to be one hour after Dragon

* Make gen

Make gen and UpgradePhoenixHeight in butterfly and local devnet to be in line with Calibration and Mainnet

* Update epoch heights (#11637)

Update epoch heights

* new: add forest bootstrap nodes (#11636)

Signed-off-by: samuelarogbonlo <sbayo971@gmail.com>

* Merge pull request #11491 from filecoin-project/fix/remove-decommissioned-pl-bootstrap-nodes

Remove PL operated bootstrap nodes from mainnet.pi

* feat: api: new verified registry methods to get all allocations and claims (#11631)

* new verireg methods

* update changelog and add itest

* update itest and cli

* update new method's support till v9

* remove gateway APIs

* fix cli internal var names

* chore:: backport #11609 to the feat/nv22 branch (#11644)

* feat: api: improve the correctness of Eth's trace_block (#11609)

* Improve the correctness of Eth's trace_block

- Improve encoding/decoding of parameters and return values:
  - Encode "native" parameters and return values with Solidity ABI.
  - Correctly decode parameters to "create" calls.
  - Use the correct (ish) output for "create" calls.
  - Handle all forms of "create".
- Make robust with respect to reverts:
  - Use the actor ID/address from the trace instead of looking it up in
    the state-tree (may not exist in the state-tree due to a revert).
  - Gracefully handle failed actor/contract creation.
- Improve performance:
  - We avoid looking anything up in the state-tree when translating the
    trace, which should significantly improve performance.
- Improve code readability:
  - Remove all "backtracking" logic.
  - Use an "environment" struct to store temporary state instead of
    attaching it to the trace.
- Fix random bugs:
  - Fix an allocation bug in the "address" logic (need to set the
    capacity before modifying the slice).
  - Improved error checking/handling.
- Use correct types for `trace_block` action/results (create, call, etc.).
  - And use the correct types for Result/Action structs instead of reusing the same "Call" action every time.
- Improve error messages.

* Make gen

Make gen

---------

Co-authored-by: Steven Allen <steven@stebalien.com>

* fix: add UpgradePhoenixHeight to StateGetNetworkParams (#11648)

* chore: deps: update to go-state-types v13.0.0-rc.1

* do NOT update the cache when running the real migration

* Merge pull request #11632 from hanabi1224/hm/drand-test

feat: drand quicknet: allow scheduling drand quicknet upgrade before nv22 on 2k devnet

* chore: deps: update to go-state-types v13.0.0-rc.2

chore: deps: update to go-state-types v13.0.0-rc.2

* feat: set migration config UpgradeEpoch for v13 actors upgrade

* Built-in actor events first draft

* itest for DDO non-market verified data w/ builtin actor events

* Tests for builtin actor events API

* Clean up DDO+Events tests, add lots of explainer comments

* Minor tweaks to events types

* Avoid duplicate messages when looking for receipts

* Rename internal events modules for clarity

* Adjust actor event API after review

* s/ActorEvents/Events/g in global config

* Manage event sending rate for SubscribeActorEvents

* Terminate SubscribeActorEvents chan when at max height

* Document future API changes

* More clarity in actor event API docs

* More post-review changes, lots of tests for SubscribeActorEvents

Use BlockDelay as the window for receiving events on the SubscribeActorEvents
channel. We expect the user to have received the initial batch of historical
events (if any) in one block's time. For real-time events we expect them to
not fall behind by roughly one block's time.

* Remove duplicate code from actor event type marshalling tests

Reduce verbosity and remove duplicate test logic from actor event types
JSON marshalling tests.

* Rename actor events test to follow go convention

Add missing `s` to `actor_events` test file to follow golang convention
used across the repo.

* Run actor events table tests in deterministic order

Refactor `map` usage for actor event table tests to ensure deterministic
test execution order, making debugging potential issues easier. If
non-determinism is a target, leverage Go's built-in parallel testing
capabilities.

* Reduce scope for filter removal failure when getting actor events

Use a fresh context to remove the temporary filter installed solely to
get the actor events. This should reduce chances of failure in a case
where the original context may be expired/cancelled.

Refactor removal into a `defer` statement for a more readable, concise
return statement.

* Use fixed RNG seed for actor event tests

Improve determinism in actor event tests by using a fixed RNG seed. This
makes up a more reproducible test suit.

* Use provided libraries to assert eventual conditions

Use the functionalities already provided by `testify` to assert eventual
conditions, and remove the use of `time.Sleep`.

Remove duplicate code in utility functions that are already defined.

Refactor assertion helper functions to use consistent terminology:
"require" implies fatal error, whereas "assert" implies error where the
test may proceed executing.

* Update changelog for actor events APIs

* Fix concerns and docs identified by review

* Update actor bundle to v13.0.0-rc3

Update actor bundle to v13.0.0-rc3

* Prep Lotus v1.26.0-rc1

- For sanity reverting the mainnet upgrade epoch to 99999999, and then only set it when cutting the final release

-Update Calibnet CIDs to v13.0.0-rc3

- Add GetActorEvents, SubscribeActorEvents, GetAllClaims and GetAllAllocations methods to the changelog

Co-Authored-By: Jiaying Wang <42981373+jennijuju@users.noreply.github.com>

* Update CHANGELOG.md

Co-authored-by: Masih H. Derkani <m@derkani.org>

* Make gen

Make gen

* fix: beacon: validate drand change at nv16 correctly

* bump to v1.26.0-rc2

* test: cleanup ddo verified itest, extract steps to functions

also add allocation-removed event case

* test: extract verified DDO test to separate file, add more checks

* test: add additional actor events checks

* Add verification for "deal-activated" actor event

* docs(drand): document the meaning of "IsChained" (#11692)

* Resolve conflicts

I encountered multiple issues when trying to run make gen. And these changes fixed a couple of them:
- go mod tidy
- Remove RaftState/RaftLeader
- Revert `if ts.Height() > claim.TermMax+claim.TermStart || !cctx.IsSet("expired")` to the what is in the release/v1.26.0: `if tsHeight > val.TermMax || !expired`

* fixup imports, make jen

* Update version

Update version in master to v1.27.0-dev

* Update node/impl/full/dummy.go

Co-authored-by: Łukasz Magiera <magik6k@users.noreply.github.com>

* Adjust ListClaimsCmd

Adjust ListClaimsCmd according to review

---------

Signed-off-by: samuelarogbonlo <sbayo971@gmail.com>
Co-authored-by: TippyFlits <james.bluett@protocol.ai>
Co-authored-by: Aayush <arajasek94@gmail.com>
Co-authored-by: Łukasz Magiera <magik6k@users.noreply.github.com>
Co-authored-by: Jennifer Wang <jiayingw703@gmail.com>
Co-authored-by: Shrenuj Bansal <shrenuj.bansal@protocol.ai>
Co-authored-by: Andrew Jackson (Ajax) <snadrus@gmail.com>
Co-authored-by: Steven Allen <steven@stebalien.com>
Co-authored-by: Rod Vagg <rod@vagg.org>
Co-authored-by: Samuel Arogbonlo <47984109+samuelarogbonlo@users.noreply.github.com>
Co-authored-by: LexLuthr <88259624+LexLuthr@users.noreply.github.com>
Co-authored-by: tom123222 <160735201+tom123222@users.noreply.github.com>
Co-authored-by: Aarsh Shah <aarshkshah1992@gmail.com>
Co-authored-by: Masih H. Derkani <m@derkani.org>
Co-authored-by: Jiaying Wang <42981373+jennijuju@users.noreply.github.com>
2024-03-12 10:33:58 +01:00

620 lines
20 KiB
Go

package sealing
import (
"context"
"time"
"github.com/hashicorp/go-multierror"
"golang.org/x/xerrors"
"github.com/filecoin-project/go-address"
"github.com/filecoin-project/go-commp-utils/zerocomm"
"github.com/filecoin-project/go-state-types/abi"
"github.com/filecoin-project/go-state-types/exitcode"
"github.com/filecoin-project/go-statemachine"
"github.com/filecoin-project/lotus/api"
"github.com/filecoin-project/lotus/chain/actors/builtin/market"
"github.com/filecoin-project/lotus/chain/actors/builtin/miner"
"github.com/filecoin-project/lotus/chain/types"
"github.com/filecoin-project/lotus/storage/sealer/storiface"
)
var MinRetryTime = 1 * time.Minute
func failedCooldown(ctx statemachine.Context, sector SectorInfo) error {
// TODO: Exponential backoff when we see consecutive failures
retryStart := time.Unix(int64(sector.Log[len(sector.Log)-1].Timestamp), 0).Add(MinRetryTime)
if len(sector.Log) > 0 && !time.Now().After(retryStart) {
log.Infof("%s(%d), waiting %s before retrying", sector.State, sector.SectorNumber, time.Until(retryStart))
select {
case <-time.After(time.Until(retryStart)):
case <-ctx.Context().Done():
return ctx.Context().Err()
}
}
return nil
}
func (m *Sealing) checkPreCommitted(ctx statemachine.Context, sector SectorInfo) (*miner.SectorPreCommitOnChainInfo, bool) {
ts, err := m.Api.ChainHead(ctx.Context())
if err != nil {
log.Errorf("handleSealPrecommit1Failed(%d): temp error: %+v", sector.SectorNumber, err)
return nil, false
}
info, err := m.Api.StateSectorPreCommitInfo(ctx.Context(), m.maddr, sector.SectorNumber, ts.Key())
if err != nil {
log.Errorf("handleSealPrecommit1Failed(%d): temp error: %+v", sector.SectorNumber, err)
return nil, false
}
return info, true
}
var MaxPreCommit1Retries = uint64(3)
func (m *Sealing) handleSealPrecommit1Failed(ctx statemachine.Context, sector SectorInfo) error {
if sector.PreCommit1Fails > MaxPreCommit1Retries {
return ctx.Send(SectorRemove{})
}
if err := failedCooldown(ctx, sector); err != nil {
return err
}
return ctx.Send(SectorRetrySealPreCommit1{})
}
func (m *Sealing) handleSealPrecommit2Failed(ctx statemachine.Context, sector SectorInfo) error {
if err := failedCooldown(ctx, sector); err != nil {
return err
}
if sector.PreCommit2Fails > 3 {
return ctx.Send(SectorRetrySealPreCommit1{})
}
return ctx.Send(SectorRetrySealPreCommit2{})
}
func (m *Sealing) handlePreCommitFailed(ctx statemachine.Context, sector SectorInfo) error {
ts, err := m.Api.ChainHead(ctx.Context())
if err != nil {
log.Errorf("handlePreCommitFailed: api error, not proceeding: %+v", err)
return nil
}
if sector.PreCommitMessage != nil {
mw, err := m.Api.StateSearchMsg(ctx.Context(), ts.Key(), *sector.PreCommitMessage, api.LookbackNoLimit, true)
if err != nil {
// API error
if err := failedCooldown(ctx, sector); err != nil {
return err
}
return ctx.Send(SectorRetryPreCommitWait{})
}
if mw == nil {
// API error in precommit
return ctx.Send(SectorRetryPreCommitWait{})
}
switch mw.Receipt.ExitCode {
case exitcode.Ok:
// API error in PreCommitWait
return ctx.Send(SectorRetryPreCommitWait{})
case exitcode.SysErrOutOfGas:
// API error in PreCommitWait AND gas estimator guessed a wrong number in PreCommit
return ctx.Send(SectorRetryPreCommit{})
default:
// something else went wrong
}
}
if err := checkPrecommit(ctx.Context(), m.Address(), sector, ts.Key(), ts.Height(), m.Api); err != nil {
switch err.(type) {
case *ErrApi:
log.Errorf("handlePreCommitFailed: api error, not proceeding: %+v", err)
return nil
case *ErrBadCommD: // TODO: Should this just back to packing? (not really needed since handlePreCommit1 will do that too)
return ctx.Send(SectorSealPreCommit1Failed{xerrors.Errorf("bad CommD error: %w", err)})
case *ErrExpiredTicket:
return ctx.Send(SectorSealPreCommit1Failed{xerrors.Errorf("ticket expired error: %w", err)})
case *ErrBadTicket:
return ctx.Send(SectorSealPreCommit1Failed{xerrors.Errorf("bad expired: %w", err)})
case *ErrInvalidDeals:
log.Warnf("invalid deals in sector %d: %v", sector.SectorNumber, err)
return ctx.Send(SectorInvalidDealIDs{Return: RetPreCommitFailed})
case *ErrExpiredDeals:
return ctx.Send(SectorDealsExpired{xerrors.Errorf("sector deals expired: %w", err)})
case *ErrNoPrecommit:
return ctx.Send(SectorRetryPreCommit{})
case *ErrPrecommitOnChain:
// noop
case *ErrSectorNumberAllocated:
log.Errorf("handlePreCommitFailed: sector number already allocated, not proceeding: %+v", err)
// TODO: check if the sector is committed (not sure how we'd end up here)
// TODO: check on-chain state, adjust local sector number counter to not give out allocated numbers
return nil
default:
return xerrors.Errorf("checkPrecommit sanity check error: %w", err)
}
}
if pci, is := m.checkPreCommitted(ctx, sector); is && pci != nil {
if sector.PreCommitMessage == nil {
log.Warnf("sector %d is precommitted on chain, but we don't have precommit message", sector.SectorNumber)
return ctx.Send(SectorPreCommitLanded{TipSet: ts.Key()})
}
if pci.Info.SealedCID != *sector.CommR {
log.Warnf("sector %d is precommitted on chain, with different CommR: %s != %s", sector.SectorNumber, pci.Info.SealedCID, sector.CommR)
return nil // TODO: remove when the actor allows re-precommit
}
// TODO: we could compare more things, but I don't think we really need to
// CommR tells us that CommD (and CommPs), and the ticket are all matching
if err := failedCooldown(ctx, sector); err != nil {
return err
}
return ctx.Send(SectorRetryWaitSeed{})
}
if sector.PreCommitMessage != nil {
log.Warn("retrying precommit even though the message failed to apply")
}
if err := failedCooldown(ctx, sector); err != nil {
return err
}
return ctx.Send(SectorRetryPreCommit{})
}
func (m *Sealing) handleComputeProofFailed(ctx statemachine.Context, sector SectorInfo) error {
// TODO: Check sector files
if err := failedCooldown(ctx, sector); err != nil {
return err
}
if sector.InvalidProofs > 1 {
return ctx.Send(SectorSealPreCommit1Failed{xerrors.Errorf("consecutive compute fails")})
}
return ctx.Send(SectorRetryComputeProof{})
}
func (m *Sealing) handleRemoteCommitFailed(ctx statemachine.Context, sector SectorInfo) error {
if err := failedCooldown(ctx, sector); err != nil {
return err
}
if sector.InvalidProofs > 1 {
log.Errorw("consecutive remote commit fails", "sector", sector.SectorNumber, "c1url", sector.RemoteCommit1Endpoint, "c2url", sector.RemoteCommit2Endpoint)
}
return ctx.Send(SectorRetryComputeProof{})
}
func (m *Sealing) handleSubmitReplicaUpdateFailed(ctx statemachine.Context, sector SectorInfo) error {
if err := failedCooldown(ctx, sector); err != nil {
return err
}
if sector.ReplicaUpdateMessage != nil {
mw, err := m.Api.StateSearchMsg(ctx.Context(), types.EmptyTSK, *sector.ReplicaUpdateMessage, api.LookbackNoLimit, true)
if err != nil {
// API error
return ctx.Send(SectorRetrySubmitReplicaUpdateWait{})
}
if mw == nil {
return ctx.Send(SectorRetrySubmitReplicaUpdateWait{})
}
switch mw.Receipt.ExitCode {
case exitcode.Ok:
return ctx.Send(SectorRetrySubmitReplicaUpdateWait{})
case exitcode.SysErrOutOfGas:
return ctx.Send(SectorRetrySubmitReplicaUpdate{})
default:
// something else went wrong
}
}
ts, err := m.Api.ChainHead(ctx.Context())
if err != nil {
log.Errorf("handleSubmitReplicaUpdateFailed: api error, not proceeding: %+v", err)
return nil
}
if err := checkReplicaUpdate(ctx.Context(), m.maddr, sector, m.Api); err != nil {
switch err.(type) {
case *ErrApi:
log.Errorf("handleSubmitReplicaUpdateFailed: api error, not proceeding: %+v", err)
return nil
case *ErrBadRU:
log.Errorf("bad replica update: %+v", err)
return ctx.Send(SectorRetryReplicaUpdate{})
case *ErrBadPR:
log.Errorf("bad PR1: +%v", err)
return ctx.Send(SectorRetryProveReplicaUpdate{})
case *ErrInvalidDeals:
return ctx.Send(SectorInvalidDealIDs{})
case *ErrExpiredDeals:
return ctx.Send(SectorDealsExpired{xerrors.Errorf("expired dealIDs in sector: %w", err)})
default:
log.Errorf("sanity check error, not proceeding: +%v", err)
return xerrors.Errorf("checkReplica sanity check error: %w", err)
}
}
// Abort upgrade for sectors that went faulty since being marked for upgrade
active, err := m.sectorActive(ctx.Context(), ts.Key(), sector.SectorNumber)
if err != nil {
log.Errorf("sector active check: api error, not proceeding: %+v", err)
return nil
}
if !active {
err := xerrors.Errorf("sector marked for upgrade %d no longer active, aborting upgrade", sector.SectorNumber)
log.Errorf("%s", err)
return ctx.Send(SectorAbortUpgrade{err})
}
return ctx.Send(SectorRetrySubmitReplicaUpdate{})
}
func (m *Sealing) handleReleaseSectorKeyFailed(ctx statemachine.Context, sector SectorInfo) error {
// not much we can do, wait for a bit and try again
if err := failedCooldown(ctx, sector); err != nil {
return err
}
return ctx.Send(SectorUpdateActive{})
}
func (m *Sealing) handleCommitFailed(ctx statemachine.Context, sector SectorInfo) error {
ts, err := m.Api.ChainHead(ctx.Context())
if err != nil {
log.Errorf("handleCommitting: api error, not proceeding: %+v", err)
return nil
}
if sector.CommitMessage != nil {
mw, err := m.Api.StateSearchMsg(ctx.Context(), ts.Key(), *sector.CommitMessage, api.LookbackNoLimit, true)
if err != nil {
// API error
if err := failedCooldown(ctx, sector); err != nil {
return err
}
return ctx.Send(SectorRetryCommitWait{})
}
if mw == nil {
// API error in commit
return ctx.Send(SectorRetryCommitWait{})
}
switch mw.Receipt.ExitCode {
case exitcode.Ok:
si, err := m.Api.StateSectorGetInfo(ctx.Context(), m.maddr, sector.SectorNumber, mw.TipSet)
if err != nil {
// API error
if err := failedCooldown(ctx, sector); err != nil {
return err
}
return ctx.Send(SectorRetryCommitWait{})
}
if si != nil {
// API error in CommitWait?
return ctx.Send(SectorRetryCommitWait{})
}
// if si == nil, something else went wrong; Likely expired deals, we'll
// find out in checkCommit
case exitcode.SysErrOutOfGas:
// API error in CommitWait AND gas estimator guessed a wrong number in SubmitCommit
return ctx.Send(SectorRetrySubmitCommit{})
default:
// something else went wrong
}
}
if err := m.checkCommit(ctx.Context(), sector, sector.Proof, ts.Key()); err != nil {
switch err.(type) {
case *ErrApi:
log.Errorf("handleCommitFailed: api error, not proceeding: %+v", err)
return nil
case *ErrBadSeed:
log.Errorf("seed changed, will retry: %+v", err)
return ctx.Send(SectorRetryWaitSeed{})
case *ErrInvalidProof:
if err := failedCooldown(ctx, sector); err != nil {
return err
}
if sector.InvalidProofs > 0 {
return ctx.Send(SectorSealPreCommit1Failed{xerrors.Errorf("consecutive invalid proofs")})
}
return ctx.Send(SectorRetryInvalidProof{})
case *ErrPrecommitOnChain:
log.Errorf("no precommit on chain, will retry: %+v", err)
return ctx.Send(SectorRetryPreCommitWait{})
case *ErrNoPrecommit:
return ctx.Send(SectorRetryPreCommit{})
case *ErrInvalidDeals:
log.Warnf("invalid deals in sector %d: %v", sector.SectorNumber, err)
return ctx.Send(SectorInvalidDealIDs{Return: RetCommitFailed})
case *ErrExpiredDeals:
return ctx.Send(SectorDealsExpired{xerrors.Errorf("sector deals expired: %w", err)})
case *ErrCommitWaitFailed:
if err := failedCooldown(ctx, sector); err != nil {
return err
}
return ctx.Send(SectorRetryCommitWait{})
default:
return xerrors.Errorf("checkCommit sanity check error (%T): %w", err, err)
}
}
// TODO: Check sector files
if err := failedCooldown(ctx, sector); err != nil {
return err
}
return ctx.Send(SectorRetryComputeProof{})
}
func (m *Sealing) handleFinalizeFailed(ctx statemachine.Context, sector SectorInfo) error {
// TODO: Check sector files
if err := failedCooldown(ctx, sector); err != nil {
return err
}
return ctx.Send(SectorRetryFinalize{})
}
func (m *Sealing) handleRemoveFailed(ctx statemachine.Context, sector SectorInfo) error {
if err := failedCooldown(ctx, sector); err != nil {
return err
}
return ctx.Send(SectorRemove{})
}
func (m *Sealing) handleTerminateFailed(ctx statemachine.Context, sector SectorInfo) error {
// ignoring error as it's most likely an API error - `pci` will be nil, and we'll go back to
// the Terminating state after cooldown. If the API is still failing, well get back to here
// with the error in SectorInfo log.
pci, _ := m.Api.StateSectorPreCommitInfo(ctx.Context(), m.maddr, sector.SectorNumber, types.EmptyTSK)
if pci != nil {
return nil // pause the fsm, needs manual user action
}
if err := failedCooldown(ctx, sector); err != nil {
return err
}
return ctx.Send(SectorTerminate{})
}
func (m *Sealing) handleDealsExpired(ctx statemachine.Context, sector SectorInfo) error {
// First make vary sure the sector isn't committed
si, err := m.Api.StateSectorGetInfo(ctx.Context(), m.maddr, sector.SectorNumber, types.EmptyTSK)
if err != nil {
return xerrors.Errorf("getting sector info: %w", err)
}
if si != nil {
// TODO: this should never happen, but in case it does, try to go back to
// the proving state after running some checks
return xerrors.Errorf("sector is committed on-chain, but we're in DealsExpired")
}
// Not much to do here, we can't go back in time to commit this sector
return ctx.Send(SectorRemove{})
}
func (m *Sealing) handleDealsExpiredSnapDeals(ctx statemachine.Context, sector SectorInfo) error {
if !sector.CCUpdate {
// Should be impossible
return xerrors.Errorf("should never reach SnapDealsDealsExpired as a non-CCUpdate sector")
}
return ctx.Send(SectorAbortUpgrade{xerrors.Errorf("one of upgrade deals expired")})
}
func (m *Sealing) handleAbortUpgrade(ctx statemachine.Context, sector SectorInfo) error {
if !sector.CCUpdate {
return xerrors.Errorf("should never reach AbortUpgrade as a non-CCUpdate sector")
}
m.cleanupAssignedDeals(sector)
// Remove snap deals replica if any
// This removes update / update-cache from all storage
if err := m.sealer.ReleaseReplicaUpgrade(ctx.Context(), m.minerSector(sector.SectorType, sector.SectorNumber)); err != nil {
return xerrors.Errorf("removing CC update files from sector storage")
}
// This removes the unsealed file from all storage
// note: we're not keeping anything unsealed because we're reverting to CC
if err := m.sealer.ReleaseUnsealed(ctx.Context(), m.minerSector(sector.SectorType, sector.SectorNumber), []storiface.Range{}); err != nil {
log.Error(err)
}
// and makes sure sealed/cache files only exist in long-term-storage
if err := m.sealer.FinalizeSector(ctx.Context(), m.minerSector(sector.SectorType, sector.SectorNumber)); err != nil {
log.Error(err)
}
return ctx.Send(SectorRevertUpgradeToProving{})
}
// failWith is a mutator or global mutator
func (m *Sealing) handleRecoverDealIDsOrFailWith(ctx statemachine.Context, sector SectorInfo, failWith interface{}) error {
toFix, nonBuiltinMarketPieces, err := recoveryPiecesToFix(ctx.Context(), m.Api, sector, m.maddr)
if err != nil {
return err
}
ts, err := m.Api.ChainHead(ctx.Context())
if err != nil {
return err
}
failed := map[int]error{}
updates := map[int]abi.DealID{}
for _, i := range toFix {
// note: all toFix pieces are builtin-market pieces
p := sector.Pieces[i]
if p.Impl().PublishCid == nil {
// TODO: check if we are in an early enough state try to remove this piece
log.Errorf("can't fix sector deals: piece %d (of %d) of sector %d has nil DealInfo.PublishCid (refers to deal %d)", i, len(sector.Pieces), sector.SectorNumber, p.Impl().DealID)
// Not much to do here (and this can only happen for old spacerace sectors)
return ctx.Send(failWith)
}
var dp *market.DealProposal
if p.Impl().DealProposal != nil {
mdp := *p.Impl().DealProposal
dp = &mdp
}
res, err := m.DealInfo.GetCurrentDealInfo(ctx.Context(), ts.Key(), dp, *p.Impl().PublishCid)
if err != nil {
failed[i] = xerrors.Errorf("getting current deal info for piece %d: %w", i, err)
continue
}
if res.MarketDeal == nil {
failed[i] = xerrors.Errorf("nil market deal (%d,%d,%d,%s)", i, sector.SectorNumber, p.Impl().DealID, p.Impl().DealProposal.PieceCID)
continue
}
if res.MarketDeal.Proposal.PieceCID != p.PieceCID() {
failed[i] = xerrors.Errorf("recovered piece (%d) deal in sector %d (dealid %d) has different PieceCID %s != %s", i, sector.SectorNumber, p.Impl().DealID, p.Impl().DealProposal.PieceCID, res.MarketDeal.Proposal.PieceCID)
continue
}
updates[i] = res.DealID
}
if len(failed) > 0 {
var merr error
for _, e := range failed {
merr = multierror.Append(merr, e)
}
if len(failed)+nonBuiltinMarketPieces == len(sector.Pieces) {
log.Errorf("removing sector %d: all deals expired or unrecoverable: %+v", sector.SectorNumber, merr)
return ctx.Send(failWith)
}
// todo: try to remove bad pieces (hard; see the todo above)
// for now removing sectors is probably better than having them stuck in RecoverDealIDs
// and expire anyways
log.Errorf("removing sector %d: deals expired or unrecoverable: %+v", sector.SectorNumber, merr)
return ctx.Send(failWith)
}
// Not much to do here, we can't go back in time to commit this sector
return ctx.Send(SectorUpdateDealIDs{Updates: updates})
}
func (m *Sealing) HandleRecoverDealIDs(ctx statemachine.Context, sector SectorInfo) error {
return m.handleRecoverDealIDsOrFailWith(ctx, sector, SectorRemove{})
}
func (m *Sealing) handleSnapDealsRecoverDealIDs(ctx statemachine.Context, sector SectorInfo) error {
return m.handleRecoverDealIDsOrFailWith(ctx, sector, SectorAbortUpgrade{xerrors.New("failed recovering deal ids")})
}
// recoveryPiecesToFix returns the list of sector piece indexes to fix, and the number of non-builtin-market pieces
func recoveryPiecesToFix(ctx context.Context, api SealingAPI, sector SectorInfo, maddr address.Address) ([]int, int, error) {
ts, err := api.ChainHead(ctx)
if err != nil {
return nil, 0, xerrors.Errorf("getting chain head: %w", err)
}
var toFix []int
nonBuiltinMarketPieces := 0
for i, p := range sector.Pieces {
i, p := i, p
err := p.handleDealInfo(handleDealInfoParams{
FillerHandler: func(info UniversalPieceInfo) error {
// if no deal is associated with the piece, ensure that we added it as
// filler (i.e. ensure that it has a zero PieceCID)
exp := zerocomm.ZeroPieceCommitment(p.Piece().Size.Unpadded())
if !info.PieceCID().Equals(exp) {
return xerrors.Errorf("sector %d piece %d had non-zero PieceCID %+v", sector.SectorNumber, i, p.Piece().PieceCID)
}
nonBuiltinMarketPieces++
return nil
},
BuiltinMarketHandler: func(info UniversalPieceInfo) error {
deal, err := api.StateMarketStorageDeal(ctx, p.DealInfo().Impl().DealID, ts.Key())
if err != nil {
log.Warnf("getting deal %d for piece %d: %+v", p.DealInfo().Impl().DealID, i, err)
toFix = append(toFix, i)
return nil
}
if deal.Proposal.Provider != maddr {
log.Warnf("piece %d (of %d) of sector %d refers deal %d with wrong provider: %s != %s", i, len(sector.Pieces), sector.SectorNumber, p.Impl().DealID, deal.Proposal.Provider, maddr)
toFix = append(toFix, i)
return nil
}
if deal.Proposal.PieceCID != p.Piece().PieceCID {
log.Warnf("piece %d (of %d) of sector %d refers deal %d with wrong PieceCID: %s != %s", i, len(sector.Pieces), sector.SectorNumber, p.Impl().DealID, p.Piece().PieceCID, deal.Proposal.PieceCID)
toFix = append(toFix, i)
return nil
}
if p.Piece().Size != deal.Proposal.PieceSize {
log.Warnf("piece %d (of %d) of sector %d refers deal %d with different size: %d != %d", i, len(sector.Pieces), sector.SectorNumber, p.Impl().DealID, p.Piece().Size, deal.Proposal.PieceSize)
toFix = append(toFix, i)
return nil
}
if ts.Height() >= deal.Proposal.StartEpoch {
// TODO: check if we are in an early enough state (before precommit), try to remove the offending pieces
// (tricky as we have to 'defragment' the sector while doing that, and update piece references for retrieval)
return xerrors.Errorf("can't fix sector deals: piece %d (of %d) of sector %d refers expired deal %d - should start at %d, head %d", i, len(sector.Pieces), sector.SectorNumber, p.Impl().DealID, deal.Proposal.StartEpoch, ts.Height())
}
return nil
},
DDOHandler: func(info UniversalPieceInfo) error {
// DDO pieces have no repair strategy
nonBuiltinMarketPieces++
return nil
},
})
if err != nil {
return nil, 0, xerrors.Errorf("checking piece %d: %w", i, err)
}
}
return toFix, nonBuiltinMarketPieces, nil
}