lotus/markets/dagstore/wrapper_test.go

216 lines
5.4 KiB
Go
Raw Normal View History

integrate DAG store and CARv2 in deal-making (#6671) This commit removes badger from the deal-making processes, and moves to a new architecture with the dagstore as the cental component on the miner-side, and CARv2s on the client-side. Every deal that has been handed off to the sealing subsystem becomes a shard in the dagstore. Shards are mounted via the LotusMount, which teaches the dagstore how to load the related piece when serving retrievals. When the miner starts the Lotus for the first time with this patch, we will perform a one-time migration of all active deals into the dagstore. This is a lightweight process, and it consists simply of registering the shards in the dagstore. Shards are backed by the unsealed copy of the piece. This is currently a CARv1. However, the dagstore keeps CARv2 indices for all pieces, so when it's time to acquire a shard to serve a retrieval, the unsealed CARv1 is joined with its index (safeguarded by the dagstore), to form a read-only blockstore, thus taking the place of the monolithic badger. Data transfers have been adjusted to interface directly with CARv2 files. On inbound transfers (client retrievals, miner storage deals), we stream the received data into a CARv2 ReadWrite blockstore. On outbound transfers (client storage deals, miner retrievals), we serve the data off a CARv2 ReadOnly blockstore. Client-side imports are managed by the refactored *imports.Manager component (when not using IPFS integration). Just like it before, we use the go-filestore library to avoid duplicating the data from the original file in the resulting UnixFS DAG (concretely the leaves). However, the target of those imports are what we call "ref-CARv2s": CARv2 files placed under the `$LOTUS_PATH/imports` directory, containing the intermediate nodes in full, and the leaves as positional references to the original file on disk. Client-side retrievals are placed into CARv2 files in the location: `$LOTUS_PATH/retrievals`. A new set of `Dagstore*` JSON-RPC operations and `lotus-miner dagstore` subcommands have been introduced on the miner-side to inspect and manage the dagstore. Despite moving to a CARv2-backed system, the IPFS integration has been respected, and it continues to be possible to make storage deals with data held in an IPFS node, and to perform retrievals directly into an IPFS node. NOTE: because the "staging" and "client" Badger blockstores are no longer used, existing imports on the client will be rendered useless. On startup, Lotus will enumerate all imports and print WARN statements on the log for each import that needs to be reimported. These log lines contain these messages: - import lacks carv2 path; import will not work; please reimport - import has missing/broken carv2; please reimport At the end, we will print a "sanity check completed" message indicating the count of imports found, and how many were deemed broken. Co-authored-by: Aarsh Shah <aarshkshah1992@gmail.com> Co-authored-by: Dirk McCormick <dirkmdev@gmail.com> Co-authored-by: Raúl Kripalani <raul@protocol.ai> Co-authored-by: Dirk McCormick <dirkmdev@gmail.com>
2021-08-16 22:34:32 +00:00
package dagstore
import (
"bytes"
"context"
"io"
"os"
"testing"
"time"
2021-11-26 17:40:27 +00:00
"github.com/ipfs/go-cid"
"github.com/stretchr/testify/require"
integrate DAG store and CARv2 in deal-making (#6671) This commit removes badger from the deal-making processes, and moves to a new architecture with the dagstore as the cental component on the miner-side, and CARv2s on the client-side. Every deal that has been handed off to the sealing subsystem becomes a shard in the dagstore. Shards are mounted via the LotusMount, which teaches the dagstore how to load the related piece when serving retrievals. When the miner starts the Lotus for the first time with this patch, we will perform a one-time migration of all active deals into the dagstore. This is a lightweight process, and it consists simply of registering the shards in the dagstore. Shards are backed by the unsealed copy of the piece. This is currently a CARv1. However, the dagstore keeps CARv2 indices for all pieces, so when it's time to acquire a shard to serve a retrieval, the unsealed CARv1 is joined with its index (safeguarded by the dagstore), to form a read-only blockstore, thus taking the place of the monolithic badger. Data transfers have been adjusted to interface directly with CARv2 files. On inbound transfers (client retrievals, miner storage deals), we stream the received data into a CARv2 ReadWrite blockstore. On outbound transfers (client storage deals, miner retrievals), we serve the data off a CARv2 ReadOnly blockstore. Client-side imports are managed by the refactored *imports.Manager component (when not using IPFS integration). Just like it before, we use the go-filestore library to avoid duplicating the data from the original file in the resulting UnixFS DAG (concretely the leaves). However, the target of those imports are what we call "ref-CARv2s": CARv2 files placed under the `$LOTUS_PATH/imports` directory, containing the intermediate nodes in full, and the leaves as positional references to the original file on disk. Client-side retrievals are placed into CARv2 files in the location: `$LOTUS_PATH/retrievals`. A new set of `Dagstore*` JSON-RPC operations and `lotus-miner dagstore` subcommands have been introduced on the miner-side to inspect and manage the dagstore. Despite moving to a CARv2-backed system, the IPFS integration has been respected, and it continues to be possible to make storage deals with data held in an IPFS node, and to perform retrievals directly into an IPFS node. NOTE: because the "staging" and "client" Badger blockstores are no longer used, existing imports on the client will be rendered useless. On startup, Lotus will enumerate all imports and print WARN statements on the log for each import that needs to be reimported. These log lines contain these messages: - import lacks carv2 path; import will not work; please reimport - import has missing/broken carv2; please reimport At the end, we will print a "sanity check completed" message indicating the count of imports found, and how many were deemed broken. Co-authored-by: Aarsh Shah <aarshkshah1992@gmail.com> Co-authored-by: Dirk McCormick <dirkmdev@gmail.com> Co-authored-by: Raúl Kripalani <raul@protocol.ai> Co-authored-by: Dirk McCormick <dirkmdev@gmail.com>
2021-08-16 22:34:32 +00:00
"golang.org/x/xerrors"
2021-11-26 17:40:27 +00:00
"github.com/filecoin-project/go-state-types/abi"
integrate DAG store and CARv2 in deal-making (#6671) This commit removes badger from the deal-making processes, and moves to a new architecture with the dagstore as the cental component on the miner-side, and CARv2s on the client-side. Every deal that has been handed off to the sealing subsystem becomes a shard in the dagstore. Shards are mounted via the LotusMount, which teaches the dagstore how to load the related piece when serving retrievals. When the miner starts the Lotus for the first time with this patch, we will perform a one-time migration of all active deals into the dagstore. This is a lightweight process, and it consists simply of registering the shards in the dagstore. Shards are backed by the unsealed copy of the piece. This is currently a CARv1. However, the dagstore keeps CARv2 indices for all pieces, so when it's time to acquire a shard to serve a retrieval, the unsealed CARv1 is joined with its index (safeguarded by the dagstore), to form a read-only blockstore, thus taking the place of the monolithic badger. Data transfers have been adjusted to interface directly with CARv2 files. On inbound transfers (client retrievals, miner storage deals), we stream the received data into a CARv2 ReadWrite blockstore. On outbound transfers (client storage deals, miner retrievals), we serve the data off a CARv2 ReadOnly blockstore. Client-side imports are managed by the refactored *imports.Manager component (when not using IPFS integration). Just like it before, we use the go-filestore library to avoid duplicating the data from the original file in the resulting UnixFS DAG (concretely the leaves). However, the target of those imports are what we call "ref-CARv2s": CARv2 files placed under the `$LOTUS_PATH/imports` directory, containing the intermediate nodes in full, and the leaves as positional references to the original file on disk. Client-side retrievals are placed into CARv2 files in the location: `$LOTUS_PATH/retrievals`. A new set of `Dagstore*` JSON-RPC operations and `lotus-miner dagstore` subcommands have been introduced on the miner-side to inspect and manage the dagstore. Despite moving to a CARv2-backed system, the IPFS integration has been respected, and it continues to be possible to make storage deals with data held in an IPFS node, and to perform retrievals directly into an IPFS node. NOTE: because the "staging" and "client" Badger blockstores are no longer used, existing imports on the client will be rendered useless. On startup, Lotus will enumerate all imports and print WARN statements on the log for each import that needs to be reimported. These log lines contain these messages: - import lacks carv2 path; import will not work; please reimport - import has missing/broken carv2; please reimport At the end, we will print a "sanity check completed" message indicating the count of imports found, and how many were deemed broken. Co-authored-by: Aarsh Shah <aarshkshah1992@gmail.com> Co-authored-by: Dirk McCormick <dirkmdev@gmail.com> Co-authored-by: Raúl Kripalani <raul@protocol.ai> Co-authored-by: Dirk McCormick <dirkmdev@gmail.com>
2021-08-16 22:34:32 +00:00
"github.com/filecoin-project/dagstore"
"github.com/filecoin-project/dagstore/mount"
"github.com/filecoin-project/dagstore/shard"
2021-11-26 17:40:27 +00:00
"github.com/filecoin-project/lotus/node/config"
integrate DAG store and CARv2 in deal-making (#6671) This commit removes badger from the deal-making processes, and moves to a new architecture with the dagstore as the cental component on the miner-side, and CARv2s on the client-side. Every deal that has been handed off to the sealing subsystem becomes a shard in the dagstore. Shards are mounted via the LotusMount, which teaches the dagstore how to load the related piece when serving retrievals. When the miner starts the Lotus for the first time with this patch, we will perform a one-time migration of all active deals into the dagstore. This is a lightweight process, and it consists simply of registering the shards in the dagstore. Shards are backed by the unsealed copy of the piece. This is currently a CARv1. However, the dagstore keeps CARv2 indices for all pieces, so when it's time to acquire a shard to serve a retrieval, the unsealed CARv1 is joined with its index (safeguarded by the dagstore), to form a read-only blockstore, thus taking the place of the monolithic badger. Data transfers have been adjusted to interface directly with CARv2 files. On inbound transfers (client retrievals, miner storage deals), we stream the received data into a CARv2 ReadWrite blockstore. On outbound transfers (client storage deals, miner retrievals), we serve the data off a CARv2 ReadOnly blockstore. Client-side imports are managed by the refactored *imports.Manager component (when not using IPFS integration). Just like it before, we use the go-filestore library to avoid duplicating the data from the original file in the resulting UnixFS DAG (concretely the leaves). However, the target of those imports are what we call "ref-CARv2s": CARv2 files placed under the `$LOTUS_PATH/imports` directory, containing the intermediate nodes in full, and the leaves as positional references to the original file on disk. Client-side retrievals are placed into CARv2 files in the location: `$LOTUS_PATH/retrievals`. A new set of `Dagstore*` JSON-RPC operations and `lotus-miner dagstore` subcommands have been introduced on the miner-side to inspect and manage the dagstore. Despite moving to a CARv2-backed system, the IPFS integration has been respected, and it continues to be possible to make storage deals with data held in an IPFS node, and to perform retrievals directly into an IPFS node. NOTE: because the "staging" and "client" Badger blockstores are no longer used, existing imports on the client will be rendered useless. On startup, Lotus will enumerate all imports and print WARN statements on the log for each import that needs to be reimported. These log lines contain these messages: - import lacks carv2 path; import will not work; please reimport - import has missing/broken carv2; please reimport At the end, we will print a "sanity check completed" message indicating the count of imports found, and how many were deemed broken. Co-authored-by: Aarsh Shah <aarshkshah1992@gmail.com> Co-authored-by: Dirk McCormick <dirkmdev@gmail.com> Co-authored-by: Raúl Kripalani <raul@protocol.ai> Co-authored-by: Dirk McCormick <dirkmdev@gmail.com>
2021-08-16 22:34:32 +00:00
)
// TestWrapperAcquireRecovery verifies that if acquire shard returns a "not found"
// error, the wrapper will attempt to register the shard then reacquire
func TestWrapperAcquireRecovery(t *testing.T) {
ctx := context.Background()
pieceCid, err := cid.Parse("bafkqaaa")
require.NoError(t, err)
// Create a DAG store wrapper
dagst, w, err := NewDAGStore(config.DAGStoreConfig{
RootDir: t.TempDir(),
GCInterval: config.Duration(1 * time.Millisecond),
}, mockLotusMount{})
require.NoError(t, err)
defer dagst.Close() //nolint:errcheck
// Return an error from acquire shard the first time
acquireShardErr := make(chan error, 1)
acquireShardErr <- xerrors.Errorf("unknown shard: %w", dagstore.ErrShardUnknown)
// Create a mock DAG store in place of the real DAG store
mock := &mockDagStore{
acquireShardErr: acquireShardErr,
acquireShardRes: dagstore.ShardResult{
Accessor: getShardAccessor(t),
},
register: make(chan shard.Key, 1),
}
w.dagst = mock
mybs, err := w.LoadShard(ctx, pieceCid)
require.NoError(t, err)
// Expect the wrapper to try to recover from the error returned from
// acquire shard by calling register shard with the same key
tctx, cancel := context.WithTimeout(ctx, time.Second)
defer cancel()
select {
case <-tctx.Done():
require.Fail(t, "failed to call register")
case k := <-mock.register:
require.Equal(t, k.String(), pieceCid.String())
}
// Verify that we can get things from the acquired blockstore
var count int
ch, err := mybs.AllKeysChan(ctx)
require.NoError(t, err)
for range ch {
count++
}
require.Greater(t, count, 0)
}
// TestWrapperBackground verifies the behaviour of the background go routine
func TestWrapperBackground(t *testing.T) {
ctx := context.Background()
// Create a DAG store wrapper
dagst, w, err := NewDAGStore(config.DAGStoreConfig{
RootDir: t.TempDir(),
GCInterval: config.Duration(1 * time.Millisecond),
}, mockLotusMount{})
require.NoError(t, err)
defer dagst.Close() //nolint:errcheck
// Create a mock DAG store in place of the real DAG store
mock := &mockDagStore{
gc: make(chan struct{}, 1),
recover: make(chan shard.Key, 1),
close: make(chan struct{}, 1),
}
w.dagst = mock
// Start up the wrapper
err = w.Start(ctx)
require.NoError(t, err)
// Expect GC to be called automatically
tctx, cancel := context.WithTimeout(ctx, time.Second)
defer cancel()
select {
case <-tctx.Done():
require.Fail(t, "failed to call GC")
case <-mock.gc:
}
// Expect that when the wrapper is closed it will call close on the
// DAG store
err = w.Close()
require.NoError(t, err)
tctx, cancel3 := context.WithTimeout(ctx, time.Second)
defer cancel3()
select {
case <-tctx.Done():
require.Fail(t, "failed to call close")
case <-mock.close:
}
}
type mockDagStore struct {
acquireShardErr chan error
acquireShardRes dagstore.ShardResult
register chan shard.Key
gc chan struct{}
recover chan shard.Key
close chan struct{}
}
func (m *mockDagStore) DestroyShard(ctx context.Context, key shard.Key, out chan dagstore.ShardResult, _ dagstore.DestroyOpts) error {
panic("implement me")
}
func (m *mockDagStore) GetShardInfo(k shard.Key) (dagstore.ShardInfo, error) {
panic("implement me")
}
func (m *mockDagStore) AllShardsInfo() dagstore.AllShardsInfo {
panic("implement me")
}
func (m *mockDagStore) Start(_ context.Context) error {
return nil
}
func (m *mockDagStore) RegisterShard(ctx context.Context, key shard.Key, mnt mount.Mount, out chan dagstore.ShardResult, opts dagstore.RegisterOpts) error {
m.register <- key
out <- dagstore.ShardResult{Key: key}
return nil
}
func (m *mockDagStore) AcquireShard(ctx context.Context, key shard.Key, out chan dagstore.ShardResult, _ dagstore.AcquireOpts) error {
select {
case err := <-m.acquireShardErr:
return err
default:
}
out <- m.acquireShardRes
return nil
}
func (m *mockDagStore) RecoverShard(ctx context.Context, key shard.Key, out chan dagstore.ShardResult, _ dagstore.RecoverOpts) error {
m.recover <- key
return nil
}
func (m *mockDagStore) GC(ctx context.Context) (*dagstore.GCResult, error) {
select {
case m.gc <- struct{}{}:
default:
}
return nil, nil
}
func (m *mockDagStore) Close() error {
m.close <- struct{}{}
return nil
}
type mockLotusMount struct {
}
func (m mockLotusMount) Start(ctx context.Context) error {
return nil
}
func (m mockLotusMount) FetchUnsealedPiece(ctx context.Context, pieceCid cid.Cid, offset uint64) (io.ReadCloser, abi.UnpaddedPieceSize, error) {
integrate DAG store and CARv2 in deal-making (#6671) This commit removes badger from the deal-making processes, and moves to a new architecture with the dagstore as the cental component on the miner-side, and CARv2s on the client-side. Every deal that has been handed off to the sealing subsystem becomes a shard in the dagstore. Shards are mounted via the LotusMount, which teaches the dagstore how to load the related piece when serving retrievals. When the miner starts the Lotus for the first time with this patch, we will perform a one-time migration of all active deals into the dagstore. This is a lightweight process, and it consists simply of registering the shards in the dagstore. Shards are backed by the unsealed copy of the piece. This is currently a CARv1. However, the dagstore keeps CARv2 indices for all pieces, so when it's time to acquire a shard to serve a retrieval, the unsealed CARv1 is joined with its index (safeguarded by the dagstore), to form a read-only blockstore, thus taking the place of the monolithic badger. Data transfers have been adjusted to interface directly with CARv2 files. On inbound transfers (client retrievals, miner storage deals), we stream the received data into a CARv2 ReadWrite blockstore. On outbound transfers (client storage deals, miner retrievals), we serve the data off a CARv2 ReadOnly blockstore. Client-side imports are managed by the refactored *imports.Manager component (when not using IPFS integration). Just like it before, we use the go-filestore library to avoid duplicating the data from the original file in the resulting UnixFS DAG (concretely the leaves). However, the target of those imports are what we call "ref-CARv2s": CARv2 files placed under the `$LOTUS_PATH/imports` directory, containing the intermediate nodes in full, and the leaves as positional references to the original file on disk. Client-side retrievals are placed into CARv2 files in the location: `$LOTUS_PATH/retrievals`. A new set of `Dagstore*` JSON-RPC operations and `lotus-miner dagstore` subcommands have been introduced on the miner-side to inspect and manage the dagstore. Despite moving to a CARv2-backed system, the IPFS integration has been respected, and it continues to be possible to make storage deals with data held in an IPFS node, and to perform retrievals directly into an IPFS node. NOTE: because the "staging" and "client" Badger blockstores are no longer used, existing imports on the client will be rendered useless. On startup, Lotus will enumerate all imports and print WARN statements on the log for each import that needs to be reimported. These log lines contain these messages: - import lacks carv2 path; import will not work; please reimport - import has missing/broken carv2; please reimport At the end, we will print a "sanity check completed" message indicating the count of imports found, and how many were deemed broken. Co-authored-by: Aarsh Shah <aarshkshah1992@gmail.com> Co-authored-by: Dirk McCormick <dirkmdev@gmail.com> Co-authored-by: Raúl Kripalani <raul@protocol.ai> Co-authored-by: Dirk McCormick <dirkmdev@gmail.com>
2021-08-16 22:34:32 +00:00
panic("implement me")
}
func (m mockLotusMount) GetUnpaddedCARSize(ctx context.Context, pieceCid cid.Cid) (uint64, error) {
panic("implement me")
}
func (m mockLotusMount) IsUnsealed(ctx context.Context, pieceCid cid.Cid) (bool, error) {
panic("implement me")
}
func getShardAccessor(t *testing.T) *dagstore.ShardAccessor {
data, err := os.ReadFile("./fixtures/sample-rw-bs-v2.car")
require.NoError(t, err)
buff := bytes.NewReader(data)
reader := &mount.NopCloser{Reader: buff, ReaderAt: buff, Seeker: buff}
shardAccessor, err := dagstore.NewShardAccessor(reader, nil, nil)
require.NoError(t, err)
return shardAccessor
}