The markets process instantiates its own `PubSub` instance with all
validators, peer scoring, etc. set up. Use that instane to join the
indexing topic, otherwise the default topic instantiated by
index-provider internally (via go-legs) has no validators.
- add opt-in env var to control instantation, until we are comfortable with testing to enble by default.
- adjust default limits if the connection manager high mark is higher than the default inbound conn limit.
Integrate the latest `index-provider` and reflect the changes to engine
configuration. Note that this commit disables announcements of indices
on the network by default as requested for initial merge to master.
Introduce dedicated index provider configuration parameters with
documentation and defaults that match the defaults in index-provider.
Re-generate code as needed.
- Add comment to clarify the reason for loop in testkit
- Trim common prefix in state printed in CLI commands for better
readability
- Upgrade to a tagged release of `go-fil-markets` that includes indexing
work; see: https://github.com/filecoin-project/go-fil-markets/pull/673
- Fix typo in CLI usage.
- Add comments to note that it is safe to use fx `OnStart` context when
starting the provider engine.
- Fix string concatenation in error message formatting.
Remove the bespoke instantiation of libp2p host and datatransfer manager
for index provider and reuse the existing instances used by markets.
The rationale for reuse is the following:
1. Separation of host introduces a discovery problem, where without
gossipsub the index provider endpoint will not be discoverable.
Using the same host as markets would mean the chain can be used to
discover addresses, putting less empassis on criticality of
gossipsub considering its set-up cost and lack of message delivery
guarantees.
2. Only a single instance of graphsync/datatransfer can be instantiated
per libp2p host; therefore, if the host is shared, so should
datatransfer manager.
3. it is not clear if the assumptions under which separation was
decided still hold.
Upgrade to the latest `index-provider` which upgrades the go-legs
protocol to allow the inclusion of extra gossip data that may be used
for gossip validation purposes. In the case of lotus gossip message
validators the miner ID is used to verify the sender's peer ID on chain.
Relates to:
- https://github.com/filecoin-project/lotus/pull/8045
Content providers announce the availability of indexer data using gossip pubsub. The content providers are not connected directly to indexers, so the pubsub messages are relayed to indexers via chain nodes. This PR makes chain nodes relay gossip pubsub messages, on the /indexer/ingest/<netname> topic.
Rename the config section corresponding to indexing to `IndexProvider`
for better readability.
Update existing docs for better clarity and add docs for config
parameters embedded from `index-provider` `Ingest` config library.
* adding the new variables- now time for logic
* putting parameters into right placeS
* adding unsealing throttle
* fixing linter issues
* removing one last thing...
This commit removes badger from the deal-making processes, and
moves to a new architecture with the dagstore as the cental
component on the miner-side, and CARv2s on the client-side.
Every deal that has been handed off to the sealing subsystem becomes
a shard in the dagstore. Shards are mounted via the LotusMount, which
teaches the dagstore how to load the related piece when serving
retrievals.
When the miner starts the Lotus for the first time with this patch,
we will perform a one-time migration of all active deals into the
dagstore. This is a lightweight process, and it consists simply
of registering the shards in the dagstore.
Shards are backed by the unsealed copy of the piece. This is currently
a CARv1. However, the dagstore keeps CARv2 indices for all pieces, so
when it's time to acquire a shard to serve a retrieval, the unsealed
CARv1 is joined with its index (safeguarded by the dagstore), to form
a read-only blockstore, thus taking the place of the monolithic
badger.
Data transfers have been adjusted to interface directly with CARv2 files.
On inbound transfers (client retrievals, miner storage deals), we stream
the received data into a CARv2 ReadWrite blockstore. On outbound transfers
(client storage deals, miner retrievals), we serve the data off a CARv2
ReadOnly blockstore.
Client-side imports are managed by the refactored *imports.Manager
component (when not using IPFS integration). Just like it before, we use
the go-filestore library to avoid duplicating the data from the original
file in the resulting UnixFS DAG (concretely the leaves). However, the
target of those imports are what we call "ref-CARv2s": CARv2 files placed
under the `$LOTUS_PATH/imports` directory, containing the intermediate
nodes in full, and the leaves as positional references to the original file
on disk.
Client-side retrievals are placed into CARv2 files in the location:
`$LOTUS_PATH/retrievals`.
A new set of `Dagstore*` JSON-RPC operations and `lotus-miner dagstore`
subcommands have been introduced on the miner-side to inspect and manage
the dagstore.
Despite moving to a CARv2-backed system, the IPFS integration has been
respected, and it continues to be possible to make storage deals with data
held in an IPFS node, and to perform retrievals directly into an IPFS node.
NOTE: because the "staging" and "client" Badger blockstores are no longer
used, existing imports on the client will be rendered useless. On startup,
Lotus will enumerate all imports and print WARN statements on the log for
each import that needs to be reimported. These log lines contain these
messages:
- import lacks carv2 path; import will not work; please reimport
- import has missing/broken carv2; please reimport
At the end, we will print a "sanity check completed" message indicating
the count of imports found, and how many were deemed broken.
Co-authored-by: Aarsh Shah <aarshkshah1992@gmail.com>
Co-authored-by: Dirk McCormick <dirkmdev@gmail.com>
Co-authored-by: Raúl Kripalani <raul@protocol.ai>
Co-authored-by: Dirk McCormick <dirkmdev@gmail.com>
- remove defunct tracking store implementations
- update splitstore node config
- use mark set type config option (defaulting to mapts); a memory constrained node
may want to use an on-disk one