forked from cerc-io/ipld-eth-server
438 lines
16 KiB
Go
438 lines
16 KiB
Go
package format
|
|
|
|
import (
|
|
"context"
|
|
"errors"
|
|
)
|
|
|
|
// Walker provides methods to move through a DAG of nodes that implement
|
|
// the `NavigableNode` interface. It uses iterative algorithms (instead
|
|
// of recursive ones) that expose the `path` of nodes from the root to
|
|
// the `ActiveNode` it currently points to.
|
|
//
|
|
// It provides multiple ways to walk through the DAG (e.g. `Iterate`
|
|
// and `Seek`). When using them, you provide a Visitor function that
|
|
// will be called for each node the Walker traverses. The Visitor can
|
|
// read data from those nodes and, optionally, direct the movement of
|
|
// the Walker by calling `Pause` (to stop traversing and return) or
|
|
// `NextChild` (to skip a child and its descendants). See the DAG reader
|
|
// in `github.com/ipfs/go-unixfs/io/dagreader.go` for a usage example.
|
|
// TODO: This example isn't merged yet.
|
|
type Walker struct {
|
|
|
|
// Sequence of nodes in the DAG from the root to the `ActiveNode`, each
|
|
// position in the slice being the parent of the next one. The `ActiveNode`
|
|
// resides in the position indexed by `currentDepth` (the slice may contain
|
|
// more elements past that point but they should be ignored since the slice
|
|
// is not truncated to leverage the already allocated space).
|
|
//
|
|
// Every time `down` is called the `currentDepth` increases and the child
|
|
// of the `ActiveNode` is inserted after it (effectively becoming the new
|
|
// `ActiveNode`).
|
|
//
|
|
// The slice must *always* have a length bigger than zero with the root
|
|
// of the DAG at the first position (empty DAGs are not valid).
|
|
path []NavigableNode
|
|
|
|
// Depth of the `ActiveNode`. It grows downwards, root being 0, its child 1,
|
|
// and so on. It controls the effective length of `path` and `childIndex`.
|
|
//
|
|
// A currentDepth of -1 signals the start case of a new `Walker` that hasn't
|
|
// moved yet. Although this state is an invalid index to the slices, it
|
|
// allows to centralize all the visit calls in the `down` move (starting at
|
|
// zero would require a special visit case inside every walk operation like
|
|
// `Iterate()` and `Seek`). This value should never be returned to after
|
|
// the first `down` movement, moving up from the root should always return
|
|
// `errUpOnRoot`.
|
|
currentDepth int
|
|
|
|
// This slice has the index of the child each node in `path` is pointing
|
|
// to. The child index in the node can be set past all of its child nodes
|
|
// (having a value equal to `ChildTotal`) to signal it has visited (or
|
|
// skipped) all of them. A leaf node with no children that has its index
|
|
// in zero would also comply with this format.
|
|
//
|
|
// Complement to `path`, not only do we need to know which nodes have been
|
|
// traversed to reach the `ActiveNode` but also which child nodes they are
|
|
// to correctly have the active path of the DAG. (Reword this paragraph.)
|
|
childIndex []uint
|
|
|
|
// Flag to signal that a pause in the current walk operation has been
|
|
// requested by the user inside `Visitor`.
|
|
pauseRequested bool
|
|
|
|
// Used to pass information from the central `Walker` structure to the
|
|
// distributed `NavigableNode`s (to have a centralized configuration
|
|
// structure to control the behavior of all of them), e.g., to tell
|
|
// the `NavigableIPLDNode` which context should be used to load node
|
|
// promises (but this could later be used in more elaborate ways).
|
|
ctx context.Context
|
|
}
|
|
|
|
// `Walker` implementation details:
|
|
//
|
|
// The `Iterate` and `Seek` walk operations are implemented through two
|
|
// basic move methods `up` and `down`, that change which node is the
|
|
// `ActiveNode` (modifying the `path` that leads to it). The `NextChild`
|
|
// method allows to change which child the `ActiveNode` is pointing to
|
|
// in order to change the direction of the descent.
|
|
//
|
|
// The `down` method is the analogous of a recursive call and the one in
|
|
// charge of visiting (possible new) nodes (through `Visitor`) and performing
|
|
// some user-defined logic. A `Pause` method is available to interrupt the
|
|
// current walk operation after visiting a node.
|
|
//
|
|
// Key terms and concepts:
|
|
// * Walk operation (e.g., `Iterate`).
|
|
// * Move methods: `up` and `down`.
|
|
// * Active node.
|
|
// * Path to the active node.
|
|
|
|
|
|
// Function called each time a node is arrived upon in a walk operation
|
|
// through the `down` method (not when going back `up`). It is the main
|
|
// API to implement DAG functionality (e.g., read and seek a file DAG)
|
|
// on top of the `Walker` structure.
|
|
//
|
|
// Its argument is the current `node` being visited (the `ActiveNode`).
|
|
// Any error it returns (apart from the internal `errPauseWalkOperation`)
|
|
// will be forwarded to the caller of the walk operation (pausing it).
|
|
//
|
|
// Any of the exported methods of this API should be allowed to be called
|
|
// from within this method, e.g., `NextChild`.
|
|
// TODO: Check that. Can `ResetPosition` be called without breaking
|
|
// the `Walker` integrity?
|
|
type Visitor func(node NavigableNode) error
|
|
|
|
// NavigableNode is the interface the nodes of a DAG need to implement in
|
|
// order to be traversed by the `Walker`.
|
|
type NavigableNode interface {
|
|
|
|
// FetchChild returns the child of this node pointed to by `childIndex`.
|
|
// A `Context` stored in the `Walker` is passed (`ctx`) that may contain
|
|
// configuration attributes stored by the user before initiating the
|
|
// walk operation.
|
|
FetchChild(ctx context.Context, childIndex uint) (NavigableNode, error)
|
|
|
|
// ChildTotal returns the number of children of the `ActiveNode`.
|
|
ChildTotal() uint
|
|
|
|
// TODO: Evaluate providing the `Cleanup` and `Reset` methods.
|
|
|
|
// Cleanup is an optional method that is called by the `Walker` when
|
|
// this node leaves the active `path`, i.e., when this node is the
|
|
// `ActiveNode` and the `up` movement is called.
|
|
//Cleanup()
|
|
// Allow this method to return an error? That would imply
|
|
// modifying the `Walker` API, `up()` would now return an error
|
|
// different than `errUpOnRoot`.
|
|
|
|
// Reset is an optional function that is called by the `Walker` when
|
|
// `ResetPosition` is called, it is only applied to the root node
|
|
// of the DAG.
|
|
//Reset()
|
|
}
|
|
|
|
// NewWalker creates a new `Walker` structure from a `root`
|
|
// NavigableNode.
|
|
func NewWalker(ctx context.Context, root NavigableNode) *Walker {
|
|
return &Walker{
|
|
ctx: ctx,
|
|
|
|
path: []NavigableNode{root},
|
|
childIndex: []uint{0},
|
|
|
|
currentDepth: -1,
|
|
// Starting position, "on top" of the root node, see `currentDepth`.
|
|
}
|
|
}
|
|
|
|
// ActiveNode returns the `NavigableNode` that `Walker` is pointing
|
|
// to at the moment. It changes when `up` or `down` is called.
|
|
func (w *Walker) ActiveNode() NavigableNode {
|
|
return w.path[w.currentDepth]
|
|
// TODO: Add a check for the initial state of `currentDepth` -1?
|
|
}
|
|
|
|
// ErrDownNoChild signals there is no child at `ActiveChildIndex` in the
|
|
// `ActiveNode` to go down to.
|
|
var ErrDownNoChild = errors.New("can't go down, the child does not exist")
|
|
|
|
// errUpOnRoot signals the end of the DAG after returning to the root.
|
|
var errUpOnRoot = errors.New("can't go up, already on root")
|
|
|
|
// EndOfDag wraps the `errUpOnRoot` and signals to the user that the
|
|
// entire DAG has been iterated.
|
|
var EndOfDag = errors.New("end of DAG")
|
|
|
|
// ErrNextNoChild signals the end of this parent child nodes.
|
|
var ErrNextNoChild = errors.New("can't go to the next child, no more child nodes in this parent")
|
|
|
|
// errPauseWalkOperation signals the pause of the walk operation.
|
|
var errPauseWalkOperation = errors.New("pause in the current walk operation")
|
|
|
|
// ErrNilVisitor signals the lack of a `Visitor` function.
|
|
var ErrNilVisitor = errors.New("no Visitor function specified")
|
|
|
|
// Iterate the DAG through the DFS pre-order walk algorithm, going down
|
|
// as much as possible, then `NextChild` to the other siblings, and then up
|
|
// (to go down again). The position is saved throughout iterations (and
|
|
// can be previously set in `Seek`) allowing `Iterate` to be called
|
|
// repeatedly (after a `Pause`) to continue the iteration.
|
|
//
|
|
// This function returns the errors received from `down` (generated either
|
|
// inside the `Visitor` call or any other errors while fetching the child
|
|
// nodes), the rest of the move errors are handled within the function and
|
|
// are not returned.
|
|
func (w *Walker) Iterate(visitor Visitor) error {
|
|
|
|
// Iterate until either: the end of the DAG (`errUpOnRoot`), a `Pause`
|
|
// is requested (`errPauseWalkOperation`) or an error happens (while
|
|
// going down).
|
|
for {
|
|
|
|
// First, go down as much as possible.
|
|
for {
|
|
err := w.down(visitor)
|
|
|
|
if err == ErrDownNoChild {
|
|
break
|
|
// Can't keep going down from this node, try to move Next.
|
|
}
|
|
|
|
if err == errPauseWalkOperation {
|
|
return nil
|
|
// Pause requested, `errPauseWalkOperation` is just an internal
|
|
// error to signal to pause, don't pass it along.
|
|
}
|
|
|
|
if err != nil {
|
|
return err
|
|
// `down` is the only movement that can return *any* error.
|
|
}
|
|
}
|
|
|
|
// Can't move down anymore, turn to the next child in the `ActiveNode`
|
|
// to go down a different path. If there are no more child nodes
|
|
// available, go back up.
|
|
for {
|
|
err := w.NextChild()
|
|
if err == nil {
|
|
break
|
|
// No error, it turned to the next child. Try to go down again.
|
|
}
|
|
|
|
// It can't go Next (`ErrNextNoChild`), try to move up.
|
|
err = w.up()
|
|
if err != nil {
|
|
// Can't move up, on the root again (`errUpOnRoot`).
|
|
return EndOfDag
|
|
}
|
|
|
|
// Moved up, try `NextChild` again.
|
|
}
|
|
|
|
// Turned to the next child (after potentially many up moves),
|
|
// try going down again.
|
|
}
|
|
}
|
|
|
|
// Seek a specific node in a downwards manner. The `Visitor` should be
|
|
// used to steer the seek selecting at each node which child will the
|
|
// seek continue to (extending the `path` in that direction) or pause it
|
|
// (if the desired node has been found). The seek always starts from
|
|
// the root. It modifies the position so it shouldn't be used in-between
|
|
// `Iterate` calls (it can be used to set the position *before* iterating).
|
|
// If the visitor returns any non-`nil` errors the seek will stop.
|
|
//
|
|
// TODO: The seek could be extended to seek from the current position.
|
|
// (Is there something in the logic that would prevent it at the moment?)
|
|
func (w *Walker) Seek(visitor Visitor) error {
|
|
|
|
if visitor == nil {
|
|
return ErrNilVisitor
|
|
// Although valid, there is no point in calling `Seek` without
|
|
// any extra logic, it would just go down to the leftmost leaf,
|
|
// so this would probably be a user error.
|
|
}
|
|
|
|
// Go down until it the desired node is found (that will be signaled
|
|
// pausing the seek with `errPauseWalkOperation`) or a leaf node is
|
|
// reached (end of the DAG).
|
|
for {
|
|
err := w.down(visitor)
|
|
|
|
if err == errPauseWalkOperation {
|
|
return nil
|
|
// Found the node, `errPauseWalkOperation` is just an internal
|
|
// error to signal to pause, don't pass it along.
|
|
}
|
|
|
|
if err == ErrDownNoChild {
|
|
return nil
|
|
// Can't keep going down from this node, either at a leaf node
|
|
// or the `Visitor` has moved the child index past the
|
|
// available index (probably because none indicated that the
|
|
// target node could be down from there).
|
|
}
|
|
|
|
if err != nil {
|
|
return err
|
|
// `down()` is the only movement that can return *any* error.
|
|
}
|
|
}
|
|
// TODO: Copied from the first part of `Iterate()` (although conceptually
|
|
// different from it). Could this be encapsulated in a function to avoid
|
|
// repeating code? The way the pause signal is handled it wouldn't seem
|
|
// very useful: the `errPauseWalkOperation` needs to be processed at this
|
|
// depth to return from the function (and pause the seek, returning
|
|
// from another function here wouldn't cause it to stop).
|
|
}
|
|
|
|
// Go down one level in the DAG to the child of the `ActiveNode`
|
|
// pointed to by `ActiveChildIndex` and perform some logic on it by
|
|
// through the user-specified `visitor`.
|
|
//
|
|
// This should always be the first move in any walk operation
|
|
// (to visit the root node and move the `currentDepth` away
|
|
// from the negative value).
|
|
func (w *Walker) down(visitor Visitor) error {
|
|
child, err := w.fetchChild()
|
|
if err != nil {
|
|
return err
|
|
}
|
|
|
|
w.extendPath(child)
|
|
|
|
return w.visitActiveNode(visitor)
|
|
}
|
|
|
|
// Fetch the child from the `ActiveNode` through the `FetchChild`
|
|
// method of the `NavigableNode` interface.
|
|
func (w *Walker) fetchChild() (NavigableNode, error) {
|
|
if w.currentDepth == -1 {
|
|
// First time `down()` is called, `currentDepth` is -1,
|
|
// return the root node. Don't check available child nodes
|
|
// (as the `Walker` is not actually on any node just yet
|
|
// and `ActiveChildIndex` is of no use yet).
|
|
return w.path[0], nil
|
|
}
|
|
|
|
// Check if the child to fetch exists.
|
|
if w.ActiveChildIndex() >= w.ActiveNode().ChildTotal() {
|
|
return nil, ErrDownNoChild
|
|
}
|
|
|
|
return w.ActiveNode().FetchChild(w.ctx, w.ActiveChildIndex())
|
|
|
|
// TODO: Maybe call `extendPath` here and hide it away
|
|
// from `down`.
|
|
}
|
|
|
|
// Increase the `currentDepth` and extend the `path` to the fetched
|
|
// `child` node (which now becomes the new `ActiveNode`)
|
|
func (w *Walker) extendPath(child NavigableNode) {
|
|
w.currentDepth++
|
|
|
|
// Extend the slices if needed (doubling its capacity).
|
|
if w.currentDepth >= len(w.path) {
|
|
w.path = append(w.path, make([]NavigableNode, len(w.path))...)
|
|
w.childIndex = append(w.childIndex, make([]uint, len(w.childIndex))...)
|
|
// TODO: Check the performance of this grow mechanism.
|
|
}
|
|
|
|
// `child` now becomes the `ActiveNode()`.
|
|
w.path[w.currentDepth] = child
|
|
w.childIndex[w.currentDepth] = 0
|
|
}
|
|
|
|
// Call the `Visitor` on the `ActiveNode`. This function should only be
|
|
// called from `down`. This is a wrapper function to `Visitor` to process
|
|
// the `Pause` signal and do other minor checks (taking this logic away
|
|
// from `down`).
|
|
func (w *Walker) visitActiveNode(visitor Visitor) error {
|
|
if visitor == nil {
|
|
return nil
|
|
// No need to check `pauseRequested` as `Pause` should
|
|
// only be called from within the `Visitor`.
|
|
}
|
|
|
|
err := visitor(w.ActiveNode())
|
|
|
|
if w.pauseRequested {
|
|
// If a pause was requested make sure an error is returned
|
|
// that will cause the current walk operation to return. If
|
|
// `Visitor` didn't return an error set an artificial one
|
|
// generated by the `Walker`.
|
|
if err == nil {
|
|
err = errPauseWalkOperation
|
|
}
|
|
|
|
w.pauseRequested = false
|
|
}
|
|
|
|
return err
|
|
}
|
|
|
|
// Go up from the `ActiveNode`. The only possible error this method
|
|
// can return is to signal it's already at the root and can't go up.
|
|
func (w *Walker) up() error {
|
|
if w.currentDepth < 1 {
|
|
return errUpOnRoot
|
|
}
|
|
|
|
w.currentDepth--
|
|
|
|
// w.ActiveNode().Cleanup()
|
|
// If `Cleanup` is supported this would be the place to call it.
|
|
|
|
return nil
|
|
}
|
|
|
|
// NextChild increases the child index of the `ActiveNode` to point
|
|
// to the next child (which may exist or may be the end of the available
|
|
// child nodes).
|
|
//
|
|
// This method doesn't change the `ActiveNode`, it just changes where
|
|
// is it pointing to next, it could be interpreted as "turn to the next
|
|
// child".
|
|
func (w *Walker) NextChild() error {
|
|
w.incrementActiveChildIndex()
|
|
|
|
if w.ActiveChildIndex() == w.ActiveNode().ChildTotal() {
|
|
return ErrNextNoChild
|
|
// At the end of the available children, signal it.
|
|
}
|
|
|
|
return nil
|
|
}
|
|
|
|
// incrementActiveChildIndex increments the child index of the `ActiveNode` to
|
|
// point to the next child (if it exists) or to the position past all of
|
|
// the child nodes (`ChildTotal`) to signal that all of its children have
|
|
// been visited/skipped (if already at that last position, do nothing).
|
|
func (w *Walker) incrementActiveChildIndex() {
|
|
if w.ActiveChildIndex()+1 <= w.ActiveNode().ChildTotal() {
|
|
w.childIndex[w.currentDepth]++
|
|
}
|
|
}
|
|
|
|
// ActiveChildIndex returns the index of the child the `ActiveNode()`
|
|
// is pointing to.
|
|
func (w *Walker) ActiveChildIndex() uint {
|
|
return w.childIndex[w.currentDepth]
|
|
}
|
|
|
|
// SetContext changes the internal `Walker` (that is provided to the
|
|
// `NavigableNode`s when calling `FetchChild`) with the one passed
|
|
// as argument.
|
|
func (w *Walker) SetContext(ctx context.Context) {
|
|
w.ctx = ctx
|
|
}
|
|
|
|
// Pause the current walk operation. This function must be called from
|
|
// within the `Visitor` function.
|
|
func (w *Walker) Pause() {
|
|
w.pauseRequested = true
|
|
}
|