cosmos-sdk/orm/encoding/ormfield/codec.go
Aaron Craelius 1944a0883e
feat(orm)!: ordered variable length encoding for uint32 and uint64 types (#11090)
## Description

`uint64` values are used in the ORM as auto-incrementing primary keys. Always using 8 bytes for these values is a bit of a waste of space. Unfortunately, varint encoding does not support ordered prefix iteration.

This PR introduces a compact, well-ordered variable length encoding for `uint32` and `uint64` types. `fixed32` and `fixed64` integers are still encoded as 4 and 8 byte fixed-length big-endian arrays. With this, users have a choice of encoding based on what type of data they are storing. An auto-incrementing primary key should prefer the variable length `uint64` whereas a fixed precision decimal might want to use `fixed64`.

The golden test updates show how this reduces key lengths.

This encoding works by using the first two bits to encode the buffer length (4 possible lengths). I'm not sure whether 2, 4, 6, and 9 bytes are the right four lengths for `uint64`; there are many alternative choices. I could also have chosen 3 bits and allowed 8 possible lengths, but why waste an extra bit? Input on the right design parameters would be appreciated.



2022-02-07 17:58:55 +00:00


package ormfield

import (
	"io"

	"github.com/cosmos/cosmos-sdk/orm/types/ormerrors"

	"google.golang.org/protobuf/types/known/durationpb"
	"google.golang.org/protobuf/types/known/timestamppb"

	"google.golang.org/protobuf/reflect/protoreflect"
)

// Codec defines an interface for decoding and encoding values in ORM index keys.
type Codec interface {
	// Decode decodes a value in a key.
	Decode(r Reader) (protoreflect.Value, error)

	// Encode encodes a value in a key.
	Encode(value protoreflect.Value, w io.Writer) error

	// Compare compares two values of this type and should primarily be used
	// for testing.
	Compare(v1, v2 protoreflect.Value) int

	// IsOrdered returns true if callers can always assume that this ordering
	// is suitable for sorted iteration.
	IsOrdered() bool

	// FixedBufferSize returns a positive value if encoders should assume a
	// fixed size buffer for encoding. Encoders will use at most this much size
	// to encode the value.
	FixedBufferSize() int

	// ComputeBufferSize estimates the buffer size needed to encode the field.
	// Encoders will use at most this much size to encode the value.
	ComputeBufferSize(value protoreflect.Value) (int, error)
}

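// Reader combines io.Reader and io.ByteReader, so that decoders can read
// either whole buffers or single bytes from a key.
type Reader interface {
	io.Reader
	io.ByteReader
}
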
var (
	timestampMsgType  = (&timestamppb.Timestamp{}).ProtoReflect().Type()
	timestampFullName = timestampMsgType.Descriptor().FullName()
	durationMsgType   = (&durationpb.Duration{}).ProtoReflect().Type()
	durationFullName  = durationMsgType.Descriptor().FullName()
)

// GetCodec returns the Codec for the provided field if one is defined.
// nonTerminal should be set to true if this value is being encoded as a
// non-terminal segment of a multi-part key.
func GetCodec(field protoreflect.FieldDescriptor, nonTerminal bool) (Codec, error) {
	if field == nil {
		return nil, ormerrors.UnsupportedKeyField.Wrap("nil field")
	}
	if field.IsList() {
		return nil, ormerrors.UnsupportedKeyField.Wrapf("repeated field %s", field.FullName())
	}

	if field.ContainingOneof() != nil {
		return nil, ormerrors.UnsupportedKeyField.Wrapf("oneof field %s", field.FullName())
	}

	switch field.Kind() {
	case protoreflect.BytesKind:
		if nonTerminal {
			return NonTerminalBytesCodec{}, nil
		} else {
			return BytesCodec{}, nil
		}
	case protoreflect.StringKind:
		if nonTerminal {
			return NonTerminalStringCodec{}, nil
		} else {
			return StringCodec{}, nil
		}
	case protoreflect.Uint32Kind:
		return CompactUint32Codec{}, nil
	case protoreflect.Fixed32Kind:
		return FixedUint32Codec{}, nil
	case protoreflect.Uint64Kind:
		return CompactUint64Codec{}, nil
	case protoreflect.Fixed64Kind:
		return FixedUint64Codec{}, nil
	case protoreflect.Int32Kind, protoreflect.Sint32Kind, protoreflect.Sfixed32Kind:
		return Int32Codec{}, nil
	case protoreflect.Int64Kind, protoreflect.Sint64Kind, protoreflect.Sfixed64Kind:
		return Int64Codec{}, nil
	case protoreflect.BoolKind:
		return BoolCodec{}, nil
	case protoreflect.EnumKind:
		return EnumCodec{}, nil
	case protoreflect.MessageKind:
		msgName := field.Message().FullName()
		switch msgName {
		case timestampFullName:
			return TimestampCodec{}, nil
		case durationFullName:
			return DurationCodec{}, nil
		default:
			return nil, ormerrors.UnsupportedKeyField.Wrapf("%s of type %s", field.FullName(), msgName)
		}
	default:
		return nil, ormerrors.UnsupportedKeyField.Wrapf("%s of kind %s", field.FullName(), field.Kind())
	}
}