go-ethereum/rlp/doc.go

162 lines
7.1 KiB
Go
Raw Normal View History

2015-07-07 00:54:22 +00:00
// Copyright 2014 The go-ethereum Authors
// This file is part of the go-ethereum library.
2015-07-07 00:54:22 +00:00
//
// The go-ethereum library is free software: you can redistribute it and/or modify
2015-07-07 00:54:22 +00:00
// it under the terms of the GNU Lesser General Public License as published by
// the Free Software Foundation, either version 3 of the License, or
// (at your option) any later version.
//
// The go-ethereum library is distributed in the hope that it will be useful,
2015-07-07 00:54:22 +00:00
// but WITHOUT ANY WARRANTY; without even the implied warranty of
// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
2015-07-07 00:54:22 +00:00
// GNU Lesser General Public License for more details.
//
// You should have received a copy of the GNU Lesser General Public License
// along with the go-ethereum library. If not, see <http://www.gnu.org/licenses/>.
2015-07-07 00:54:22 +00:00
/*
Package rlp implements the RLP serialization format.
rlp: improve nil pointer handling (#20064) * rlp: improve nil pointer handling In both encoder and decoder, the rules for encoding nil pointers were a bit hard to understand, and didn't leave much choice. Since RLP allows two empty values (empty list, empty string), any protocol built on RLP must choose either of these values to represent the null value in a certain context. This change adds choice in the form of two new struct tags, "nilString" and "nilList". These can be used to specify how a nil pointer value is encoded. The "nil" tag still exists, but its implementation is now explicit and defines exactly how nil pointers are handled in a single place. Another important change in this commit is how nil pointers and the Encoder interface interact. The EncodeRLP method was previously called even on nil values, which was supposed to give users a choice of how their value would be handled when nil. It turns out this is a stupid idea. If you create a network protocol containing an object defined in another package, it's better to be able to say that the object should be a list or string when nil in the definition of the protocol message rather than defining the encoding of nil on the object itself. As of this commit, the encoding rules for pointers now take precedence over the Encoder interface rule. I think the "nil" tag will work fine for most cases. For special kinds of objects which are a struct in Go but strings in RLP, code using the object can specify the desired encoding of nil using the "nilString" and "nilList" tags. * rlp: propagate struct field type errors If a struct contained fields of undecodable type, the encoder and decoder would panic instead of returning an error. Fix this by propagating type errors in makeStruct{Writer,Decoder} and add a test.
2019-09-13 09:10:57 +00:00
The purpose of RLP (Recursive Linear Prefix) is to encode arbitrarily nested arrays of
binary data, and RLP is the main encoding method used to serialize objects in Ethereum.
The only purpose of RLP is to encode structure; encoding specific atomic data types (eg.
strings, ints, floats) is left up to higher-order protocols. In Ethereum integers must be
represented in big endian binary form with no leading zeroes (thus making the integer
value zero equivalent to the empty string).
RLP values are distinguished by a type tag. The type tag precedes the value in the input
stream and defines the size and kind of the bytes that follow.
Encoding Rules
Package rlp uses reflection and encodes RLP based on the Go type of the value.
If the type implements the Encoder interface, Encode calls EncodeRLP. It does not
call EncodeRLP on nil pointer values.
To encode a pointer, the value being pointed to is encoded. A nil pointer to a struct
type, slice or array always encodes as an empty RLP list unless the slice or array has
elememt type byte. A nil pointer to any other value encodes as the empty string.
Struct values are encoded as an RLP list of all their encoded public fields. Recursive
struct types are supported.
To encode slices and arrays, the elements are encoded as an RLP list of the value's
elements. Note that arrays and slices with element type uint8 or byte are always encoded
as an RLP string.
A Go string is encoded as an RLP string.
An unsigned integer value is encoded as an RLP string. Zero always encodes as an empty RLP
string. big.Int values are treated as integers. Signed integers (int, int8, int16, ...)
are not supported and will return an error when encoding.
Boolean values are encoded as the unsigned integers zero (false) and one (true).
An interface value encodes as the value contained in the interface.
Floating point numbers, maps, channels and functions are not supported.
Decoding Rules
Decoding uses the following type-dependent rules:
If the type implements the Decoder interface, DecodeRLP is called.
To decode into a pointer, the value will be decoded as the element type of the pointer. If
the pointer is nil, a new value of the pointer's element type is allocated. If the pointer
is non-nil, the existing value will be reused. Note that package rlp never leaves a
pointer-type struct field as nil unless one of the "nil" struct tags is present.
To decode into a struct, decoding expects the input to be an RLP list. The decoded
elements of the list are assigned to each public field in the order given by the struct's
definition. The input list must contain an element for each decoded field. Decoding
returns an error if there are too few or too many elements for the struct.
To decode into a slice, the input must be a list and the resulting slice will contain the
input elements in order. For byte slices, the input must be an RLP string. Array types
decode similarly, with the additional restriction that the number of input elements (or
bytes) must match the array's defined length.
To decode into a Go string, the input must be an RLP string. The input bytes are taken
as-is and will not necessarily be valid UTF-8.
To decode into an unsigned integer type, the input must also be an RLP string. The bytes
are interpreted as a big endian representation of the integer. If the RLP string is larger
than the bit size of the type, decoding will return an error. Decode also supports
*big.Int. There is no size limit for big integers.
To decode into a boolean, the input must contain an unsigned integer of value zero (false)
or one (true).
To decode into an interface value, one of these types is stored in the value:
[]interface{}, for RLP lists
[]byte, for RLP strings
Non-empty interface types are not supported when decoding.
Signed integers, floating point numbers, maps, channels and functions cannot be decoded into.
Struct Tags
As with other encoding packages, the "-" tag ignores fields.
rlp: improve nil pointer handling (#20064) * rlp: improve nil pointer handling In both encoder and decoder, the rules for encoding nil pointers were a bit hard to understand, and didn't leave much choice. Since RLP allows two empty values (empty list, empty string), any protocol built on RLP must choose either of these values to represent the null value in a certain context. This change adds choice in the form of two new struct tags, "nilString" and "nilList". These can be used to specify how a nil pointer value is encoded. The "nil" tag still exists, but its implementation is now explicit and defines exactly how nil pointers are handled in a single place. Another important change in this commit is how nil pointers and the Encoder interface interact. The EncodeRLP method was previously called even on nil values, which was supposed to give users a choice of how their value would be handled when nil. It turns out this is a stupid idea. If you create a network protocol containing an object defined in another package, it's better to be able to say that the object should be a list or string when nil in the definition of the protocol message rather than defining the encoding of nil on the object itself. As of this commit, the encoding rules for pointers now take precedence over the Encoder interface rule. I think the "nil" tag will work fine for most cases. For special kinds of objects which are a struct in Go but strings in RLP, code using the object can specify the desired encoding of nil using the "nilString" and "nilList" tags. * rlp: propagate struct field type errors If a struct contained fields of undecodable type, the encoder and decoder would panic instead of returning an error. Fix this by propagating type errors in makeStruct{Writer,Decoder} and add a test.
2019-09-13 09:10:57 +00:00
type StructWithIgnoredField struct{
Ignored uint `rlp:"-"`
Field uint
}
Go struct values encode/decode as RLP lists. There are two ways of influencing the mapping
of fields to list elements. The "tail" tag, which may only be used on the last exported
struct field, allows slurping up any excess list elements into a slice.
type StructWithTail struct{
Field uint
Tail []string `rlp:"tail"`
}
rlp: improve nil pointer handling (#20064) * rlp: improve nil pointer handling In both encoder and decoder, the rules for encoding nil pointers were a bit hard to understand, and didn't leave much choice. Since RLP allows two empty values (empty list, empty string), any protocol built on RLP must choose either of these values to represent the null value in a certain context. This change adds choice in the form of two new struct tags, "nilString" and "nilList". These can be used to specify how a nil pointer value is encoded. The "nil" tag still exists, but its implementation is now explicit and defines exactly how nil pointers are handled in a single place. Another important change in this commit is how nil pointers and the Encoder interface interact. The EncodeRLP method was previously called even on nil values, which was supposed to give users a choice of how their value would be handled when nil. It turns out this is a stupid idea. If you create a network protocol containing an object defined in another package, it's better to be able to say that the object should be a list or string when nil in the definition of the protocol message rather than defining the encoding of nil on the object itself. As of this commit, the encoding rules for pointers now take precedence over the Encoder interface rule. I think the "nil" tag will work fine for most cases. For special kinds of objects which are a struct in Go but strings in RLP, code using the object can specify the desired encoding of nil using the "nilString" and "nilList" tags. * rlp: propagate struct field type errors If a struct contained fields of undecodable type, the encoder and decoder would panic instead of returning an error. Fix this by propagating type errors in makeStruct{Writer,Decoder} and add a test.
2019-09-13 09:10:57 +00:00
The "optional" tag says that the field may be omitted if it is zero-valued. If this tag is
used on a struct field, all subsequent public fields must also be declared optional.
rlp: improve nil pointer handling (#20064) * rlp: improve nil pointer handling In both encoder and decoder, the rules for encoding nil pointers were a bit hard to understand, and didn't leave much choice. Since RLP allows two empty values (empty list, empty string), any protocol built on RLP must choose either of these values to represent the null value in a certain context. This change adds choice in the form of two new struct tags, "nilString" and "nilList". These can be used to specify how a nil pointer value is encoded. The "nil" tag still exists, but its implementation is now explicit and defines exactly how nil pointers are handled in a single place. Another important change in this commit is how nil pointers and the Encoder interface interact. The EncodeRLP method was previously called even on nil values, which was supposed to give users a choice of how their value would be handled when nil. It turns out this is a stupid idea. If you create a network protocol containing an object defined in another package, it's better to be able to say that the object should be a list or string when nil in the definition of the protocol message rather than defining the encoding of nil on the object itself. As of this commit, the encoding rules for pointers now take precedence over the Encoder interface rule. I think the "nil" tag will work fine for most cases. For special kinds of objects which are a struct in Go but strings in RLP, code using the object can specify the desired encoding of nil using the "nilString" and "nilList" tags. * rlp: propagate struct field type errors If a struct contained fields of undecodable type, the encoder and decoder would panic instead of returning an error. Fix this by propagating type errors in makeStruct{Writer,Decoder} and add a test.
2019-09-13 09:10:57 +00:00
When encoding a struct with optional fields, the output RLP list contains all values up to
the last non-zero optional field.
rlp: improve nil pointer handling (#20064) * rlp: improve nil pointer handling In both encoder and decoder, the rules for encoding nil pointers were a bit hard to understand, and didn't leave much choice. Since RLP allows two empty values (empty list, empty string), any protocol built on RLP must choose either of these values to represent the null value in a certain context. This change adds choice in the form of two new struct tags, "nilString" and "nilList". These can be used to specify how a nil pointer value is encoded. The "nil" tag still exists, but its implementation is now explicit and defines exactly how nil pointers are handled in a single place. Another important change in this commit is how nil pointers and the Encoder interface interact. The EncodeRLP method was previously called even on nil values, which was supposed to give users a choice of how their value would be handled when nil. It turns out this is a stupid idea. If you create a network protocol containing an object defined in another package, it's better to be able to say that the object should be a list or string when nil in the definition of the protocol message rather than defining the encoding of nil on the object itself. As of this commit, the encoding rules for pointers now take precedence over the Encoder interface rule. I think the "nil" tag will work fine for most cases. For special kinds of objects which are a struct in Go but strings in RLP, code using the object can specify the desired encoding of nil using the "nilString" and "nilList" tags. * rlp: propagate struct field type errors If a struct contained fields of undecodable type, the encoder and decoder would panic instead of returning an error. Fix this by propagating type errors in makeStruct{Writer,Decoder} and add a test.
2019-09-13 09:10:57 +00:00
When decoding into a struct, optional fields may be omitted from the end of the input
list. For the example below, this means input lists of one, two, or three elements are
accepted.
type StructWithOptionalFields struct{
Required uint
Optional1 uint `rlp:"optional"`
Optional2 uint `rlp:"optional"`
}
The "nil", "nilList" and "nilString" tags apply to pointer-typed fields only, and change
the decoding rules for the field type. For regular pointer fields without the "nil" tag,
input values must always match the required input length exactly and the decoder does not
produce nil values. When the "nil" tag is set, input values of size zero decode as a nil
pointer. This is especially useful for recursive types.
type StructWithNilField struct {
Field *[3]byte `rlp:"nil"`
rlp: improve nil pointer handling (#20064) * rlp: improve nil pointer handling In both encoder and decoder, the rules for encoding nil pointers were a bit hard to understand, and didn't leave much choice. Since RLP allows two empty values (empty list, empty string), any protocol built on RLP must choose either of these values to represent the null value in a certain context. This change adds choice in the form of two new struct tags, "nilString" and "nilList". These can be used to specify how a nil pointer value is encoded. The "nil" tag still exists, but its implementation is now explicit and defines exactly how nil pointers are handled in a single place. Another important change in this commit is how nil pointers and the Encoder interface interact. The EncodeRLP method was previously called even on nil values, which was supposed to give users a choice of how their value would be handled when nil. It turns out this is a stupid idea. If you create a network protocol containing an object defined in another package, it's better to be able to say that the object should be a list or string when nil in the definition of the protocol message rather than defining the encoding of nil on the object itself. As of this commit, the encoding rules for pointers now take precedence over the Encoder interface rule. I think the "nil" tag will work fine for most cases. For special kinds of objects which are a struct in Go but strings in RLP, code using the object can specify the desired encoding of nil using the "nilString" and "nilList" tags. * rlp: propagate struct field type errors If a struct contained fields of undecodable type, the encoder and decoder would panic instead of returning an error. Fix this by propagating type errors in makeStruct{Writer,Decoder} and add a test.
2019-09-13 09:10:57 +00:00
}
In the example above, Field allows two possible input sizes. For input 0xC180 (a list
containing an empty string) Field is set to nil after decoding. For input 0xC483000000 (a
list containing a 3-byte string), Field is set to a non-nil array pointer.
rlp: improve nil pointer handling (#20064) * rlp: improve nil pointer handling In both encoder and decoder, the rules for encoding nil pointers were a bit hard to understand, and didn't leave much choice. Since RLP allows two empty values (empty list, empty string), any protocol built on RLP must choose either of these values to represent the null value in a certain context. This change adds choice in the form of two new struct tags, "nilString" and "nilList". These can be used to specify how a nil pointer value is encoded. The "nil" tag still exists, but its implementation is now explicit and defines exactly how nil pointers are handled in a single place. Another important change in this commit is how nil pointers and the Encoder interface interact. The EncodeRLP method was previously called even on nil values, which was supposed to give users a choice of how their value would be handled when nil. It turns out this is a stupid idea. If you create a network protocol containing an object defined in another package, it's better to be able to say that the object should be a list or string when nil in the definition of the protocol message rather than defining the encoding of nil on the object itself. As of this commit, the encoding rules for pointers now take precedence over the Encoder interface rule. I think the "nil" tag will work fine for most cases. For special kinds of objects which are a struct in Go but strings in RLP, code using the object can specify the desired encoding of nil using the "nilString" and "nilList" tags. * rlp: propagate struct field type errors If a struct contained fields of undecodable type, the encoder and decoder would panic instead of returning an error. Fix this by propagating type errors in makeStruct{Writer,Decoder} and add a test.
2019-09-13 09:10:57 +00:00
RLP supports two kinds of empty values: empty lists and empty strings. When using the
"nil" tag, the kind of empty value allowed for a type is chosen automatically. A field
whose Go type is a pointer to an unsigned integer, string, boolean or byte array/slice
expects an empty RLP string. Any other pointer field type encodes/decodes as an empty RLP
list.
rlp: improve nil pointer handling (#20064) * rlp: improve nil pointer handling In both encoder and decoder, the rules for encoding nil pointers were a bit hard to understand, and didn't leave much choice. Since RLP allows two empty values (empty list, empty string), any protocol built on RLP must choose either of these values to represent the null value in a certain context. This change adds choice in the form of two new struct tags, "nilString" and "nilList". These can be used to specify how a nil pointer value is encoded. The "nil" tag still exists, but its implementation is now explicit and defines exactly how nil pointers are handled in a single place. Another important change in this commit is how nil pointers and the Encoder interface interact. The EncodeRLP method was previously called even on nil values, which was supposed to give users a choice of how their value would be handled when nil. It turns out this is a stupid idea. If you create a network protocol containing an object defined in another package, it's better to be able to say that the object should be a list or string when nil in the definition of the protocol message rather than defining the encoding of nil on the object itself. As of this commit, the encoding rules for pointers now take precedence over the Encoder interface rule. I think the "nil" tag will work fine for most cases. For special kinds of objects which are a struct in Go but strings in RLP, code using the object can specify the desired encoding of nil using the "nilString" and "nilList" tags. * rlp: propagate struct field type errors If a struct contained fields of undecodable type, the encoder and decoder would panic instead of returning an error. Fix this by propagating type errors in makeStruct{Writer,Decoder} and add a test.
2019-09-13 09:10:57 +00:00
The choice of null value can be made explicit with the "nilList" and "nilString" struct
tags. Using these tags encodes/decodes a Go nil pointer value as the empty RLP value kind
defined by the tag.
*/
package rlp