## Issue Addressed

NA

## Proposed Changes

Implements the "union" type from the SSZ spec for `ssz`, `ssz_derive`, `tree_hash` and `tree_hash_derive` so that it may be derived for `enum`s:

https://github.com/ethereum/consensus-specs/blob/v1.1.0-beta.3/ssz/simple-serialize.md#union

The union type is required for the merge, since the `Transaction` type is defined as a single-variant union `Union[OpaqueTransaction]`.

### Crate Updates

This PR will (hopefully) cause CI to publish new versions for the following crates:

- `eth2_ssz_derive`: `0.2.1` -> `0.3.0`
- `eth2_ssz`: `0.3.0` -> `0.4.0`
- `eth2_ssz_types`: `0.2.0` -> `0.2.1`
- `tree_hash`: `0.3.0` -> `0.4.0`
- `tree_hash_derive`: `0.3.0` -> `0.4.0`

Since these crates depend on each other, I've had to add a workspace-level `[patch]` for them. A follow-up PR will need to remove this patch once the new versions are published.

### Union Behaviors

We already had SSZ `Encode` and `TreeHash` derives for enums, but they just did a "transparent" pass-through of the inner value. Since the "union" decoding from the spec conflicts with the transparent method, I've required that every `enum` has exactly one of the following enum-level attributes:

#### SSZ

- `#[ssz(enum_behaviour = "union")]`
    - matches the spec used for the merge
- `#[ssz(enum_behaviour = "transparent")]`
    - maintains existing functionality
    - not supported for `Decode` (never was)

#### TreeHash

- `#[tree_hash(enum_behaviour = "union")]`
    - matches the spec used for the merge
- `#[tree_hash(enum_behaviour = "transparent")]`
    - maintains existing functionality

This means that we can maintain the existing transparent behaviour, but all existing users will get a compile-time error until they explicitly opt in to being transparent.

### Legacy Option Encoding

Before this PR, we already had a union-esque encoding for `Option<T>`. However, this followed the *old* SSZ spec, where the union selector was 4 bytes. During specification of the merge, the spec was changed to use 1 byte for the selector.

Whilst the 4-byte `Option` encoding was never used in the spec, we used it in our database. Writing a migration script for all occurrences of `Option` in the database would be painful, especially since it's used in the `CommitteeCache`. To avoid the migration script, I added a serde-esque `#[ssz(with = "module")]` field-level attribute to `ssz_derive` so that we can opt into the 4-byte encoding on a field-by-field basis.

The `ssz::legacy::four_byte_impl!` macro allows a one-liner to define the module required by `#[ssz(with = "module")]` for some `Option<T> where T: Encode + Decode`.

Notably, **I have removed the `Encode` and `Decode` impls for `Option`**. I've done this to force a break on downstream users. As I mentioned, `Option` isn't used in the spec, so I don't think it'll be *that* annoying. I think it's nicer than quietly having two different union implementations or quietly breaking the existing `Option` impl.

### Crate Publish Ordering

I've modified the order in which CI publishes crates so that we never publish a crate before the crates it depends upon have been published.

## TODO

- [ ] Queue a follow-up `[patch]`-removing PR.
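To make the selector-width difference described under "Legacy Option Encoding" concrete, here is a minimal, std-only sketch. It does not use the `ssz` crate: `encode_union` and `encode_legacy_option_u64` are hypothetical helpers written for illustration only, and the legacy layout shown (a little-endian `u32` selector, `0` for `None` and `1` for `Some`) is an assumption about the old encoding rather than a copy of the real implementation.

```rust
/// Hypothetical helper (not part of `ssz`): spec-style union encoding,
/// i.e. a 1-byte selector followed by the serialized value.
fn encode_union(selector: u8, value_bytes: &[u8]) -> Vec<u8> {
    let mut buf = Vec::with_capacity(1 + value_bytes.len());
    buf.push(selector);
    buf.extend_from_slice(value_bytes);
    buf
}

/// Hypothetical helper (not part of `ssz`): legacy-style `Option<u64>`
/// encoding, assuming a 4-byte little-endian selector (0 = None, 1 = Some).
fn encode_legacy_option_u64(value: Option<u64>) -> Vec<u8> {
    let mut buf = Vec::new();
    match value {
        None => buf.extend_from_slice(&0u32.to_le_bytes()),
        Some(v) => {
            buf.extend_from_slice(&1u32.to_le_bytes());
            // A `u64` is serialized as 8 little-endian bytes.
            buf.extend_from_slice(&v.to_le_bytes());
        }
    }
    buf
}
```

Under these assumptions, `encode_union(1, &42u64.to_le_bytes())` is 9 bytes long while `encode_legacy_option_u64(Some(42))` is 12 bytes long, so data written with one layout cannot be read back with the other. That is why the database fields need the per-field opt-in.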
use super::*;

mod impls;

/// Provides SSZ encoding (serialization) via the `as_ssz_bytes(&self)` method.
///
/// See `examples/` for manual implementations or the crate root for implementations using
/// `#[derive(Encode)]`.
pub trait Encode {
    /// Returns `true` if this object has a fixed length.
    ///
    /// I.e., there are no variable-length items in this object or any of its contained objects.
    fn is_ssz_fixed_len() -> bool;

    /// Append the encoding of `self` to `buf`.
    ///
    /// Note: variable-length objects need only append their "variable length" portion; they do
    /// not need to provide their offset.
    fn ssz_append(&self, buf: &mut Vec<u8>);

    /// The number of bytes this object occupies in the fixed-length portion of the SSZ bytes.
    ///
    /// By default, this is set to `BYTES_PER_LENGTH_OFFSET` which is suitable for variable-length
    /// objects, but not fixed-length objects. Fixed-length objects _must_ return a value which
    /// represents their length.
    fn ssz_fixed_len() -> usize {
        BYTES_PER_LENGTH_OFFSET
    }

    /// Returns the size (in bytes) when `self` is serialized.
    ///
    /// Returns the same value as `self.as_ssz_bytes().len()` but this method is significantly more
    /// efficient.
    fn ssz_bytes_len(&self) -> usize;

    /// Returns the full-form encoding of this object.
    ///
    /// The default implementation of this method should suffice for most cases.
    fn as_ssz_bytes(&self) -> Vec<u8> {
        let mut buf = vec![];

        self.ssz_append(&mut buf);

        buf
    }
}

/// Allow for encoding an ordered series of distinct or indistinct objects as SSZ bytes.
///
/// **You must call `finalize(..)` after the final `append(..)` call** to ensure the bytes are
/// written to `buf`.
///
/// ## Example
///
/// Use `SszEncoder` to produce identical output to `foo.as_ssz_bytes()`:
///
/// ```rust
/// use ssz_derive::{Encode, Decode};
/// use ssz::{Decode, Encode, SszEncoder};
///
/// #[derive(PartialEq, Debug, Encode, Decode)]
/// struct Foo {
///     a: u64,
///     b: Vec<u16>,
/// }
///
/// fn ssz_encode_example() {
///     let foo = Foo {
///         a: 42,
///         b: vec![1, 3, 3, 7]
///     };
///
///     let mut buf: Vec<u8> = vec![];
///     let offset = <u64 as Encode>::ssz_fixed_len() + <Vec<u16> as Encode>::ssz_fixed_len();
///
///     let mut encoder = SszEncoder::container(&mut buf, offset);
///
///     encoder.append(&foo.a);
///     encoder.append(&foo.b);
///
///     encoder.finalize();
///
///     assert_eq!(foo.as_ssz_bytes(), buf);
/// }
///
/// ```
pub struct SszEncoder<'a> {
    offset: usize,
    buf: &'a mut Vec<u8>,
    variable_bytes: Vec<u8>,
}

impl<'a> SszEncoder<'a> {
    /// Instantiate a new encoder for encoding an SSZ container.
    pub fn container(buf: &'a mut Vec<u8>, num_fixed_bytes: usize) -> Self {
        buf.reserve(num_fixed_bytes);

        Self {
            offset: num_fixed_bytes,
            buf,
            variable_bytes: vec![],
        }
    }

    /// Append some `item` to the SSZ bytes.
    pub fn append<T: Encode>(&mut self, item: &T) {
        self.append_parameterized(T::is_ssz_fixed_len(), |buf| item.ssz_append(buf))
    }

    /// Uses `ssz_append` to append the encoding of some item to the SSZ bytes.
    pub fn append_parameterized<F>(&mut self, is_ssz_fixed_len: bool, ssz_append: F)
    where
        F: Fn(&mut Vec<u8>),
    {
        if is_ssz_fixed_len {
            ssz_append(&mut self.buf);
        } else {
            self.buf
                .extend_from_slice(&encode_length(self.offset + self.variable_bytes.len()));

            ssz_append(&mut self.variable_bytes);
        }
    }

    /// Write the variable bytes to `self.buf`.
    ///
    /// This method must be called after the final `append(..)` call when serializing
    /// variable-length items.
    pub fn finalize(&mut self) -> &mut Vec<u8> {
        self.buf.append(&mut self.variable_bytes);

        &mut self.buf
    }
}

/// Encode `len` as a little-endian byte array of `BYTES_PER_LENGTH_OFFSET` length.
///
/// If `len` is larger than `MAX_LENGTH_VALUE`, a `debug_assert` is raised.
pub fn encode_length(len: usize) -> [u8; BYTES_PER_LENGTH_OFFSET] {
    // Note: it is possible for `len` to be larger than what can be encoded in
    // `BYTES_PER_LENGTH_OFFSET` bytes, triggering this debug assertion.
    //
    // These are the alternatives to using a `debug_assert` here:
    //
    // 1. Use `assert`.
    // 2. Push an error to the caller (e.g., `Option` or `Result`).
    // 3. Ignore it completely.
    //
    // I have avoided (1) because it's basically a choice between "produce invalid SSZ" or "kill
    // the entire program". I figure it may be possible for an attacker to trigger this assert and
    // take the program down -- I think producing invalid SSZ is a better option than this.
    //
    // I have avoided (2) because this error will need to be propagated upstream, making encoding a
    // function which may fail. I don't think this is ergonomic and the upsides don't outweigh the
    // downsides.
    //
    // I figure a `debug_assert` is better than (3) as it will give us a chance to detect the
    // error during testing.
    //
    // If you have a different opinion, feel free to start an issue and tag @paulhauner.
    debug_assert!(len <= MAX_LENGTH_VALUE);

    let mut bytes = [0; BYTES_PER_LENGTH_OFFSET];
    bytes.copy_from_slice(&len.to_le_bytes()[0..BYTES_PER_LENGTH_OFFSET]);
    bytes
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_encode_length() {
        assert_eq!(encode_length(0), [0; 4]);

        assert_eq!(encode_length(1), [1, 0, 0, 0]);

        assert_eq!(
            encode_length(MAX_LENGTH_VALUE),
            [255; BYTES_PER_LENGTH_OFFSET]
        );
    }

    #[test]
    #[should_panic]
    #[cfg(debug_assertions)]
    fn test_encode_length_above_max_debug_panics() {
        encode_length(MAX_LENGTH_VALUE + 1);
    }

    #[test]
    #[cfg(not(debug_assertions))]
    fn test_encode_length_above_max_not_debug_does_not_panic() {
        assert_eq!(&encode_length(MAX_LENGTH_VALUE + 1)[..], &[0; 4]);
    }
}
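As a worked illustration of the fixed/variable layout that `SszEncoder` produces, the following std-only sketch hand-encodes the `Foo { a: u64, b: Vec<u16> }` container from the doc example. It is not the crate's implementation: `encode_foo` is a hypothetical function, and `BYTES_PER_LENGTH_OFFSET` is redefined locally as `4`, which is assumed from the `[0; 4]` expectations in the tests.

```rust
// Assumed value; redefined here so the sketch is self-contained.
const BYTES_PER_LENGTH_OFFSET: usize = 4;

/// Hypothetical, hand-rolled encoding of `Foo { a: u64, b: Vec<u16> }`.
fn encode_foo(a: u64, b: &[u16]) -> Vec<u8> {
    // Fixed portion: 8 bytes for `a` plus one offset slot for `b`.
    let num_fixed_bytes = 8 + BYTES_PER_LENGTH_OFFSET;
    let mut buf = Vec::with_capacity(num_fixed_bytes);
    let mut variable_bytes: Vec<u8> = Vec::new();

    // Fixed-length field: written straight into the fixed portion.
    buf.extend_from_slice(&a.to_le_bytes());

    // Variable-length field: the fixed portion only stores the offset at
    // which the field's bytes will begin; the bytes themselves are staged.
    let offset = num_fixed_bytes + variable_bytes.len();
    buf.extend_from_slice(&(offset as u32).to_le_bytes());
    for item in b {
        variable_bytes.extend_from_slice(&item.to_le_bytes());
    }

    // Equivalent of `finalize`: move the staged bytes after the fixed portion.
    buf.append(&mut variable_bytes);
    buf
}
```

For `encode_foo(42, &[1, 3, 3, 7])` this yields 20 bytes: 8 bytes for `a`, the 4-byte offset `12`, then four little-endian `u16` values. Forgetting the final `append` step would leave the offset pointing past the end of `buf`, which is why `finalize(..)` must always be called.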