Database Schema Migrations
====

This document is an attempt to record some best practices and design conventions for applying
database schema migrations within Lighthouse.

## General Structure

If you make a breaking change to an on-disk data structure, you need to increment the
`SCHEMA_VERSION` in `beacon_node/store/src/metadata.rs` and add a migration from the previous
version to the new version.

The entry point for database migrations is in `schema_change.rs`, _not_ `migrate.rs` (which deals
with finalization). Supporting code for a specific migration may be added in
`schema_change/migration_schema_vX.rs`, where `X` is the version being migrated _to_.

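
The flow described above (bump the version, then add a migration from the previous version to the
new one) can be sketched roughly as follows. This is a minimal illustration, not Lighthouse's actual
code: `SchemaVersion`, `CURRENT_SCHEMA_VERSION`, `migrate_schema`, the `upgrade_to_*` helpers and the
`log` parameter are all hypothetical names chosen for the example.

```rust
/// Hypothetical stand-in for the schema version type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct SchemaVersion(pub u64);

/// Bumped whenever an on-disk structure changes in a breaking way (assumed value).
pub const CURRENT_SCHEMA_VERSION: SchemaVersion = SchemaVersion(9);

/// Hypothetical single-step upgrades. By the convention above, supporting code for
/// the step *to* version X would live in `schema_change/migration_schema_vX.rs`.
fn upgrade_to_v8(log: &mut Vec<String>) -> Result<(), String> {
    log.push("v7 -> v8".to_string());
    Ok(())
}

fn upgrade_to_v9(log: &mut Vec<String>) -> Result<(), String> {
    log.push("v8 -> v9".to_string());
    Ok(())
}

/// Entry point: apply single-step migrations until `to` is reached.
pub fn migrate_schema(
    from: SchemaVersion,
    to: SchemaVersion,
    log: &mut Vec<String>,
) -> Result<(), String> {
    match (from, to) {
        // Already at the target version: nothing to do.
        (f, t) if f == t => Ok(()),
        // Multi-step upgrade: first reach `to - 1`, then take the final step.
        (f, t) if f.0 + 1 < t.0 => {
            let mid = SchemaVersion(t.0 - 1);
            migrate_schema(f, mid, log)?;
            migrate_schema(mid, t, log)
        }
        // Single-step upgrades, one arm per schema version.
        (SchemaVersion(7), SchemaVersion(8)) => upgrade_to_v8(log),
        (SchemaVersion(8), SchemaVersion(9)) => upgrade_to_v9(log),
        // Anything else (including downgrades in this sketch) is rejected.
        (f, t) => Err(format!("unsupported migration: {:?} -> {:?}", f, t)),
    }
}
```

In this sketch, adding a new schema version means bumping the constant and adding one more
single-step arm. Older steps are left untouched.
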
## Combining Schema Changes

Schema changes may be combined if they are part of the same pull request to
`unstable`. Once a schema version is defined in `unstable` we should not apply changes to it
without incrementing the version. This prevents conflicts between versions that appear to be the
same. It also allows us to deploy `unstable` to nodes without worrying about having to resync
because of a sneaky schema change.

Changing the on-disk structure for a version _before_ it is merged to `unstable` is OK. You will
just have to manually resync any test nodes (use checkpoint sync).

## Naming Conventions

Prefer to name versions of structs by _the version at which the change was introduced_. For example,
if you add a field to `Foo` in v9, call the previous version `FooV1` (assuming this is `Foo`'s first
migration) and write a schema change that migrates from `FooV1` to `FooV9`.

Prefer to use explicit version names in `schema_change.rs` and the `schema_change` module. To
interface with code outside these modules, either:

1. Define a type alias to the latest version, e.g. `pub type Foo = FooV9`, or
2. Define a mapping from the latest version to the version used elsewhere, e.g.
   ```rust
   impl From<FooV9> for Foo {}
   ```

Avoid names like:

* `LegacyFoo`
* `OldFoo`
* `FooWithoutX`

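
As a rough illustration of the two interfacing options in this section, the sketch below shows a
type alias for `Foo` and a `From` mapping for a second type. The field names, the
`migrate_foo_v1_to_v9` function, and the `Bar`/`BarV9` pair are assumptions made up for this
example. Only the `FooV1`/`FooV9` naming and the alias come from the text.

```rust
/// Layout of `Foo` from its first version until v9 (fields are hypothetical).
#[derive(Debug, Clone, PartialEq)]
pub struct FooV1 {
    pub slot: u64,
}

/// Layout introduced in v9, which adds the hypothetical `root` field.
#[derive(Debug, Clone, PartialEq)]
pub struct FooV9 {
    pub slot: u64,
    pub root: [u8; 32],
}

/// Option 1: code outside the migration modules uses an alias to the latest version.
pub type Foo = FooV9;

/// The migration names both versions explicitly instead of going through `Foo`.
pub fn migrate_foo_v1_to_v9(old: FooV1) -> FooV9 {
    FooV9 {
        slot: old.slot,
        // The new field gets a default value during migration (assumed policy).
        root: [0; 32],
    }
}

/// Option 2: a distinct type is used elsewhere (hypothetical `Bar`), and the latest
/// on-disk version is mapped into it.
#[derive(Debug, Clone, PartialEq)]
pub struct BarV9 {
    pub value: u64,
}

#[derive(Debug, Clone, PartialEq)]
pub struct Bar {
    pub value: u64,
    /// In-memory only; never written to disk in this sketch.
    pub dirty: bool,
}

impl From<BarV9> for Bar {
    fn from(disk: BarV9) -> Self {
        Bar {
            value: disk.value,
            dirty: false,
        }
    }
}
```
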
## First-version vs Last-version

Previously the schema migration code would name types by the _last_ version at which they were
valid. For example, if `Foo` changed in v9 then we would name the two variants `FooV8` and `FooV9`.
The problem with this scheme is that if `Foo` changes again in the future, at say v12, then `FooV9`
would need to be renamed to `FooV11`, which is annoying. Using the _first_ valid version as described
above does not have this issue.

## Using SuperStruct

If possible, consider using [`superstruct`](https://crates.io/crates/superstruct) to handle data
structure changes between versions.

* Use `superstruct(no_enum)` to avoid generating an unnecessary top-level enum.

## Example

A field is added to `Foo` in v9, and there are two variants: `FooV1` and `FooV9`. There is a
migration from `FooV1` to `FooV9`. `Foo` is aliased to `FooV9`.

Some time later another field is added to `Foo` in v12. A new `FooV12` is created, along with a
migration from `FooV9` to `FooV12`. The primary `Foo` type gets re-aliased to `FooV12`. The previous
migration from v1 to v9 shouldn't break because the schema migration refers to `FooV9` explicitly
rather than `Foo`. Due to the re-aliasing (or re-mapping) the compiler will check every usage
of `Foo` to make sure that it still makes sense with `FooV12`.
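
The scenario above might look something like the following once v12 has landed. This is a
hand-written sketch with made-up fields (`a`, `b`, `c`) and hypothetical function names. It does
not use `superstruct` and is not taken from the Lighthouse codebase.

```rust
#[derive(Debug, Clone, PartialEq)]
pub struct FooV1 {
    pub a: u64,
}

/// Adds `b` in v9.
#[derive(Debug, Clone, PartialEq)]
pub struct FooV9 {
    pub a: u64,
    pub b: u64,
}

/// Adds `c` in v12.
#[derive(Debug, Clone, PartialEq)]
pub struct FooV12 {
    pub a: u64,
    pub b: u64,
    pub c: u64,
}

/// Previously `pub type Foo = FooV9;`, re-pointed when v12 was introduced.
pub type Foo = FooV12;

/// Written for v9. It names `FooV9` explicitly, so re-aliasing `Foo` does not affect it.
pub fn migrate_foo_v1_to_v9(old: FooV1) -> FooV9 {
    FooV9 { a: old.a, b: 0 }
}

/// Written for v12.
pub fn migrate_foo_v9_to_v12(old: FooV9) -> FooV12 {
    FooV12 {
        a: old.a,
        b: old.b,
        c: 0,
    }
}

/// Code outside the migrations uses `Foo`, so after the re-alias the compiler
/// type-checks this function against `FooV12`.
pub fn total(foo: &Foo) -> u64 {
    foo.a + foo.b + foo.c
}
```

Had `migrate_foo_v1_to_v9` returned `Foo` instead of `FooV9`, re-pointing the alias would have
turned it into a compile error, since its body constructs a `FooV9`.
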