lighthouse/beacon_node
Michael Sproul 2f456ff9eb Fix regression in DB write atomicity (#3931)
## Issue Addressed

Fix a bug introduced by #3696. The bug is not expected to occur frequently, so releasing this PR is non-urgent.

## Proposed Changes

* Add a variant to `StoreOp` that allows a raw KV operation to be passed around.
* Return to using `self.store.do_atomically` rather than `self.store.hot_db.do_atomically`. This folds the write back into a single call and makes our auto-revert work again.
* Prevent `import_block_update_shuffling_cache` from failing block import. This is an outstanding bug from before v3.4.0 which may have contributed to some random unexplained database corruption.
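
A minimal sketch of the first two changes, assuming a much-simplified store. The types below are hypothetical stand-ins, not Lighthouse's actual definitions: the in-memory `HashMap`, the `EmptyBlock` error and the field layouts are invented for illustration. It shows a pass-through variant carrying a raw KV operation, and a single `do_atomically` call that converts every operation before it touches the database.

```rust
use std::collections::HashMap;

/// Low-level key-value operation (simplified stand-in for `KeyValueStoreOp`).
#[derive(Debug, Clone, PartialEq)]
pub enum KeyValueStoreOp {
    PutKeyValue(Vec<u8>, Vec<u8>),
    DeleteKey(Vec<u8>),
}

/// High-level store operation (simplified stand-in for `StoreOp`).
#[derive(Debug, Clone)]
pub enum StoreOp {
    PutBlock { root: [u8; 32], ssz_bytes: Vec<u8> },
    DeleteBlock { root: [u8; 32] },
    /// Pass-through variant: carries an already-built raw KV operation.
    KeyValueOp(KeyValueStoreOp),
}

/// Hypothetical conversion error, used only to demonstrate a failed conversion.
#[derive(Debug, PartialEq)]
pub enum StoreError {
    EmptyBlock,
}

/// In-memory stand-in for the on-disk database.
#[derive(Default)]
pub struct Store {
    pub db: HashMap<Vec<u8>, Vec<u8>>,
}

impl Store {
    /// Convert one high-level op into a raw KV op. This step may fail.
    fn convert(op: StoreOp) -> Result<KeyValueStoreOp, StoreError> {
        match op {
            StoreOp::PutBlock { root, ssz_bytes } => {
                if ssz_bytes.is_empty() {
                    return Err(StoreError::EmptyBlock);
                }
                Ok(KeyValueStoreOp::PutKeyValue(root.to_vec(), ssz_bytes))
            }
            StoreOp::DeleteBlock { root } => Ok(KeyValueStoreOp::DeleteKey(root.to_vec())),
            // Raw operations need no conversion.
            StoreOp::KeyValueOp(kv) => Ok(kv),
        }
    }

    /// Convert and write in one call. If any conversion fails, the function
    /// returns before the database is modified.
    pub fn do_atomically(&mut self, ops: Vec<StoreOp>) -> Result<(), StoreError> {
        let kv_ops = ops
            .into_iter()
            .map(Self::convert)
            .collect::<Result<Vec<_>, _>>()?;
        for op in kv_ops {
            match op {
                KeyValueStoreOp::PutKeyValue(key, value) => {
                    self.db.insert(key, value);
                }
                KeyValueStoreOp::DeleteKey(key) => {
                    self.db.remove(&key);
                }
            }
        }
        Ok(())
    }
}
```

Because the caller sees a single `Result` covering both conversion and write, it has one place to trigger a revert. A real backend would also apply the converted batch atomically, which the loop over a `HashMap` does not model.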

## Additional Info

In #3696 I split the database write into two calls: one to convert the `StoreOp`s to `KeyValueStoreOp`s and one to write them. This had the unfortunate side effect of damaging our atomicity guarantees in case of a write error. If the first call failed, we would be left with the block in fork choice but not on disk (or in the snapshot cache), which would prevent us from processing any descendant blocks. On `unstable` the first call is very unlikely to fail unless the disk is full, but on `tree-states` the conversion is more involved, and a user reported database corruption after it failed in a way that should have been recoverable.
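
The failure mode can be illustrated with a toy model. This is a sketch under stated assumptions, not the actual import path: `Chain`, `WriteError`, the `u64` block roots and the closure-based `write` parameter are all hypothetical. Fork choice is updated first, and any error from the combined convert-and-write step restores the earlier fork choice snapshot, so the block never ends up in fork choice without also being on disk.

```rust
use std::collections::HashSet;

/// Hypothetical error covering any failure of the convert-and-write step.
#[derive(Debug, PartialEq)]
pub struct WriteError;

/// Toy chain state: block roots known to fork choice and block roots on disk.
#[derive(Default)]
pub struct Chain {
    pub fork_choice: HashSet<u64>,
    pub disk: HashSet<u64>,
}

impl Chain {
    /// Import a block, reverting fork choice if the database write fails.
    /// `write` stands in for the single atomic store call.
    pub fn import_block<F>(&mut self, block_root: u64, write: F) -> Result<(), WriteError>
    where
        F: FnOnce(&mut HashSet<u64>, u64) -> Result<(), WriteError>,
    {
        // Snapshot fork choice so that it can be restored on failure.
        let snapshot = self.fork_choice.clone();
        self.fork_choice.insert(block_root);

        if let Err(e) = write(&mut self.disk, block_root) {
            // Revert: the block must not stay in fork choice without being on disk.
            self.fork_choice = snapshot;
            return Err(e);
        }
        Ok(())
    }
}
```

If the conversion step ran as a separate call outside the guarded `write`, its error would skip the revert branch, which is the inconsistent state described above.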

Additionally, as @emhane observed, #3696 also inadvertently removed the import of the new block into the block cache. Although this seems like it could have negatively impacted performance, there are several mitigating factors:

- For regular block processing we should almost always load the parent block (and state) from the snapshot cache.
- We often load blinded blocks, which bypass the block cache anyway.
- Metrics show no noticeable increase in the block cache miss rate with v3.4.0.

However, I expect the block cache _will_ be useful again in `tree-states`, so it is restored to use by this PR.
2023-02-13 03:32:01 +00:00

| Name | Last commit | Date |
| --- | --- | --- |
| beacon_chain | Fix regression in DB write atomicity (#3931) | 2023-02-13 03:32:01 +00:00 |
| builder_client | Verify execution block hashes during finalized sync (#3794) | 2023-01-09 03:11:59 +00:00 |
| client | Improve validator monitor experience for high validator counts (#3728) | 2023-01-09 08:18:55 +00:00 |
| eth1 | Clippy lints for rust 1.66 (#3810) | 2022-12-16 04:04:00 +00:00 |
| execution_layer | Reduce some EE and builder related ERRO logs to WARN (#3966) | 2023-02-12 23:14:08 +00:00 |
| genesis | Super small improvement: Remove unnecessary mut (#3736) | 2022-11-21 03:15:54 +00:00 |
| http_api | Reduce some EE and builder related ERRO logs to WARN (#3966) | 2023-02-12 23:14:08 +00:00 |
| http_metrics | Support IPv6 in BN and VC HTTP APIs (#3104) | 2022-03-24 00:04:49 +00:00 |
| lighthouse_network | Self rate limiting dev flag (#3928) | 2023-02-08 02:18:53 +00:00 |
| network | Reduce some EE and builder related ERRO logs to WARN (#3966) | 2023-02-12 23:14:08 +00:00 |
| operation_pool | Implement block_rewards API (per-validator reward) (#3907) | 2023-02-07 08:33:23 +00:00 |
| src | Self rate limiting dev flag (#3928) | 2023-02-08 02:18:53 +00:00 |
| store | Fix regression in DB write atomicity (#3931) | 2023-02-13 03:32:01 +00:00 |
| tests | Altair consensus changes and refactors (#2279) | 2021-07-09 06:15:32 +00:00 |
| timer | Use async code when interacting with EL (#3244) | 2022-07-03 05:36:50 +00:00 |
| Cargo.toml | Release v3.4.0 (#3862) | 2023-01-11 03:27:08 +00:00 |