lighthouse/testing/execution_engine_integration/src/test_rig.rs

622 lines
19 KiB
Rust
Raw Normal View History

use crate::execution_engine::{
ExecutionEngine, GenericExecutionEngine, ACCOUNT1, ACCOUNT2, KEYSTORE_PASSWORD, PRIVATE_KEYS,
};
use crate::transactions::transactions;
use ethers_providers::Middleware;
use execution_layer::{
BuilderParams, ChainHealth, ExecutionLayer, PayloadAttributes, PayloadStatus,
};
use fork_choice::ForkchoiceUpdateParameters;
use reqwest::{header::CONTENT_TYPE, Client};
use sensitive_url::SensitiveUrl;
use serde_json::{json, Value};
use std::sync::Arc;
use std::time::{Duration, Instant, SystemTime, UNIX_EPOCH};
use task_executor::TaskExecutor;
use tokio::time::sleep;
use types::{
Separate execution payloads in the DB (#3157) ## Proposed Changes Reduce post-merge disk usage by not storing finalized execution payloads in Lighthouse's database. :warning: **This is achieved in a backwards-incompatible way for networks that have already merged** :warning:. Kiln users and shadow fork enjoyers will be unable to downgrade after running the code from this PR. The upgrade migration may take several minutes to run, and can't be aborted after it begins. The main changes are: - New column in the database called `ExecPayload`, keyed by beacon block root. - The `BeaconBlock` column now stores blinded blocks only. - Lots of places that previously used full blocks now use blinded blocks, e.g. analytics APIs, block replay in the DB, etc. - On finalization: - `prune_abanonded_forks` deletes non-canonical payloads whilst deleting non-canonical blocks. - `migrate_db` deletes finalized canonical payloads whilst deleting finalized states. - Conversions between blinded and full blocks are implemented in a compositional way, duplicating some work from Sean's PR #3134. - The execution layer has a new `get_payload_by_block_hash` method that reconstructs a payload using the EE's `eth_getBlockByHash` call. - I've tested manually that it works on Kiln, using Geth and Nethermind. - This isn't necessarily the most efficient method, and new engine APIs are being discussed to improve this: https://github.com/ethereum/execution-apis/pull/146. - We're depending on the `ethers` master branch, due to lots of recent changes. We're also using a workaround for https://github.com/gakonst/ethers-rs/issues/1134. - Payload reconstruction is used in the HTTP API via `BeaconChain::get_block`, which is now `async`. Due to the `async` fn, the `blocking_json` wrapper has been removed. - Payload reconstruction is used in network RPC to serve blocks-by-{root,range} responses. Here the `async` adjustment is messier, although I think I've managed to come up with a reasonable compromise: the handlers take the `SendOnDrop` by value so that they can drop it on _task completion_ (after the `fn` returns). Still, this is introducing disk reads onto core executor threads, which may have a negative performance impact (thoughts appreciated). ## Additional Info - [x] For performance it would be great to remove the cloning of full blocks when converting them to blinded blocks to write to disk. I'm going to experiment with a `put_block` API that takes the block by value, breaks it into a blinded block and a payload, stores the blinded block, and then re-assembles the full block for the caller. - [x] We should measure the latency of blocks-by-root and blocks-by-range responses. - [x] We should add integration tests that stress the payload reconstruction (basic tests done, issue for more extensive tests: https://github.com/sigp/lighthouse/issues/3159) - [x] We should (manually) test the schema v9 migration from several prior versions, particularly as blocks have changed on disk and some migrations rely on being able to load blocks. Co-authored-by: Paul Hauner <paul@paulhauner.com>
2022-05-12 00:42:17 +00:00
Address, ChainSpec, EthSpec, ExecutionBlockHash, ExecutionPayload, FullPayload, Hash256,
MainnetEthSpec, PublicKeyBytes, Slot, Uint256,
};
const EXECUTION_ENGINE_START_TIMEOUT: Duration = Duration::from_secs(20);
struct ExecutionPair<E, T: EthSpec> {
/// The Lighthouse `ExecutionLayer` struct, connected to the `execution_engine` via HTTP.
execution_layer: ExecutionLayer<T>,
/// A handle to external EE process, once this is dropped the process will be killed.
#[allow(dead_code)]
execution_engine: ExecutionEngine<E>,
}
/// A rig that holds two EE processes for testing.
///
/// There are two EEs held here so that we can test out-of-order application of payloads, and other
/// edge-cases.
pub struct TestRig<E, T: EthSpec = MainnetEthSpec> {
#[allow(dead_code)]
runtime: Arc<tokio::runtime::Runtime>,
ee_a: ExecutionPair<E, T>,
ee_b: ExecutionPair<E, T>,
spec: ChainSpec,
_runtime_shutdown: exit_future::Signal,
}
/// Import a private key into the execution engine and unlock it so that we can
/// make transactions with the corresponding account.
async fn import_and_unlock(http_url: SensitiveUrl, priv_keys: &[&str], password: &str) {
for priv_key in priv_keys {
let body = json!(
{
"jsonrpc":"2.0",
"method":"personal_importRawKey",
"params":[priv_key, password],
"id":1
}
);
let client = Client::builder().build().unwrap();
let request = client
.post(http_url.full.clone())
.header(CONTENT_TYPE, "application/json")
.json(&body);
let response: Value = request
.send()
.await
.unwrap()
.error_for_status()
.unwrap()
.json()
.await
.unwrap();
let account = response.get("result").unwrap().as_str().unwrap();
let body = json!(
{
"jsonrpc":"2.0",
"method":"personal_unlockAccount",
"params":[account, password],
"id":1
}
);
let request = client
.post(http_url.full.clone())
.header(CONTENT_TYPE, "application/json")
.json(&body);
let _response: Value = request
.send()
.await
.unwrap()
.error_for_status()
.unwrap()
.json()
.await
.unwrap();
}
}
impl<E: GenericExecutionEngine> TestRig<E> {
pub fn new(generic_engine: E) -> Self {
let log = environment::null_logger().unwrap();
let runtime = Arc::new(
tokio::runtime::Builder::new_multi_thread()
.enable_all()
.build()
.unwrap(),
);
let (runtime_shutdown, exit) = exit_future::signal();
let (shutdown_tx, _) = futures::channel::mpsc::channel(1);
let executor = TaskExecutor::new(Arc::downgrade(&runtime), exit, log.clone(), shutdown_tx);
let fee_recipient = None;
let ee_a = {
let execution_engine = ExecutionEngine::new(generic_engine.clone());
let urls = vec![execution_engine.http_auth_url()];
let config = execution_layer::Config {
execution_endpoints: urls,
secret_files: vec![],
suggested_fee_recipient: Some(Address::repeat_byte(42)),
default_datadir: execution_engine.datadir(),
..Default::default()
};
let execution_layer =
ExecutionLayer::from_config(config, executor.clone(), log.clone()).unwrap();
ExecutionPair {
execution_engine,
execution_layer,
}
};
let ee_b = {
let execution_engine = ExecutionEngine::new(generic_engine);
let urls = vec![execution_engine.http_auth_url()];
let config = execution_layer::Config {
execution_endpoints: urls,
secret_files: vec![],
suggested_fee_recipient: fee_recipient,
default_datadir: execution_engine.datadir(),
..Default::default()
};
let execution_layer =
ExecutionLayer::from_config(config, executor, log.clone()).unwrap();
ExecutionPair {
execution_engine,
execution_layer,
}
};
let mut spec = MainnetEthSpec::default_spec();
spec.terminal_total_difficulty = Uint256::zero();
Self {
runtime,
ee_a,
ee_b,
spec,
_runtime_shutdown: runtime_shutdown,
}
}
pub fn perform_tests_blocking(&self) {
Use async code when interacting with EL (#3244) ## Overview This rather extensive PR achieves two primary goals: 1. Uses the finalized/justified checkpoints of fork choice (FC), rather than that of the head state. 2. Refactors fork choice, block production and block processing to `async` functions. Additionally, it achieves: - Concurrent forkchoice updates to the EL and cache pruning after a new head is selected. - Concurrent "block packing" (attestations, etc) and execution payload retrieval during block production. - Concurrent per-block-processing and execution payload verification during block processing. - The `Arc`-ification of `SignedBeaconBlock` during block processing (it's never mutated, so why not?): - I had to do this to deal with sending blocks into spawned tasks. - Previously we were cloning the beacon block at least 2 times during each block processing, these clones are either removed or turned into cheaper `Arc` clones. - We were also `Box`-ing and un-`Box`-ing beacon blocks as they moved throughout the networking crate. This is not a big deal, but it's nice to avoid shifting things between the stack and heap. - Avoids cloning *all the blocks* in *every chain segment* during sync. - It also has the potential to clean up our code where we need to pass an *owned* block around so we can send it back in the case of an error (I didn't do much of this, my PR is already big enough :sweat_smile:) - The `BeaconChain::HeadSafetyStatus` struct was removed. It was an old relic from prior merge specs. For motivation for this change, see https://github.com/sigp/lighthouse/pull/3244#issuecomment-1160963273 ## Changes to `canonical_head` and `fork_choice` Previously, the `BeaconChain` had two separate fields: ``` canonical_head: RwLock<Snapshot>, fork_choice: RwLock<BeaconForkChoice> ``` Now, we have grouped these values under a single struct: ``` canonical_head: CanonicalHead { cached_head: RwLock<Arc<Snapshot>>, fork_choice: RwLock<BeaconForkChoice> } ``` Apart from ergonomics, the only *actual* change here is wrapping the canonical head snapshot in an `Arc`. This means that we no longer need to hold the `cached_head` (`canonical_head`, in old terms) lock when we want to pull some values from it. This was done to avoid deadlock risks by preventing functions from acquiring (and holding) the `cached_head` and `fork_choice` locks simultaneously. ## Breaking Changes ### The `state` (root) field in the `finalized_checkpoint` SSE event Consider the scenario where epoch `n` is just finalized, but `start_slot(n)` is skipped. There are two state roots we might in the `finalized_checkpoint` SSE event: 1. The state root of the finalized block, which is `get_block(finalized_checkpoint.root).state_root`. 4. The state root at slot of `start_slot(n)`, which would be the state from (1), but "skipped forward" through any skip slots. Previously, Lighthouse would choose (2). However, we can see that when [Teku generates that event](https://github.com/ConsenSys/teku/blob/de2b2801c89ef5abf983d6bf37867c37fc47121f/data/beaconrestapi/src/main/java/tech/pegasys/teku/beaconrestapi/handlers/v1/events/EventSubscriptionManager.java#L171-L182) it uses [`getStateRootFromBlockRoot`](https://github.com/ConsenSys/teku/blob/de2b2801c89ef5abf983d6bf37867c37fc47121f/data/provider/src/main/java/tech/pegasys/teku/api/ChainDataProvider.java#L336-L341) which uses (1). I have switched Lighthouse from (2) to (1). I think it's a somewhat arbitrary choice between the two, where (1) is easier to compute and is consistent with Teku. ## Notes for Reviewers I've renamed `BeaconChain::fork_choice` to `BeaconChain::recompute_head`. Doing this helped ensure I broke all previous uses of fork choice and I also find it more descriptive. It describes an action and can't be confused with trying to get a reference to the `ForkChoice` struct. I've changed the ordering of SSE events when a block is received. It used to be `[block, finalized, head]` and now it's `[block, head, finalized]`. It was easier this way and I don't think we were making any promises about SSE event ordering so it's not "breaking". I've made it so fork choice will run when it's first constructed. I did this because I wanted to have a cached version of the last call to `get_head`. Ensuring `get_head` has been run *at least once* means that the cached values doesn't need to wrapped in an `Option`. This was fairly simple, it just involved passing a `slot` to the constructor so it knows *when* it's being run. When loading a fork choice from the store and a slot clock isn't handy I've just used the `slot` that was saved in the `fork_choice_store`. That seems like it would be a faithful representation of the slot when we saved it. I added the `genesis_time: u64` to the `BeaconChain`. It's small, constant and nice to have around. Since we're using FC for the fin/just checkpoints, we no longer get the `0x00..00` roots at genesis. You can see I had to remove a work-around in `ef-tests` here: b56be3bc2. I can't find any reason why this would be an issue, if anything I think it'll be better since the genesis-alias has caught us out a few times (0x00..00 isn't actually a real root). Edit: I did find a case where the `network` expected the 0x00..00 alias and patched it here: 3f26ac3e2. You'll notice a lot of changes in tests. Generally, tests should be functionally equivalent. Here are the things creating the most diff-noise in tests: - Changing tests to be `tokio::async` tests. - Adding `.await` to fork choice, block processing and block production functions. - Refactor of the `canonical_head` "API" provided by the `BeaconChain`. E.g., `chain.canonical_head.cached_head()` instead of `chain.canonical_head.read()`. - Wrapping `SignedBeaconBlock` in an `Arc`. - In the `beacon_chain/tests/block_verification`, we can't use the `lazy_static` `CHAIN_SEGMENT` variable anymore since it's generated with an async function. We just generate it in each test, not so efficient but hopefully insignificant. I had to disable `rayon` concurrent tests in the `fork_choice` tests. This is because the use of `rayon` and `block_on` was causing a panic. Co-authored-by: Mac L <mjladson@pm.me>
2022-07-03 05:36:50 +00:00
self.runtime
.handle()
.block_on(async { self.perform_tests().await });
}
pub async fn wait_until_synced(&self) {
let start_instant = Instant::now();
for pair in [&self.ee_a, &self.ee_b] {
loop {
// Run the routine to check for online nodes.
pair.execution_layer.watchdog_task().await;
if pair.execution_layer.is_synced().await {
break;
} else if start_instant + EXECUTION_ENGINE_START_TIMEOUT > Instant::now() {
sleep(Duration::from_millis(500)).await;
} else {
panic!("timeout waiting for execution engines to come online")
}
}
}
}
pub async fn perform_tests(&self) {
self.wait_until_synced().await;
// Import and unlock all private keys to sign transactions
let _ = futures::future::join_all([&self.ee_a, &self.ee_b].iter().map(|ee| {
import_and_unlock(
ee.execution_engine.http_url(),
&PRIVATE_KEYS,
KEYSTORE_PASSWORD,
)
}))
.await;
// We hardcode the accounts here since some EEs start with a default unlocked account
let account1 = ethers_core::types::Address::from_slice(&hex::decode(&ACCOUNT1).unwrap());
let account2 = ethers_core::types::Address::from_slice(&hex::decode(&ACCOUNT2).unwrap());
/*
* Check the transition config endpoint.
*/
for ee in [&self.ee_a, &self.ee_b] {
ee.execution_layer
.exchange_transition_configuration(&self.spec)
.await
.unwrap();
}
/*
* Read the terminal block hash from both pairs, check it's equal.
*/
let terminal_pow_block_hash = self
.ee_a
.execution_layer
.get_terminal_pow_block_hash(&self.spec, timestamp_now())
.await
.unwrap()
.unwrap();
assert_eq!(
terminal_pow_block_hash,
self.ee_b
.execution_layer
.get_terminal_pow_block_hash(&self.spec, timestamp_now())
.await
.unwrap()
.unwrap()
);
// Submit transactions before getting payload
let txs = transactions::<MainnetEthSpec>(account1, account2);
let mut pending_txs = Vec::new();
for tx in txs.clone().into_iter() {
let pending_tx = self
.ee_a
.execution_engine
.provider
.send_transaction(tx, None)
.await
.unwrap();
pending_txs.push(pending_tx);
}
/*
* Execution Engine A:
*
* Produce a valid payload atop the terminal block.
*/
let parent_hash = terminal_pow_block_hash;
let timestamp = timestamp_now();
Rename random to prev_randao (#3040) ## Issue Addressed As discussed on last-night's consensus call, the testnets next week will target the [Kiln Spec v2](https://hackmd.io/@n0ble/kiln-spec). Presently, we support Kiln V1. V2 is backwards compatible, except for renaming `random` to `prev_randao` in: - https://github.com/ethereum/execution-apis/pull/180 - https://github.com/ethereum/consensus-specs/pull/2835 With this PR we'll no longer be compatible with the existing Kintsugi and Kiln testnets, however we'll be ready for the testnets next week. I raised this breaking change in the call last night, we are all keen to move forward and break things. We now target the [`merge-kiln-v2`](https://github.com/MariusVanDerWijden/go-ethereum/tree/merge-kiln-v2) branch for interop with Geth. This required adding the `--http.aauthport` to the tester to avoid a port conflict at startup. ### Changes to exec integration tests There's some change in the `merge-kiln-v2` version of Geth that means it can't compile on a vanilla Github runner. Bumping the `go` version on the runner solved this issue. Whilst addressing this, I refactored the `testing/execution_integration` crate to be a *binary* rather than a *library* with tests. This means that we don't need to run the `build.rs` and build Geth whenever someone runs `make lint` or `make test-release`. This is nice for everyday users, but it's also nice for CI so that we can have a specific runner for these tests and we don't need to ensure *all* runners support everything required to build all execution clients. ## More Info - [x] ~~EF tests are failing since the rename has broken some tests that reference the old field name. I have been told there will be new tests released in the coming days (25/02/22 or 26/02/22).~~
2022-03-03 02:10:57 +00:00
let prev_randao = Hash256::zero();
let head_root = Hash256::zero();
let justified_block_hash = ExecutionBlockHash::zero();
let finalized_block_hash = ExecutionBlockHash::zero();
let forkchoice_update_params = ForkchoiceUpdateParameters {
head_root,
head_hash: Some(parent_hash),
justified_hash: Some(justified_block_hash),
finalized_hash: Some(finalized_block_hash),
};
let proposer_index = 0;
let prepared = self
.ee_a
.execution_layer
.insert_proposer(
Slot::new(1), // Insert proposer for the next slot
head_root,
proposer_index,
PayloadAttributes {
timestamp,
prev_randao,
suggested_fee_recipient: Address::zero(),
},
)
.await;
assert!(!prepared, "Inserting proposer for the first time");
// Make a fcu call with the PayloadAttributes that we inserted previously
let prepare = self
.ee_a
.execution_layer
.notify_forkchoice_updated(
parent_hash,
justified_block_hash,
finalized_block_hash,
Slot::new(0),
Hash256::zero(),
)
.await
.unwrap();
assert_eq!(prepare, PayloadStatus::Valid);
// Add a delay to give the EE sufficient time to pack the
// submitted transactions into a payload.
// This is required when running on under resourced nodes and
// in CI.
sleep(Duration::from_secs(3)).await;
let builder_params = BuilderParams {
pubkey: PublicKeyBytes::empty(),
slot: Slot::new(0),
chain_health: ChainHealth::Healthy,
};
let valid_payload = self
.ee_a
.execution_layer
.get_payload::<FullPayload<MainnetEthSpec>>(
parent_hash,
timestamp,
Rename random to prev_randao (#3040) ## Issue Addressed As discussed on last-night's consensus call, the testnets next week will target the [Kiln Spec v2](https://hackmd.io/@n0ble/kiln-spec). Presently, we support Kiln V1. V2 is backwards compatible, except for renaming `random` to `prev_randao` in: - https://github.com/ethereum/execution-apis/pull/180 - https://github.com/ethereum/consensus-specs/pull/2835 With this PR we'll no longer be compatible with the existing Kintsugi and Kiln testnets, however we'll be ready for the testnets next week. I raised this breaking change in the call last night, we are all keen to move forward and break things. We now target the [`merge-kiln-v2`](https://github.com/MariusVanDerWijden/go-ethereum/tree/merge-kiln-v2) branch for interop with Geth. This required adding the `--http.aauthport` to the tester to avoid a port conflict at startup. ### Changes to exec integration tests There's some change in the `merge-kiln-v2` version of Geth that means it can't compile on a vanilla Github runner. Bumping the `go` version on the runner solved this issue. Whilst addressing this, I refactored the `testing/execution_integration` crate to be a *binary* rather than a *library* with tests. This means that we don't need to run the `build.rs` and build Geth whenever someone runs `make lint` or `make test-release`. This is nice for everyday users, but it's also nice for CI so that we can have a specific runner for these tests and we don't need to ensure *all* runners support everything required to build all execution clients. ## More Info - [x] ~~EF tests are failing since the rename has broken some tests that reference the old field name. I have been told there will be new tests released in the coming days (25/02/22 or 26/02/22).~~
2022-03-03 02:10:57 +00:00
prev_randao,
proposer_index,
forkchoice_update_params,
builder_params,
&self.spec,
)
.await
.unwrap()
.execution_payload;
/*
* Execution Engine A:
*
* Indicate that the payload is the head of the chain, before submitting a
* `notify_new_payload`.
*/
let head_block_hash = valid_payload.block_hash;
let finalized_block_hash = ExecutionBlockHash::zero();
let slot = Slot::new(42);
let head_block_root = Hash256::repeat_byte(42);
let status = self
.ee_a
.execution_layer
.notify_forkchoice_updated(
head_block_hash,
justified_block_hash,
finalized_block_hash,
slot,
head_block_root,
)
.await
.unwrap();
assert_eq!(status, PayloadStatus::Syncing);
/*
* Execution Engine A:
*
* Provide the valid payload back to the EE again.
*/
let status = self
.ee_a
.execution_layer
.notify_new_payload(&valid_payload)
.await
.unwrap();
assert_eq!(status, PayloadStatus::Valid);
Separate execution payloads in the DB (#3157) ## Proposed Changes Reduce post-merge disk usage by not storing finalized execution payloads in Lighthouse's database. :warning: **This is achieved in a backwards-incompatible way for networks that have already merged** :warning:. Kiln users and shadow fork enjoyers will be unable to downgrade after running the code from this PR. The upgrade migration may take several minutes to run, and can't be aborted after it begins. The main changes are: - New column in the database called `ExecPayload`, keyed by beacon block root. - The `BeaconBlock` column now stores blinded blocks only. - Lots of places that previously used full blocks now use blinded blocks, e.g. analytics APIs, block replay in the DB, etc. - On finalization: - `prune_abanonded_forks` deletes non-canonical payloads whilst deleting non-canonical blocks. - `migrate_db` deletes finalized canonical payloads whilst deleting finalized states. - Conversions between blinded and full blocks are implemented in a compositional way, duplicating some work from Sean's PR #3134. - The execution layer has a new `get_payload_by_block_hash` method that reconstructs a payload using the EE's `eth_getBlockByHash` call. - I've tested manually that it works on Kiln, using Geth and Nethermind. - This isn't necessarily the most efficient method, and new engine APIs are being discussed to improve this: https://github.com/ethereum/execution-apis/pull/146. - We're depending on the `ethers` master branch, due to lots of recent changes. We're also using a workaround for https://github.com/gakonst/ethers-rs/issues/1134. - Payload reconstruction is used in the HTTP API via `BeaconChain::get_block`, which is now `async`. Due to the `async` fn, the `blocking_json` wrapper has been removed. - Payload reconstruction is used in network RPC to serve blocks-by-{root,range} responses. Here the `async` adjustment is messier, although I think I've managed to come up with a reasonable compromise: the handlers take the `SendOnDrop` by value so that they can drop it on _task completion_ (after the `fn` returns). Still, this is introducing disk reads onto core executor threads, which may have a negative performance impact (thoughts appreciated). ## Additional Info - [x] For performance it would be great to remove the cloning of full blocks when converting them to blinded blocks to write to disk. I'm going to experiment with a `put_block` API that takes the block by value, breaks it into a blinded block and a payload, stores the blinded block, and then re-assembles the full block for the caller. - [x] We should measure the latency of blocks-by-root and blocks-by-range responses. - [x] We should add integration tests that stress the payload reconstruction (basic tests done, issue for more extensive tests: https://github.com/sigp/lighthouse/issues/3159) - [x] We should (manually) test the schema v9 migration from several prior versions, particularly as blocks have changed on disk and some migrations rely on being able to load blocks. Co-authored-by: Paul Hauner <paul@paulhauner.com>
2022-05-12 00:42:17 +00:00
check_payload_reconstruction(&self.ee_a, &valid_payload).await;
/*
* Execution Engine A:
*
* Indicate that the payload is the head of the chain.
*
* Do not provide payload attributes (we'll test that later).
*/
let head_block_hash = valid_payload.block_hash;
let finalized_block_hash = ExecutionBlockHash::zero();
let slot = Slot::new(42);
let head_block_root = Hash256::repeat_byte(42);
let status = self
.ee_a
.execution_layer
.notify_forkchoice_updated(
head_block_hash,
justified_block_hash,
finalized_block_hash,
slot,
head_block_root,
)
.await
.unwrap();
assert_eq!(status, PayloadStatus::Valid);
assert_eq!(valid_payload.transactions.len(), pending_txs.len());
// Verify that all submitted txs were successful
for pending_tx in pending_txs {
let tx_receipt = pending_tx.await.unwrap().unwrap();
assert_eq!(
tx_receipt.status,
Some(1.into()),
"Tx index {} has invalid status ",
tx_receipt.transaction_index
);
}
/*
* Execution Engine A:
*
* Provide an invalidated payload to the EE.
*/
let mut invalid_payload = valid_payload.clone();
Rename random to prev_randao (#3040) ## Issue Addressed As discussed on last-night's consensus call, the testnets next week will target the [Kiln Spec v2](https://hackmd.io/@n0ble/kiln-spec). Presently, we support Kiln V1. V2 is backwards compatible, except for renaming `random` to `prev_randao` in: - https://github.com/ethereum/execution-apis/pull/180 - https://github.com/ethereum/consensus-specs/pull/2835 With this PR we'll no longer be compatible with the existing Kintsugi and Kiln testnets, however we'll be ready for the testnets next week. I raised this breaking change in the call last night, we are all keen to move forward and break things. We now target the [`merge-kiln-v2`](https://github.com/MariusVanDerWijden/go-ethereum/tree/merge-kiln-v2) branch for interop with Geth. This required adding the `--http.aauthport` to the tester to avoid a port conflict at startup. ### Changes to exec integration tests There's some change in the `merge-kiln-v2` version of Geth that means it can't compile on a vanilla Github runner. Bumping the `go` version on the runner solved this issue. Whilst addressing this, I refactored the `testing/execution_integration` crate to be a *binary* rather than a *library* with tests. This means that we don't need to run the `build.rs` and build Geth whenever someone runs `make lint` or `make test-release`. This is nice for everyday users, but it's also nice for CI so that we can have a specific runner for these tests and we don't need to ensure *all* runners support everything required to build all execution clients. ## More Info - [x] ~~EF tests are failing since the rename has broken some tests that reference the old field name. I have been told there will be new tests released in the coming days (25/02/22 or 26/02/22).~~
2022-03-03 02:10:57 +00:00
invalid_payload.prev_randao = Hash256::from_low_u64_be(42);
let status = self
.ee_a
.execution_layer
.notify_new_payload(&invalid_payload)
.await
.unwrap();
assert!(matches!(status, PayloadStatus::InvalidBlockHash { .. }));
/*
* Execution Engine A:
*
* Produce another payload atop the previous one.
*/
let parent_hash = valid_payload.block_hash;
let timestamp = valid_payload.timestamp + 1;
Rename random to prev_randao (#3040) ## Issue Addressed As discussed on last-night's consensus call, the testnets next week will target the [Kiln Spec v2](https://hackmd.io/@n0ble/kiln-spec). Presently, we support Kiln V1. V2 is backwards compatible, except for renaming `random` to `prev_randao` in: - https://github.com/ethereum/execution-apis/pull/180 - https://github.com/ethereum/consensus-specs/pull/2835 With this PR we'll no longer be compatible with the existing Kintsugi and Kiln testnets, however we'll be ready for the testnets next week. I raised this breaking change in the call last night, we are all keen to move forward and break things. We now target the [`merge-kiln-v2`](https://github.com/MariusVanDerWijden/go-ethereum/tree/merge-kiln-v2) branch for interop with Geth. This required adding the `--http.aauthport` to the tester to avoid a port conflict at startup. ### Changes to exec integration tests There's some change in the `merge-kiln-v2` version of Geth that means it can't compile on a vanilla Github runner. Bumping the `go` version on the runner solved this issue. Whilst addressing this, I refactored the `testing/execution_integration` crate to be a *binary* rather than a *library* with tests. This means that we don't need to run the `build.rs` and build Geth whenever someone runs `make lint` or `make test-release`. This is nice for everyday users, but it's also nice for CI so that we can have a specific runner for these tests and we don't need to ensure *all* runners support everything required to build all execution clients. ## More Info - [x] ~~EF tests are failing since the rename has broken some tests that reference the old field name. I have been told there will be new tests released in the coming days (25/02/22 or 26/02/22).~~
2022-03-03 02:10:57 +00:00
let prev_randao = Hash256::zero();
let proposer_index = 0;
let builder_params = BuilderParams {
pubkey: PublicKeyBytes::empty(),
slot: Slot::new(0),
chain_health: ChainHealth::Healthy,
};
let second_payload = self
.ee_a
.execution_layer
.get_payload::<FullPayload<MainnetEthSpec>>(
parent_hash,
timestamp,
Rename random to prev_randao (#3040) ## Issue Addressed As discussed on last-night's consensus call, the testnets next week will target the [Kiln Spec v2](https://hackmd.io/@n0ble/kiln-spec). Presently, we support Kiln V1. V2 is backwards compatible, except for renaming `random` to `prev_randao` in: - https://github.com/ethereum/execution-apis/pull/180 - https://github.com/ethereum/consensus-specs/pull/2835 With this PR we'll no longer be compatible with the existing Kintsugi and Kiln testnets, however we'll be ready for the testnets next week. I raised this breaking change in the call last night, we are all keen to move forward and break things. We now target the [`merge-kiln-v2`](https://github.com/MariusVanDerWijden/go-ethereum/tree/merge-kiln-v2) branch for interop with Geth. This required adding the `--http.aauthport` to the tester to avoid a port conflict at startup. ### Changes to exec integration tests There's some change in the `merge-kiln-v2` version of Geth that means it can't compile on a vanilla Github runner. Bumping the `go` version on the runner solved this issue. Whilst addressing this, I refactored the `testing/execution_integration` crate to be a *binary* rather than a *library* with tests. This means that we don't need to run the `build.rs` and build Geth whenever someone runs `make lint` or `make test-release`. This is nice for everyday users, but it's also nice for CI so that we can have a specific runner for these tests and we don't need to ensure *all* runners support everything required to build all execution clients. ## More Info - [x] ~~EF tests are failing since the rename has broken some tests that reference the old field name. I have been told there will be new tests released in the coming days (25/02/22 or 26/02/22).~~
2022-03-03 02:10:57 +00:00
prev_randao,
proposer_index,
forkchoice_update_params,
builder_params,
&self.spec,
)
.await
.unwrap()
.execution_payload;
/*
* Execution Engine A:
*
* Provide the second payload back to the EE again.
*/
let status = self
.ee_a
.execution_layer
.notify_new_payload(&second_payload)
.await
.unwrap();
assert_eq!(status, PayloadStatus::Valid);
Separate execution payloads in the DB (#3157) ## Proposed Changes Reduce post-merge disk usage by not storing finalized execution payloads in Lighthouse's database. :warning: **This is achieved in a backwards-incompatible way for networks that have already merged** :warning:. Kiln users and shadow fork enjoyers will be unable to downgrade after running the code from this PR. The upgrade migration may take several minutes to run, and can't be aborted after it begins. The main changes are: - New column in the database called `ExecPayload`, keyed by beacon block root. - The `BeaconBlock` column now stores blinded blocks only. - Lots of places that previously used full blocks now use blinded blocks, e.g. analytics APIs, block replay in the DB, etc. - On finalization: - `prune_abanonded_forks` deletes non-canonical payloads whilst deleting non-canonical blocks. - `migrate_db` deletes finalized canonical payloads whilst deleting finalized states. - Conversions between blinded and full blocks are implemented in a compositional way, duplicating some work from Sean's PR #3134. - The execution layer has a new `get_payload_by_block_hash` method that reconstructs a payload using the EE's `eth_getBlockByHash` call. - I've tested manually that it works on Kiln, using Geth and Nethermind. - This isn't necessarily the most efficient method, and new engine APIs are being discussed to improve this: https://github.com/ethereum/execution-apis/pull/146. - We're depending on the `ethers` master branch, due to lots of recent changes. We're also using a workaround for https://github.com/gakonst/ethers-rs/issues/1134. - Payload reconstruction is used in the HTTP API via `BeaconChain::get_block`, which is now `async`. Due to the `async` fn, the `blocking_json` wrapper has been removed. - Payload reconstruction is used in network RPC to serve blocks-by-{root,range} responses. Here the `async` adjustment is messier, although I think I've managed to come up with a reasonable compromise: the handlers take the `SendOnDrop` by value so that they can drop it on _task completion_ (after the `fn` returns). Still, this is introducing disk reads onto core executor threads, which may have a negative performance impact (thoughts appreciated). ## Additional Info - [x] For performance it would be great to remove the cloning of full blocks when converting them to blinded blocks to write to disk. I'm going to experiment with a `put_block` API that takes the block by value, breaks it into a blinded block and a payload, stores the blinded block, and then re-assembles the full block for the caller. - [x] We should measure the latency of blocks-by-root and blocks-by-range responses. - [x] We should add integration tests that stress the payload reconstruction (basic tests done, issue for more extensive tests: https://github.com/sigp/lighthouse/issues/3159) - [x] We should (manually) test the schema v9 migration from several prior versions, particularly as blocks have changed on disk and some migrations rely on being able to load blocks. Co-authored-by: Paul Hauner <paul@paulhauner.com>
2022-05-12 00:42:17 +00:00
check_payload_reconstruction(&self.ee_a, &second_payload).await;
/*
* Execution Engine A:
*
* Indicate that the payload is the head of the chain, providing payload attributes.
*/
let head_block_hash = valid_payload.block_hash;
let finalized_block_hash = ExecutionBlockHash::zero();
let payload_attributes = PayloadAttributes {
timestamp: second_payload.timestamp + 1,
Rename random to prev_randao (#3040) ## Issue Addressed As discussed on last-night's consensus call, the testnets next week will target the [Kiln Spec v2](https://hackmd.io/@n0ble/kiln-spec). Presently, we support Kiln V1. V2 is backwards compatible, except for renaming `random` to `prev_randao` in: - https://github.com/ethereum/execution-apis/pull/180 - https://github.com/ethereum/consensus-specs/pull/2835 With this PR we'll no longer be compatible with the existing Kintsugi and Kiln testnets, however we'll be ready for the testnets next week. I raised this breaking change in the call last night, we are all keen to move forward and break things. We now target the [`merge-kiln-v2`](https://github.com/MariusVanDerWijden/go-ethereum/tree/merge-kiln-v2) branch for interop with Geth. This required adding the `--http.aauthport` to the tester to avoid a port conflict at startup. ### Changes to exec integration tests There's some change in the `merge-kiln-v2` version of Geth that means it can't compile on a vanilla Github runner. Bumping the `go` version on the runner solved this issue. Whilst addressing this, I refactored the `testing/execution_integration` crate to be a *binary* rather than a *library* with tests. This means that we don't need to run the `build.rs` and build Geth whenever someone runs `make lint` or `make test-release`. This is nice for everyday users, but it's also nice for CI so that we can have a specific runner for these tests and we don't need to ensure *all* runners support everything required to build all execution clients. ## More Info - [x] ~~EF tests are failing since the rename has broken some tests that reference the old field name. I have been told there will be new tests released in the coming days (25/02/22 or 26/02/22).~~
2022-03-03 02:10:57 +00:00
prev_randao: Hash256::zero(),
suggested_fee_recipient: Address::zero(),
};
let slot = Slot::new(42);
let head_block_root = Hash256::repeat_byte(100);
let validator_index = 0;
self.ee_a
.execution_layer
.insert_proposer(slot, head_block_root, validator_index, payload_attributes)
.await;
let status = self
.ee_a
.execution_layer
.notify_forkchoice_updated(
head_block_hash,
justified_block_hash,
finalized_block_hash,
slot,
head_block_root,
)
.await
.unwrap();
assert_eq!(status, PayloadStatus::Valid);
/*
* Execution Engine B:
*
* Provide the second payload, without providing the first.
*/
let status = self
.ee_b
.execution_layer
.notify_new_payload(&second_payload)
.await
.unwrap();
// TODO: we should remove the `Accepted` status here once Geth fixes it
assert!(matches!(
status,
PayloadStatus::Syncing | PayloadStatus::Accepted
));
/*
* Execution Engine B:
*
* Set the second payload as the head, without providing payload attributes.
*/
let head_block_hash = second_payload.block_hash;
let finalized_block_hash = ExecutionBlockHash::zero();
let slot = Slot::new(42);
let head_block_root = Hash256::repeat_byte(42);
let status = self
.ee_b
.execution_layer
.notify_forkchoice_updated(
head_block_hash,
justified_block_hash,
finalized_block_hash,
slot,
head_block_root,
)
.await
.unwrap();
assert_eq!(status, PayloadStatus::Syncing);
/*
* Execution Engine B:
*
* Provide the first payload to the EE.
*/
let status = self
.ee_b
.execution_layer
.notify_new_payload(&valid_payload)
.await
.unwrap();
assert_eq!(status, PayloadStatus::Valid);
Separate execution payloads in the DB (#3157) ## Proposed Changes Reduce post-merge disk usage by not storing finalized execution payloads in Lighthouse's database. :warning: **This is achieved in a backwards-incompatible way for networks that have already merged** :warning:. Kiln users and shadow fork enjoyers will be unable to downgrade after running the code from this PR. The upgrade migration may take several minutes to run, and can't be aborted after it begins. The main changes are: - New column in the database called `ExecPayload`, keyed by beacon block root. - The `BeaconBlock` column now stores blinded blocks only. - Lots of places that previously used full blocks now use blinded blocks, e.g. analytics APIs, block replay in the DB, etc. - On finalization: - `prune_abanonded_forks` deletes non-canonical payloads whilst deleting non-canonical blocks. - `migrate_db` deletes finalized canonical payloads whilst deleting finalized states. - Conversions between blinded and full blocks are implemented in a compositional way, duplicating some work from Sean's PR #3134. - The execution layer has a new `get_payload_by_block_hash` method that reconstructs a payload using the EE's `eth_getBlockByHash` call. - I've tested manually that it works on Kiln, using Geth and Nethermind. - This isn't necessarily the most efficient method, and new engine APIs are being discussed to improve this: https://github.com/ethereum/execution-apis/pull/146. - We're depending on the `ethers` master branch, due to lots of recent changes. We're also using a workaround for https://github.com/gakonst/ethers-rs/issues/1134. - Payload reconstruction is used in the HTTP API via `BeaconChain::get_block`, which is now `async`. Due to the `async` fn, the `blocking_json` wrapper has been removed. - Payload reconstruction is used in network RPC to serve blocks-by-{root,range} responses. Here the `async` adjustment is messier, although I think I've managed to come up with a reasonable compromise: the handlers take the `SendOnDrop` by value so that they can drop it on _task completion_ (after the `fn` returns). Still, this is introducing disk reads onto core executor threads, which may have a negative performance impact (thoughts appreciated). ## Additional Info - [x] For performance it would be great to remove the cloning of full blocks when converting them to blinded blocks to write to disk. I'm going to experiment with a `put_block` API that takes the block by value, breaks it into a blinded block and a payload, stores the blinded block, and then re-assembles the full block for the caller. - [x] We should measure the latency of blocks-by-root and blocks-by-range responses. - [x] We should add integration tests that stress the payload reconstruction (basic tests done, issue for more extensive tests: https://github.com/sigp/lighthouse/issues/3159) - [x] We should (manually) test the schema v9 migration from several prior versions, particularly as blocks have changed on disk and some migrations rely on being able to load blocks. Co-authored-by: Paul Hauner <paul@paulhauner.com>
2022-05-12 00:42:17 +00:00
check_payload_reconstruction(&self.ee_b, &valid_payload).await;
/*
* Execution Engine B:
*
* Provide the second payload, now the first has been provided.
*/
let status = self
.ee_b
.execution_layer
.notify_new_payload(&second_payload)
.await
.unwrap();
assert_eq!(status, PayloadStatus::Valid);
Separate execution payloads in the DB (#3157) ## Proposed Changes Reduce post-merge disk usage by not storing finalized execution payloads in Lighthouse's database. :warning: **This is achieved in a backwards-incompatible way for networks that have already merged** :warning:. Kiln users and shadow fork enjoyers will be unable to downgrade after running the code from this PR. The upgrade migration may take several minutes to run, and can't be aborted after it begins. The main changes are: - New column in the database called `ExecPayload`, keyed by beacon block root. - The `BeaconBlock` column now stores blinded blocks only. - Lots of places that previously used full blocks now use blinded blocks, e.g. analytics APIs, block replay in the DB, etc. - On finalization: - `prune_abanonded_forks` deletes non-canonical payloads whilst deleting non-canonical blocks. - `migrate_db` deletes finalized canonical payloads whilst deleting finalized states. - Conversions between blinded and full blocks are implemented in a compositional way, duplicating some work from Sean's PR #3134. - The execution layer has a new `get_payload_by_block_hash` method that reconstructs a payload using the EE's `eth_getBlockByHash` call. - I've tested manually that it works on Kiln, using Geth and Nethermind. - This isn't necessarily the most efficient method, and new engine APIs are being discussed to improve this: https://github.com/ethereum/execution-apis/pull/146. - We're depending on the `ethers` master branch, due to lots of recent changes. We're also using a workaround for https://github.com/gakonst/ethers-rs/issues/1134. - Payload reconstruction is used in the HTTP API via `BeaconChain::get_block`, which is now `async`. Due to the `async` fn, the `blocking_json` wrapper has been removed. - Payload reconstruction is used in network RPC to serve blocks-by-{root,range} responses. Here the `async` adjustment is messier, although I think I've managed to come up with a reasonable compromise: the handlers take the `SendOnDrop` by value so that they can drop it on _task completion_ (after the `fn` returns). Still, this is introducing disk reads onto core executor threads, which may have a negative performance impact (thoughts appreciated). ## Additional Info - [x] For performance it would be great to remove the cloning of full blocks when converting them to blinded blocks to write to disk. I'm going to experiment with a `put_block` API that takes the block by value, breaks it into a blinded block and a payload, stores the blinded block, and then re-assembles the full block for the caller. - [x] We should measure the latency of blocks-by-root and blocks-by-range responses. - [x] We should add integration tests that stress the payload reconstruction (basic tests done, issue for more extensive tests: https://github.com/sigp/lighthouse/issues/3159) - [x] We should (manually) test the schema v9 migration from several prior versions, particularly as blocks have changed on disk and some migrations rely on being able to load blocks. Co-authored-by: Paul Hauner <paul@paulhauner.com>
2022-05-12 00:42:17 +00:00
check_payload_reconstruction(&self.ee_b, &second_payload).await;
/*
* Execution Engine B:
*
* Set the second payload as the head, without providing payload attributes.
*/
let head_block_hash = second_payload.block_hash;
let finalized_block_hash = ExecutionBlockHash::zero();
let slot = Slot::new(42);
let head_block_root = Hash256::repeat_byte(42);
let status = self
.ee_b
.execution_layer
.notify_forkchoice_updated(
head_block_hash,
justified_block_hash,
finalized_block_hash,
slot,
head_block_root,
)
.await
.unwrap();
assert_eq!(status, PayloadStatus::Valid);
}
}
Separate execution payloads in the DB (#3157) ## Proposed Changes Reduce post-merge disk usage by not storing finalized execution payloads in Lighthouse's database. :warning: **This is achieved in a backwards-incompatible way for networks that have already merged** :warning:. Kiln users and shadow fork enjoyers will be unable to downgrade after running the code from this PR. The upgrade migration may take several minutes to run, and can't be aborted after it begins. The main changes are: - New column in the database called `ExecPayload`, keyed by beacon block root. - The `BeaconBlock` column now stores blinded blocks only. - Lots of places that previously used full blocks now use blinded blocks, e.g. analytics APIs, block replay in the DB, etc. - On finalization: - `prune_abanonded_forks` deletes non-canonical payloads whilst deleting non-canonical blocks. - `migrate_db` deletes finalized canonical payloads whilst deleting finalized states. - Conversions between blinded and full blocks are implemented in a compositional way, duplicating some work from Sean's PR #3134. - The execution layer has a new `get_payload_by_block_hash` method that reconstructs a payload using the EE's `eth_getBlockByHash` call. - I've tested manually that it works on Kiln, using Geth and Nethermind. - This isn't necessarily the most efficient method, and new engine APIs are being discussed to improve this: https://github.com/ethereum/execution-apis/pull/146. - We're depending on the `ethers` master branch, due to lots of recent changes. We're also using a workaround for https://github.com/gakonst/ethers-rs/issues/1134. - Payload reconstruction is used in the HTTP API via `BeaconChain::get_block`, which is now `async`. Due to the `async` fn, the `blocking_json` wrapper has been removed. - Payload reconstruction is used in network RPC to serve blocks-by-{root,range} responses. Here the `async` adjustment is messier, although I think I've managed to come up with a reasonable compromise: the handlers take the `SendOnDrop` by value so that they can drop it on _task completion_ (after the `fn` returns). Still, this is introducing disk reads onto core executor threads, which may have a negative performance impact (thoughts appreciated). ## Additional Info - [x] For performance it would be great to remove the cloning of full blocks when converting them to blinded blocks to write to disk. I'm going to experiment with a `put_block` API that takes the block by value, breaks it into a blinded block and a payload, stores the blinded block, and then re-assembles the full block for the caller. - [x] We should measure the latency of blocks-by-root and blocks-by-range responses. - [x] We should add integration tests that stress the payload reconstruction (basic tests done, issue for more extensive tests: https://github.com/sigp/lighthouse/issues/3159) - [x] We should (manually) test the schema v9 migration from several prior versions, particularly as blocks have changed on disk and some migrations rely on being able to load blocks. Co-authored-by: Paul Hauner <paul@paulhauner.com>
2022-05-12 00:42:17 +00:00
/// Check that the given payload can be re-constructed by fetching it from the EE.
///
/// Panic if payload reconstruction fails.
async fn check_payload_reconstruction<E: GenericExecutionEngine>(
ee: &ExecutionPair<E, MainnetEthSpec>,
Separate execution payloads in the DB (#3157) ## Proposed Changes Reduce post-merge disk usage by not storing finalized execution payloads in Lighthouse's database. :warning: **This is achieved in a backwards-incompatible way for networks that have already merged** :warning:. Kiln users and shadow fork enjoyers will be unable to downgrade after running the code from this PR. The upgrade migration may take several minutes to run, and can't be aborted after it begins. The main changes are: - New column in the database called `ExecPayload`, keyed by beacon block root. - The `BeaconBlock` column now stores blinded blocks only. - Lots of places that previously used full blocks now use blinded blocks, e.g. analytics APIs, block replay in the DB, etc. - On finalization: - `prune_abanonded_forks` deletes non-canonical payloads whilst deleting non-canonical blocks. - `migrate_db` deletes finalized canonical payloads whilst deleting finalized states. - Conversions between blinded and full blocks are implemented in a compositional way, duplicating some work from Sean's PR #3134. - The execution layer has a new `get_payload_by_block_hash` method that reconstructs a payload using the EE's `eth_getBlockByHash` call. - I've tested manually that it works on Kiln, using Geth and Nethermind. - This isn't necessarily the most efficient method, and new engine APIs are being discussed to improve this: https://github.com/ethereum/execution-apis/pull/146. - We're depending on the `ethers` master branch, due to lots of recent changes. We're also using a workaround for https://github.com/gakonst/ethers-rs/issues/1134. - Payload reconstruction is used in the HTTP API via `BeaconChain::get_block`, which is now `async`. Due to the `async` fn, the `blocking_json` wrapper has been removed. - Payload reconstruction is used in network RPC to serve blocks-by-{root,range} responses. Here the `async` adjustment is messier, although I think I've managed to come up with a reasonable compromise: the handlers take the `SendOnDrop` by value so that they can drop it on _task completion_ (after the `fn` returns). Still, this is introducing disk reads onto core executor threads, which may have a negative performance impact (thoughts appreciated). ## Additional Info - [x] For performance it would be great to remove the cloning of full blocks when converting them to blinded blocks to write to disk. I'm going to experiment with a `put_block` API that takes the block by value, breaks it into a blinded block and a payload, stores the blinded block, and then re-assembles the full block for the caller. - [x] We should measure the latency of blocks-by-root and blocks-by-range responses. - [x] We should add integration tests that stress the payload reconstruction (basic tests done, issue for more extensive tests: https://github.com/sigp/lighthouse/issues/3159) - [x] We should (manually) test the schema v9 migration from several prior versions, particularly as blocks have changed on disk and some migrations rely on being able to load blocks. Co-authored-by: Paul Hauner <paul@paulhauner.com>
2022-05-12 00:42:17 +00:00
payload: &ExecutionPayload<MainnetEthSpec>,
) {
let reconstructed = ee
.execution_layer
.get_payload_by_block_hash(payload.block_hash)
.await
.unwrap()
.unwrap();
assert_eq!(reconstructed, *payload);
}
/// Returns the duration since the unix epoch.
pub fn timestamp_now() -> u64 {
SystemTime::now()
.duration_since(UNIX_EPOCH)
.unwrap_or_else(|_| Duration::from_secs(0))
.as_secs()
}