Update eth_writeStateDiffAt #296

Closed
opened 2023-01-04 16:25:23 +00:00 by i-norden · 2 comments
Member

`eth_writeStateDiffAt` allows an external service to tell geth to write a statediff at a specific block (hash). This process can be long-running, especially if the geth node is busy writing statediffs as it tracks the head of the chain.

This is problematic for two reasons:

  1. The request may not return before the default timeout of 10s (https://github.com/cerc-io/go-ethereum/blob/v1.10.25-statediff-v4/rpc/client.go#L43)
  2. Multiple requests may pile up on the geth RPC server, causing resource issues for the Postgres server being written to

Quick fixes:

  1. Convert to a long-running subscription endpoint
  2. Adjust the deadline
  3. Adjust the unary endpoint so that it returns immediately with a processID that can be used to check the status of the job at a later time

In each of the above cases we also need to add constraints on the number of concurrent requests being served.
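
A minimal sketch of option 3 combined with the concurrency constraint, for discussion only. `JobManager`, `Submit`, `Status`, `Wait` and `ErrBusy` are hypothetical names, not existing geth or statediff APIs, and the `work` callback stands in for the actual statediff write. The idea is that submission returns a process ID right away, and a counting semaphore rejects new jobs once the limit is reached instead of letting them pile up.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// JobStatus is the lifecycle state of a submitted job.
type JobStatus string

const (
	StatusRunning JobStatus = "running"
	StatusDone    JobStatus = "done"
	StatusFailed  JobStatus = "failed"
)

// ErrBusy is returned when the concurrency limit has been reached.
var ErrBusy = errors.New("too many statediff jobs in flight")

// JobManager (hypothetical) runs jobs in the background with a hard cap
// on how many may execute at once.
type JobManager struct {
	mu     sync.Mutex
	nextID uint64
	status map[uint64]JobStatus
	slots  chan struct{} // counting semaphore, capacity = max concurrent jobs
	wg     sync.WaitGroup
}

func NewJobManager(maxConcurrent int) *JobManager {
	return &JobManager{
		status: make(map[uint64]JobStatus),
		slots:  make(chan struct{}, maxConcurrent),
	}
}

// Submit starts work in a goroutine and returns its process ID immediately.
// It fails fast with ErrBusy rather than queueing when all slots are taken.
func (m *JobManager) Submit(work func() error) (uint64, error) {
	select {
	case m.slots <- struct{}{}:
	default:
		return 0, ErrBusy
	}
	m.mu.Lock()
	m.nextID++
	id := m.nextID
	m.status[id] = StatusRunning
	m.mu.Unlock()

	m.wg.Add(1)
	go func() {
		defer m.wg.Done()
		defer func() { <-m.slots }() // free the slot once the job ends
		err := work()
		m.mu.Lock()
		defer m.mu.Unlock()
		if err != nil {
			m.status[id] = StatusFailed
			return
		}
		m.status[id] = StatusDone
	}()
	return id, nil
}

// Status reports the state of a job; ok is false for unknown IDs.
func (m *JobManager) Status(id uint64) (JobStatus, bool) {
	m.mu.Lock()
	defer m.mu.Unlock()
	st, ok := m.status[id]
	return st, ok
}

// Wait blocks until every submitted job has finished.
func (m *JobManager) Wait() { m.wg.Wait() }

func main() {
	m := NewJobManager(1)
	release := make(chan struct{})
	id, _ := m.Submit(func() error { <-release; return nil }) // simulated slow statediff
	_, err := m.Submit(func() error { return nil })           // rejected: limit is 1
	fmt.Println(id, err)
	close(release)
	m.Wait()
	st, _ := m.Status(id)
	fmt.Println(st)
}
```

Rejecting with an error is only one possible policy. A bounded queue would also work, as long as the number of statediffs hitting Postgres at once stays capped.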

Member

It is quite easy to trigger associated problems by making calls to `eth_getStorageAt` through `ipld-eth-server`. In the standard production configuration, if the data cannot be loaded by `ipld-eth-server`, it will proxy the call to `geth` and also spin up a goroutine which calls `eth_writeStateDiffAt` or `eth_writeStateDiffFor` in the background.

This will respond to the client quickly (usually on the order of 100ms or less) while the goroutine goes about its work. It isn't uncommon to request several storage items related to the same contract (for example) in quick succession. The result is that more and more statediffs are triggered, usually for the same block, all attempting to run concurrently. In the worst case, this will exhaust available resources, e.g. https://github.com/cerc-io/go-ethereum/issues/293 . In that example, all it takes is making a single `eth_getStorageAt` call at a time in a loop.

It is worth noting that until https://github.com/cerc-io/issue_tracking/issues/5 is fixed, there is no guarantee that the requested key exists in the DB even after the block has been statediffed "successfully". This means statediffing can be triggered on the same block again (and again) regardless of how many times it has already been processed (see the repeated `writing statediff for...` messages in this comment: https://github.com/cerc-io/ipld-eth-server/issues/209#issuecomment-1363314614).

Author
Member

Thanks @telackey

On the `ipld-eth-server` side the behavior we would like to have is:

  1. Initial query comes in
  2. Can't satisfy it locally
  3. Get result from remote node
  4. Tell remote node to statediff missing data
  5. Return the result from the remote node (without waiting for result from 4)

But then if another query comes in that would trigger the same writeStateDiffAt query as in step 4 and step 4 still hasn't returned, it should not trigger that query.
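
A rough sketch of that in-flight check, assuming a simple mutex-guarded set keyed by block identifier. `InFlight`, `TryStart`, `Finish` and `Trigger` are hypothetical names, and the `write` callback stands in for the real `writeStateDiffAt` RPC call. This is not what `ipld-eth-server` currently does.

```go
package main

import (
	"fmt"
	"sync"
)

// InFlight (hypothetical) tracks block identifiers that already have an
// outstanding statediff request.
type InFlight struct {
	mu      sync.Mutex
	pending map[string]struct{}
}

func NewInFlight() *InFlight {
	return &InFlight{pending: make(map[string]struct{})}
}

// TryStart marks key as pending and returns true, or returns false if a
// request for the same key has not returned yet.
func (f *InFlight) TryStart(key string) bool {
	f.mu.Lock()
	defer f.mu.Unlock()
	if _, busy := f.pending[key]; busy {
		return false
	}
	f.pending[key] = struct{}{}
	return true
}

// Finish clears key so that a later query may trigger the request again.
func (f *InFlight) Finish(key string) {
	f.mu.Lock()
	defer f.mu.Unlock()
	delete(f.pending, key)
}

// Trigger runs write in the background (step 4) unless the same key is
// already pending. ok is false when the call was skipped; otherwise done
// is closed after the key has been cleared.
func (f *InFlight) Trigger(key string, write func(string) error) (done <-chan struct{}, ok bool) {
	if !f.TryStart(key) {
		return nil, false
	}
	ch := make(chan struct{})
	go func() {
		defer close(ch)     // runs last
		defer f.Finish(key) // runs first, so the key is free before done closes
		if err := write(key); err != nil {
			fmt.Println("statediff failed:", err)
		}
	}()
	return ch, true
}

func main() {
	f := NewInFlight()
	release := make(chan struct{})
	slowWrite := func(string) error { <-release; return nil } // simulated slow RPC

	done, first := f.Trigger("0xabc", slowWrite)
	_, second := f.Trigger("0xabc", slowWrite) // skipped: still in flight
	fmt.Println(first, second)

	close(release)
	<-done
	_, third := f.Trigger("0xabc", func(string) error { return nil })
	fmt.Println(third)
}
```

The caller never waits on the result, so step 5 can return as soon as the remote node answers. The `golang.org/x/sync/singleflight` package provides similar deduplication, but its `Do` method makes duplicate callers block on the shared result, which is not the behavior wanted here.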

For endpoints such as `eth_call` and `eth_getStorageAt`, where we cannot attribute missing data to a statediff missing at a specific height, we have removed their ability to call `eth_writeStateDiffAt` entirely (https://github.com/cerc-io/ipld-eth-server/pull/223). That missing data needs to be detected and handled separately, as discussed elsewhere (and including the https://github.com/cerc-io/issue_tracking/issues/5 fix).

Reference: cerc-io/go-ethereum#296