It turns out that encoding json.RawMessage is slow because
package json re-parses the message to ensure it is valid JSON.
We can avoid the slowdown by encoding the entire RPC notification once,
which yields a 30% speedup.
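A minimal sketch of the idea; the notification type here is illustrative, not the actual rpc package internals:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// notification is an illustrative stand-in for the internal RPC
// notification type.
type notification struct {
	Version string          `json:"jsonrpc"`
	Method  string          `json:"method"`
	Params  json.RawMessage `json:"params"`
}

func main() {
	n := notification{
		Version: "2.0",
		Method:  "eth_subscription",
		Params:  json.RawMessage(`{"result":"0x1"}`),
	}

	// Marshaling per subscriber would make encoding/json re-validate
	// (re-parse) the RawMessage bytes on every call. Encoding the whole
	// notification once pays that cost a single time.
	msg, err := json.Marshal(n)
	if err != nil {
		panic(err)
	}
	for i := 0; i < 3; i++ { // stand-in for "write to each subscriber"
		fmt.Println(string(msg))
	}
}
```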
* rpc: make subscription test faster
reduces the time for TestClientSubscriptionChannelClose
from 25 sec to under 1 sec.
* trie: cache trie nodes for faster sanity check
This reduces the time spent on TestIncompleteSyncHash
from ~25s to ~16s.
* core/forkid: speed up validation test
This takes the validation test from over 5s to under 1s.
* core/state: improve snapshot test run
brings the time for TestSnapshotRandom from 13s down to 6s
* accounts/keystore: improve keyfile test
This removes some unnecessary waits and reduces the
runtime of TestUpdatedKeyfileContents from 5 to 3 seconds.
* trie: remove resolver
* trie: only check ~5% of all trie nodes
The String() version of BlockNumberOrHash uses decimal for all block numbers, including the negative ones used to indicate labels. Switch to using BlockNumber.String(), which encodes them correctly for use in the JSON-RPC API.
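For example, assuming BlockNumber.String() behaves as described:

```go
package main

import (
	"fmt"

	"github.com/ethereum/go-ethereum/rpc"
)

func main() {
	// BlockNumber.String() renders label values by name rather than as
	// negative decimals, so they are valid JSON-RPC block specifiers.
	fmt.Println(rpc.PendingBlockNumber.String()) // "pending"
	fmt.Println(rpc.LatestBlockNumber.String())  // "latest"
}
```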
We're trying a new named pipe library, which will hopefully fix some occasional CI failures.
---------
Co-authored-by: Felix Lange <fjl@twurst.com>
This should fix #27726. With enough load, it might happen that the SetPongHandler
callback gets invoked before the call to SetReadDeadline is made in pingLoop. When
this occurs, the socket will end up with a 30s read deadline even though it got the pong,
which will lead to a timeout.
The fix here is to process the pong in pingLoop, synchronized with the code that
sends the ping.
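A hedged sketch of the pattern using gorilla/websocket; the channel, ticker interval, and function shape are illustrative (imports: time, github.com/gorilla/websocket):

```go
// pingLoop owns both ping sending and pong handling, so the read deadline is
// only ever touched from one goroutine and the race described above cannot occur.
func pingLoop(conn *websocket.Conn, done <-chan struct{}) {
	pongCh := make(chan struct{}, 1)
	conn.SetPongHandler(func(string) error {
		// Only signal pingLoop; never set deadlines from the read path.
		select {
		case pongCh <- struct{}{}:
		default:
		}
		return nil
	})

	ticker := time.NewTicker(15 * time.Second)
	defer ticker.Stop()
	for {
		select {
		case <-ticker.C:
			conn.SetReadDeadline(time.Now().Add(30 * time.Second))
			conn.WriteControl(websocket.PingMessage, nil, time.Now().Add(time.Second))
		case <-pongCh:
			conn.SetReadDeadline(time.Time{}) // pong arrived: drop the 30s deadline
		case <-done:
			return
		}
	}
}
```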
Package rpc used cgo to find the maximum UNIX domain socket path length,
printing a warning when an endpoint exceeded it. This was the only use of cgo
in the package, and depending on cgo just for a warning seems excessive, so
we now hard-code the usual limit for Linux instead.
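A cgo-free sketch of the check; names are illustrative, and 108 is the usual sizeof(sun_path) on Linux (import: log):

```go
const maxPathSize = 108 // usual size of sun_path in struct sockaddr_un on Linux

func checkIpcPathLength(endpoint string) {
	if len(endpoint) > maxPathSize {
		log.Printf("The ipc endpoint is longer than %d characters.", maxPathSize)
	}
}
```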
---------
Co-authored-by: Felix Lange <fjl@twurst.com>
This adds two ways to check for subscription support. First, one can now check
whether the transport method (HTTP/WS/etc.) is capable of subscriptions using
the new Client.SupportsSubscriptions method.
Second, the error returned by Subscribe can now reliably be tested using this
pattern:
```go
sub, err := client.Subscribe(...)
if errors.Is(err, rpc.ErrNotificationsUnsupported) {
	// no subscription support
}
```
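The transport-level check follows the same style:

```go
if !client.SupportsSubscriptions() {
	// transport (e.g. HTTP) cannot deliver notifications
}
```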
---------
Co-authored-by: Felix Lange <fjl@twurst.com>
This PR adds server-side limits for JSON-RPC batch requests. Before this change, batches
were limited only by processing time. The server would pick calls from the batch and
answer them until the response timeout occurred, then stop processing the remaining batch
items.
Here, we are adding two additional limits which can be configured:
- the 'item limit': batches can have at most N items
- the 'response size limit': batches can contain at most X response bytes
These limits are optional in package rpc. In Geth, we set a default limit of 1000 items
and 25MB response size.
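A hedged configuration sketch in package rpc, assuming SetBatchLimits is the setter introduced by this change:

```go
srv := rpc.NewServer()
// At most 1000 calls per batch and 25MB of response data (Geth's defaults).
srv.SetBatchLimits(1000, 25*1000*1000)
```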
When a batch goes over a limit, an error response is returned to the client. However,
doing this correctly isn't always possible. In JSON-RPC, only method calls with a valid
`id` can be responded to. Since batches may also contain non-call messages or
notifications, the best we can do to report an error for the batch itself is to attach
the limit violation as an error to the first method call in the batch. If a batch is
too large but contains only notifications and responses, the error is reported with
a null `id`.
The RPC client was also changed so it can deal with errors resulting from over-sized
batches. When a batch is too large, an older client connected to the server code in
this PR could get stuck until the request timeout occurred. **Upgrading to a version
of the RPC client containing this change is strongly recommended to avoid timeout issues.**
For some weird reason, when writing the original client implementation, @fjl worked from
the assumption that responses could be distributed across batches arbitrarily. So for a
batch request containing requests `[A B C]`, the server could respond with `[A B C]`, but
also with `[A B] [C]` or even `[A] [B] [C]`, and it wouldn't make a difference to the
client.
So in the implementation of BatchCallContext, the client waited for all requests in the
batch individually. If the server didn't respond to some of the requests in the batch, the
client would eventually just time out (if a context was used).
With the addition of batch limits into the server, we anticipate that people will hit this
kind of error way more often. To handle this properly, the client now waits for a single
response batch and expects it to contain all responses to the requests.
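For reference, a BatchCallContext usage sketch (client is an *rpc.Client): batch-level failures, such as an over-sized batch being rejected, surface as the returned error, while per-call failures land on the individual elements:

```go
batch := []rpc.BatchElem{
	{Method: "eth_blockNumber", Result: new(string)},
	{Method: "eth_chainId", Result: new(string)},
}
if err := client.BatchCallContext(ctx, batch); err != nil {
	return err // the whole batch failed, e.g. rejected by the server's limits
}
for _, elem := range batch {
	if elem.Error != nil {
		log.Printf("%s failed: %v", elem.Method, elem.Error)
	}
}
```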
---------
Co-authored-by: Felix Lange <fjl@twurst.com>
Co-authored-by: Martin Holst Swende <martin@swende.se>
ethclient accepts certain negative block number values as specifiers for the "pending",
"safe" and "finalized" block. In case of "pending", the value accepted by ethclient (-1)
did not match rpc.PendingBlockNumber (-2).
This wasn't really a problem, but other values accepted by ethclient did match the
definitions in package rpc, and it's weird to have this one special case where they don't.
To fix it, we decided to change the values of the constants rather than changing ethclient.
The constant values are not otherwise significant. This is a breaking API change, but we
believe not a dangerous one.
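With the values aligned, the rpc constants can be passed straight through to ethclient; a short usage sketch (ec is an *ethclient.Client):

```go
// rpc.PendingBlockNumber now matches the -1 that ethclient translates
// to the "pending" specifier.
head, err := ec.HeaderByNumber(ctx, big.NewInt(int64(rpc.PendingBlockNumber)))
```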
---------
Co-authored-by: Felix Lange <fjl@twurst.com>
This changes the RPC server to ignore methods that take *context.Context as a
parameter or return *error. Previously, calling such a method would crash the
server.
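For illustration:

```go
type ExampleService struct{}

// Now ignored at registration instead of crashing the server when called:
func (s *ExampleService) Bad(ctx *context.Context) {}    // *context.Context parameter
func (s *ExampleService) AlsoBad() *error { return nil } // *error return value

// Registered normally:
func (s *ExampleService) Good(ctx context.Context) (string, error) {
	return "ok", nil
}
```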
The change fixes unmarshaling of JSON null results into json.RawMessage.
---------
Co-authored-by: Jason Yuan <jason.yuan@curvegrid.com>
Co-authored-by: Jason Yuan <jason.yuan869@gmail.com>
This change fixes a minor flaw in the check for ipc endpoint length. max_path_size is the maximum path an ipc endpoint can have, which is 208. However, that size applies to the null-terminated pathname, so we need to account for an extra null character too.
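Continuing the cgo-free sketch from above, the corrected comparison (names still illustrative):

```go
if len(endpoint)+1 > maxPathSize { // +1 for the terminating null character
	log.Printf("The ipc endpoint is longer than %d characters.", maxPathSize-1)
}
```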
Here we add special handling for sending an error response when the write timeout of the
HTTP server is about to expire. This is surprisingly difficult to get right, since it
must be ensured that all output is fully flushed in time, which needs support from
multiple levels of the RPC handler stack:
The timeout response can't use chunked transfer-encoding because there is no way to write
the final terminating chunk. net/http writes it when the topmost handler returns, but the
timeout will already be over by the time that happens. We decided to disable chunked
encoding by setting content-length explicitly.
Gzip compression must also be disabled for timeout responses because we don't know the
true content-length before compressing all output, i.e. compression would reintroduce
chunked transfer-encoding.
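A hedged sketch of such a timeout response writer; the status code and JSON-RPC error code are illustrative (imports: io, net/http, strconv):

```go
func writeTimeoutError(w http.ResponseWriter) {
	body := `{"jsonrpc":"2.0","id":null,"error":{"code":-32002,"message":"request timed out"}}`
	w.Header().Set("Content-Type", "application/json")
	// An explicit Content-Length disables chunked transfer-encoding, so no
	// terminating chunk has to be written after the handler returns.
	w.Header().Set("Content-Length", strconv.Itoa(len(body)))
	// This must also bypass the gzip layer: compression would change the
	// length and reintroduce chunking.
	w.WriteHeader(http.StatusServiceUnavailable)
	io.WriteString(w, body)
}
```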
This removes an RPC test which takes > 90s to execute, and updates the
internal/guide tests to use lighter scrypt parameters.
Co-authored-by: Felix Lange <fjl@twurst.com>
This adds a way to specify HTTP headers per request.
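A hedged usage sketch, assuming NewContextWithHeaders is the API added here:

```go
h := make(http.Header)
h.Set("X-Request-Id", "42")
ctx := rpc.NewContextWithHeaders(context.Background(), h)

var ver string
err := client.CallContext(ctx, &ver, "web3_clientVersion")
```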
Co-authored-by: Martin Holst Swende <martin@swende.se>
Co-authored-by: Felix Lange <fjl@twurst.com>
rpc: fix connection tracking in Server
When upgrading to mapset/v2 with generics, the set element type used in
rpc.Server had to be changed to *ServerCodec, because the ServerCodec
interface does not satisfy the 'comparable' constraint. While that
restriction is technically correct, we know all possible ServerCodec
implementations, and all of them are comparable, so we can simply use
a map instead.
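A sketch of the replacement; ServerCodec stands for the rpc interface of that name:

```go
type codecSet struct {
	mu  sync.Mutex
	set map[ServerCodec]struct{} // interface keys work: all codec types are comparable
}

func (cs *codecSet) add(c ServerCodec) {
	cs.mu.Lock()
	defer cs.mu.Unlock()
	if cs.set == nil {
		cs.set = make(map[ServerCodec]struct{})
	}
	cs.set[c] = struct{}{}
}
```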
This changes the CI / release builds to use the latest Go version. It also
upgrades golangci-lint to a newer version compatible with Go 1.19.
In Go 1.19, godoc has gained official support for links and lists. The
syntax for code blocks in doc comments has changed and now requires a
leading tab character. gofmt adapts comments to the new syntax
automatically, so there are a lot of comment re-formatting changes in this
PR. We need to apply the new format in order to pass the CI lint stage with
Go 1.19.
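For illustration, the new doc comment syntax (Frobnicate is a hypothetical function; note the leading tab on the code lines):

```go
// Frobnicate adds one to x.
//
// Code blocks in Go 1.19 doc comments must be indented with a tab:
//
//	v := Frobnicate(41)
//	fmt.Println(v) // 42
func Frobnicate(x int) int { return x + 1 }
```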
With the linter upgrade, I have decided to disable 'gosec' - it produces
too many false-positive warnings. The 'deadcode' and 'varcheck' linters
have also been removed because golangci-lint warns about them being
unmaintained. 'unused' provides similar coverage and we already have it
enabled, so we don't lose much with this change.
This changes the error code returned by the RPC server in certain situations:
- handler panic: code -32603
- result marshaling error: code -32603
- attempt to subscribe via HTTP: code -32001
In all of the above cases, the server previously returned the default error
code -32000.
Co-authored-by: Nicholas Zhao <nicholas.zhao@gmail.com>
Co-authored-by: Felix Lange <fjl@twurst.com>
The JSON-RPC spec requires the "version" field to be exactly "2.0",
so we should verify that. This change is not backwards-compatible with
sloppy client implementations, but I decided to go ahead with it anyway
because the failure will be caught via the returned error.
This adds a generic mechanism for 'dial options' in the RPC client,
and also implements a specific dial option for the JWT authentication
mechanism used by the engine API. Some real tests for the server-side
authentication handling are also added.
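A hedged dialing sketch: DialOptions and WithHTTPAuth are the dial-option mechanism described here, and NewJWTAuth is assumed to live in package node:

```go
var secret [32]byte // the shared JWT secret
client, err := rpc.DialOptions(ctx, "http://127.0.0.1:8551",
	rpc.WithHTTPAuth(node.NewJWTAuth(secret)))
```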
Co-authored-by: Joshua Gutow <jgutow@optimism.io>
Co-authored-by: Felix Lange <fjl@twurst.com>
This change makes http.Server.ReadHeaderTimeout configurable separately
from ReadTimeout for RPC servers. The default is the same as ReadTimeout,
so existing deployments see no change in behavior.
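A hedged sketch of the timeout configuration; the field set is assumed from the description above:

```go
timeouts := rpc.HTTPTimeouts{
	ReadTimeout:       30 * time.Second,
	ReadHeaderTimeout: 30 * time.Second, // now configurable on its own
	WriteTimeout:      30 * time.Second,
	IdleTimeout:       120 * time.Second,
}
```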
This enables the following linters:
- typecheck
- unused
- staticcheck
- bidichk
- durationcheck
- exportloopref
- gosec
With a few exceptions:
- We use a deprecated protobuf in trezor. I didn't want to mess with that, since I cannot meaningfully test any changes there.
- The deprecated TypeMux is used in a few places still, so the warning for it is silenced for now.
- Using a string type in context.WithValue is apparently wrong: one should use a custom key type to prevent collisions between different places in the hierarchy of callers (see the sketch after this list). That should be fixed at some point, but may require some attention.
- The warnings about using a weak random generator are squashed, since we use randomness in many places without needing cryptographic guarantees.
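A minimal sketch of the context key pattern mentioned above; the key names are hypothetical:

```go
package main

import (
	"context"
	"fmt"
)

// A private named key type prevents collisions between packages that would
// otherwise both use the plain string "request-id" as the key.
type ctxKey string

const requestIDKey ctxKey = "request-id"

func main() {
	ctx := context.WithValue(context.Background(), requestIDKey, "abc123")
	id, _ := ctx.Value(requestIDKey).(string)
	fmt.Println(id)
}
```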