docs: ADR-067 Simulator v2 (#16400)

2023-06-23 15:59:41 -04:00 · 0edd9a3ef6
commit 0edd9a3ef6
parent c82f64922e
4 changed files with 200 additions and 5 deletions
--- a/docs/architecture/README.md
+++ b/docs/architecture/README.md
@@ -51,9 +51,11 @@ When writing ADRs, follow the same best practices for writing RFCs. When writing
 * [ADR 020: Protocol Buffer Transaction Encoding](./adr-020-protobuf-transaction-encoding.md)
 * [ADR 021: Protocol Buffer Query Encoding](./adr-021-protobuf-query-encoding.md)
 * [ADR 023: Protocol Buffer Naming and Versioning](./adr-023-protobuf-naming.md)
+* [ADR 024: Coin Metadata](./adr-024-coin-metadata.md)
 * [ADR 029: Fee Grant Module](./adr-029-fee-grant-module.md)
 * [ADR 030: Message Authorization Module](./adr-030-authz-module.md)
 * [ADR 031: Protobuf Msg Services](./adr-031-msg-service.md)
+* [ADR 046: Module Params](./adr-046-module-params.md)
 * [ADR 055: ORM](./adr-055-orm.md)
 * [ADR 058: Auto-Generated CLI](./adr-058-auto-generated-cli.md)
 * [ADR 060: ABCI 1.0 (Phase I)](adr-060-abci-1.0.md)
@@ -69,7 +71,6 @@ When writing ADRs, follow the same best practices for writing RFCs. When writing
 * [ADR 017: Historical Header Module](./adr-017-historical-header-module.md)
 * [ADR 018: Extendable Voting Periods](./adr-018-extendable-voting-period.md)
 * [ADR 022: Custom baseapp panic handling](./adr-022-custom-panic-handling.md)
-* [ADR 024: Coin Metadata](./adr-024-coin-metadata.md)
 * [ADR 027: Deterministic Protobuf Serialization](./adr-027-deterministic-protobuf-serialization.md)
 * [ADR 028: Public Key Addresses](./adr-028-public-key-addresses.md)
 * [ADR 032: Typed Events](./adr-032-typed-events.md)
@@ -79,13 +80,13 @@ When writing ADRs, follow the same best practices for writing RFCs. When writing
 * [ADR 038: State Listening](./adr-038-state-listening.md)
 * [ADR 039: Epoched Staking](./adr-039-epoched-staking.md)
 * [ADR 040: Storage and SMT State Commitments](./adr-040-storage-and-smt-state-commitments.md)
-* [ADR 046: Module Params](./adr-046-module-params.md)
 * [ADR 054: Semver Compatible SDK Modules](./adr-054-semver-compatible-modules.md)
 * [ADR 057: App Wiring](./adr-057-app-wiring.md)
 * [ADR 059: Test Scopes](./adr-059-test-scopes.md)
 * [ADR 062: Collections State Layer](./adr-062-collections-state-layer.md)
 * [ADR 063: Core Module API](./adr-063-core-module-api.md)
-* [ADR 065: Store V2](./adr-065-store-v2.md)
+* [ADR 065: Store v2](./adr-065-store-v2.md)
+* [ADR 067: Simulator v2](./adr-067-simulator-v2.md)

 ### Draft

--- a/docs/architecture/adr-024-coin-metadata.md
+++ b/docs/architecture/adr-024-coin-metadata.md
@@ -6,7 +6,7 @@

 ## Status

-Proposed
+ACCEPTED

 ## Context

--- a/docs/architecture/adr-046-module-params.md
+++ b/docs/architecture/adr-046-module-params.md
@@ -6,7 +6,7 @@

 ## Status

-Proposed
+ACCEPTED

 ## Abstract

--- a/docs/architecture/adr-067-simulator-v2.md
+++ b/docs/architecture/adr-067-simulator-v2.md
@@ -0,0 +1,194 @@
+# ADR 067: Simulator v2
+
+## Changelog
+
+* June 01, 2023: Initial Draft (@alexanderbez)
+
+## Status
+
+DRAFT
+
+## Abstract
+
+The Cosmos SDK simulator is a tool that allows developers to test the entirety
+of their application's state machine through the use of pseudo-randomized "operations",
+which represent transactions. The simulator also provides primitives that ensure
+there are no non-determinism issues and that the application's state machine can
+be successfully exported and imported using randomized state.
+
+The simulator has played an absolutely critical role in the development and testing
+of the Cosmos Hub and all the releases of the Cosmos SDK after the launch of the
+Cosmos Hub. Since the launch of the Hub, the simulator has changed relatively
+little, so it is overdue for a revamp.
+
+## Context
+
+The current simulator, `x/simulation`, acts as a semi-fuzz testing suite that takes
+in an integer that represents a seed into a PRNG. The PRNG is used to generate a
+sequence of "operations" that are meant to reflect transactions that an application's
+state machine can process. Through the use of the PRNG, all aspects of block production
+and consumption are randomized. This includes a block's proposer, the validators
+who both sign and miss the block, along with the transaction operations themselves.
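
The seed-driven generation described above can be illustrated with a minimal sketch. It is not the `x/simulation` implementation: `Operation` and `GenerateOperations` are hypothetical names, and operations are reduced to plain labels. The point it shows is that the same seed always reproduces the same operation sequence.

```go
package main

import (
	"fmt"
	"math/rand"
)

// Operation is a hypothetical stand-in for a simulated transaction.
type Operation struct {
	Name string
}

// GenerateOperations derives n operations from a seed. The same seed always
// yields the same sequence, which is what makes a failing run reproducible.
func GenerateOperations(seed int64, names []string, n int) []Operation {
	r := rand.New(rand.NewSource(seed))
	ops := make([]Operation, 0, n)
	for i := 0; i < n; i++ {
		ops = append(ops, Operation{Name: names[r.Intn(len(names))]})
	}
	return ops
}

func main() {
	names := []string{"send", "delegate", "vote"}
	// Both calls use seed 42, so the two printed sequences are identical.
	fmt.Println(GenerateOperations(42, names, 5))
	fmt.Println(GenerateOperations(42, names, 5))
}
```

Because every random choice flows from one seeded source, a failing run can be reproduced from the seed alone.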
+
+Each Cosmos SDK module defines a set of simulation operations that _attempt_ to
+produce valid transactions, e.g. `x/distribution/simulation/operations.go`. These
+operations can sometimes fail depending on the accumulated state of the application
+within that simulation run. The simulator will continue to generate operations
+until it has reached a certain number of operations or a fatal state, at which
+point it reports the results. This gives application developers the ability
+to reliably execute full-range application simulation and fuzz testing against
+their application.
+
+However, there are a few major drawbacks. With the advent of ABCI++, specifically
+`FinalizeBlock`, the internal workings of the simulator no longer reflect how
+an application actually executes. In particular, operations are executed
+_after_ `FinalizeBlock`, whereas they should be executed _within_ `FinalizeBlock`.
+
+Additionally, the simulator is not very extensible. Developers should be able to
+easily define and extend the following:
+
+* Consistency or validity predicates (what are known as invariants today)
+* Property tests of state before and after a block is simulated
+
+In addition, we also want to achieve the following:
+
+* Consolidated weight management, i.e. define weights within the simulator itself
+  via a config and not defined in each module
+* Observability of the simulator's execution, i.e. have easy-to-understand output/logs
+  with the ability to pipe those logs into some external sink
+* Smart replay, i.e. the ability to not only rerun a simulation from a seed, but
+  also the ability to replay from an arbitrary breakpoint
+* Run a simulation based off of real network state
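
As a rough illustration of consolidated weight management, the sketch below keeps all weights in a single map, as a config file might, and selects message types in proportion to them. The `Weights` type, the `PickSequence` function, and the message-type keys are assumptions made for this example, not part of the proposal.

```go
package main

import (
	"fmt"
	"math/rand"
	"sort"
)

// Weights maps a message type to its relative weight. In this sketch the map
// stands in for the weights section of a simulator config; the keys are
// hypothetical names.
type Weights map[string]int

// PickSequence draws n message types, each with probability proportional to
// its weight, using a PRNG seeded with seed.
func PickSequence(w Weights, seed int64, n int) []string {
	keys := make([]string, 0, len(w))
	total := 0
	for k, v := range w {
		if v > 0 {
			keys = append(keys, k)
			total += v
		}
	}
	if total == 0 {
		return nil
	}
	// Go randomizes map iteration order, so sort the keys to make the same
	// seed produce the same sequence on every run.
	sort.Strings(keys)

	r := rand.New(rand.NewSource(seed))
	out := make([]string, 0, n)
	for i := 0; i < n; i++ {
		x := r.Intn(total)
		for _, k := range keys {
			if x < w[k] {
				out = append(out, k)
				break
			}
			x -= w[k]
		}
	}
	return out
}

func main() {
	w := Weights{"bank/send": 70, "staking/delegate": 25, "gov/vote": 5}
	fmt.Println(PickSequence(w, 1, 10))
}
```

Keeping the weights in one place means a single config change can rebalance the whole simulation without touching any module.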
+
+## Decision
+
+Instead of refactoring the existing simulator, `x/simulation`, we propose to create
+a new package in the root of the Cosmos SDK, `simulator`, that will be the new
+simulation framework. The simulator will more accurately reflect the complete
+lifecycle of an ABCI application.
+
+Specifically, we propose a `simulator.Manager`, similar in implementation and use
+to the manager that exists today, that is responsible for managing the execution
+of a simulation. The manager will wrap an ABCI application and will be responsible
+for the following:
+
+* Populating the application's mempool with a set of pseudo-random transactions
+  before each block, some of which may contain invalid messages.
+* Selecting transactions and a random proposer to execute `PrepareProposal`.
+* Executing `ProcessProposal`, `FinalizeBlock` and `Commit`.
+* Executing a set of validity predicates before and after each block.
+* Maintaining a CPU and RAM profile of the simulation execution.
+* Allowing a simulation to stop and resume from a given block height.
+* Simulating the liveness of each validator per block.
+
+Application developers will only need to provide the modules to be used in the
+simulator, and the manager will take care of the rest. In addition, they will not
+need to write their own simulation tests, e.g. non-determinism, multi-seed, etc.,
+as the manager will provide these as well.
+
+```go
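+type Manager struct {
+  app     sdk.Application // the wrapped ABCI application
+  mempool sdk.Mempool     // source of transactions for block proposals
+  rng     *rand.Rand      // seeded PRNG driving all randomized decisions
+  // ...
+}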
+```
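
To make the non-determinism and multi-seed tests mentioned above concrete, here is a toy sketch of what such a built-in check might do. `RunSimulation` is a hypothetical stand-in that folds pseudo-random transactions into a running hash instead of driving a real application. `CheckDeterminism` is likewise an assumed name.

```go
package main

import (
	"crypto/sha256"
	"encoding/binary"
	"fmt"
	"math/rand"
)

// RunSimulation is a toy stand-in for a full simulation run. Each "block"
// folds one pseudo-random transaction into a running hash, and the final
// hash plays the role of the application's app hash.
func RunSimulation(seed int64, blocks int) [32]byte {
	r := rand.New(rand.NewSource(seed))
	var hash [32]byte
	for h := 0; h < blocks; h++ {
		var tx [8]byte
		binary.BigEndian.PutUint64(tx[:], r.Uint64())
		hash = sha256.Sum256(append(hash[:], tx[:]...))
	}
	return hash
}

// CheckDeterminism runs every seed `runs` times and returns the first seed
// whose final hashes disagree. ok is true when all runs agree.
func CheckDeterminism(seeds []int64, blocks, runs int) (badSeed int64, ok bool) {
	for _, seed := range seeds {
		want := RunSimulation(seed, blocks)
		for i := 1; i < runs; i++ {
			if RunSimulation(seed, blocks) != want {
				return seed, false
			}
		}
	}
	return 0, true
}

func main() {
	seed, ok := CheckDeterminism([]int64{1, 2, 3}, 100, 3)
	fmt.Println(seed, ok)
}
```

In a real manager the compared value would be the application's app hash after the final `Commit`, not a toy hash.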
+
+### Configuration
+
+The simulator's testing input will be driven by a configuration file, as opposed
+to CLI arguments. This will allow for more extensibility and ease of use along with
+the ability to have shared configuration files across multiple simulations.
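
The ADR does not fix a file format or schema, so the following is only a sketch of what loading such a configuration could look like. JSON, the `Config` struct, and every field name in it are assumptions chosen for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// Config is a hypothetical simulator configuration. The field names and the
// choice of JSON are assumptions made for this sketch.
type Config struct {
	Seed      int64          `json:"seed"`
	NumBlocks int            `json:"num_blocks"`
	Weights   map[string]int `json:"weights"`
}

// ParseConfig decodes a configuration document, rejecting unknown fields so
// that typos in a shared config file surface as errors instead of being
// silently ignored.
func ParseConfig(data string) (Config, error) {
	var cfg Config
	dec := json.NewDecoder(strings.NewReader(data))
	dec.DisallowUnknownFields()
	if err := dec.Decode(&cfg); err != nil {
		return Config{}, err
	}
	if cfg.NumBlocks <= 0 {
		return Config{}, fmt.Errorf("num_blocks must be positive, got %d", cfg.NumBlocks)
	}
	return cfg, nil
}

func main() {
	cfg, err := ParseConfig(`{"seed": 42, "num_blocks": 500, "weights": {"bank/send": 70}}`)
	fmt.Printf("%+v %v\n", cfg, err)
}
```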
+
+### Execution
+
+As alluded to previously, after the execution of each block, the manager will
+generate a series of pseudo-random transactions and attempt to insert them into
+the mempool via `BaseApp#CheckTx`. During the ABCI lifecycle of a block, this
+mempool will be used to seed the transactions into a block proposal as it would
+in a real network. This allows us to not only test the state machine, but also
+test the ABCI lifecycle of a block.
+
+Statistics, such as total blocks and total failed proposals, will be collected,
+logged and written to output after the full or partial execution of a simulation.
+The output destination of these statistics will be configurable.
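
One way to picture the statistics described above is a small counter struct that can report to any `io.Writer`, which keeps the output destination configurable. The `Stats` type, its fields, and the JSON encoding are assumptions for this sketch. It also assumes a failed proposal still counts toward the block total.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"os"
)

// Stats is a hypothetical container for the counters named in the text.
type Stats struct {
	TotalBlocks     int `json:"total_blocks"`
	FailedProposals int `json:"failed_proposals"`
}

// RecordBlock counts one simulated block. In this sketch a block whose
// proposal failed still counts toward the total.
func (s *Stats) RecordBlock(proposalFailed bool) {
	s.TotalBlocks++
	if proposalFailed {
		s.FailedProposals++
	}
}

// JSON renders the statistics as a compact JSON object.
func (s *Stats) JSON() (string, error) {
	bz, err := json.Marshal(s)
	return string(bz), err
}

// Report writes the statistics to w, which could be stdout, a file, or any
// other sink selected by configuration.
func (s *Stats) Report(w io.Writer) error {
	out, err := s.JSON()
	if err != nil {
		return err
	}
	_, err = fmt.Fprintln(w, out)
	return err
}

func main() {
	var stats Stats
	for _, failed := range []bool{false, true, false} {
		stats.RecordBlock(failed)
	}
	if err := stats.Report(os.Stdout); err != nil {
		fmt.Fprintln(os.Stderr, err)
	}
}
```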
+
+```go
+func (s *Manager) SimulateBlock() {
+  rProposer := s.SelectRandomProposer()
+  rTxs := s.SelectTxs()
+
+  prepareResp, err := s.app.PrepareProposal(&abci.RequestPrepareProposal{Txs: rTxs})
+  // handle error
+
+  processResp, err := s.app.ProcessProposal(&abci.RequestProcessProposal{
+    Txs: prepareResp.Txs,
+    // ...
+  })
+  // handle error
+
+  // execute liveness matrix...
+
+  _, err = s.app.FinalizeBlock(...)
+  // handle error
+  
+  _, err = s.app.Commit(...)
+  // handle error
+}
+```
+
+Note, some applications do not define or need their own app-side mempool, so we
+propose that `SelectTxs` mimic CometBFT and just return FIFO-ordered transactions
+from an ad-hoc simulator mempool. In the case where an application does define
+its own mempool, it will simply ignore what is provided in `RequestPrepareProposal`.
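
A FIFO-ordered ad-hoc mempool of the kind described above could look roughly like the sketch below. `FIFOMempool`, its method set, and the `maxBytes` limit are assumptions for illustration. The reaping rule is deliberately simplified and is not a copy of CometBFT's mempool logic.

```go
package main

import "fmt"

// FIFOMempool is a hypothetical ad-hoc simulator mempool that keeps
// transactions in insertion order.
type FIFOMempool struct {
	txs [][]byte
}

// Insert appends a copy of tx to the end of the queue.
func (m *FIFOMempool) Insert(tx []byte) {
	m.txs = append(m.txs, append([]byte(nil), tx...))
}

// SelectTxs returns the longest prefix of pending transactions whose combined
// size fits in maxBytes. A negative maxBytes disables the limit. Selection
// stops at the first transaction that does not fit, so order is preserved.
func (m *FIFOMempool) SelectTxs(maxBytes int) [][]byte {
	var (
		selected [][]byte
		size     int
	)
	for _, tx := range m.txs {
		if maxBytes >= 0 && size+len(tx) > maxBytes {
			break
		}
		selected = append(selected, tx)
		size += len(tx)
	}
	return selected
}

// RemoveCommitted drops the first n transactions, e.g. after a block that
// included them has been committed.
func (m *FIFOMempool) RemoveCommitted(n int) {
	if n <= 0 {
		return
	}
	if n > len(m.txs) {
		n = len(m.txs)
	}
	m.txs = m.txs[n:]
}

func main() {
	m := &FIFOMempool{}
	for _, tx := range []string{"tx-a", "tx-bb", "tx-ccc"} {
		m.Insert([]byte(tx))
	}
	for _, tx := range m.SelectTxs(-1) {
		fmt.Println(string(tx))
	}
}
```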
+
+### Profiling
+
+The manager will be responsible for collecting CPU and RAM profiles of the simulation
+execution. We propose to use [Pyroscope](https://pyroscope.io/docs/golang/) to
+capture profiles and export them to a local file and via an HTTP endpoint. This
+can be disabled or enabled by configuration.
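
The sketch below shows the general shape of capturing CPU and heap profiles around a simulation run. It deliberately uses Go's standard `runtime/pprof` package, not Pyroscope, so that no Pyroscope API is assumed. `ProfileRun` and `ProfileSizes` are hypothetical helpers.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"runtime/pprof"
)

// ProfileRun executes fn while recording a CPU profile into cpu, then writes
// a heap profile into heap. Both writers could be files or network sinks.
func ProfileRun(cpu, heap io.Writer, fn func()) error {
	if err := pprof.StartCPUProfile(cpu); err != nil {
		return err
	}
	fn()
	pprof.StopCPUProfile() // flushes the CPU profile to cpu
	return pprof.WriteHeapProfile(heap)
}

// ProfileSizes runs fn under ProfileRun with in-memory buffers and reports
// how many bytes each profile produced.
func ProfileSizes(fn func()) (cpuBytes, heapBytes int, err error) {
	var cpu, heap bytes.Buffer
	err = ProfileRun(&cpu, &heap, fn)
	return cpu.Len(), heap.Len(), err
}

func main() {
	cpu, heap, err := ProfileSizes(func() {
		sum := 0
		for i := 0; i < 10_000_000; i++ {
			sum += i
		}
		_ = sum
	})
	fmt.Println(cpu, heap, err)
}
```

A configuration flag could simply decide whether the block-execution loop is wrapped in such a helper or called directly.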
+
+### Breakpoints
+
+Via configuration, a caller can express a height-based breakpoint that will allow
+the simulation to stop and resume from a given height. This will allow for debugging
+of CPU, RAM, and state.
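
A height-based breakpoint can be sketched as a loop that pauses once the configured height has been committed and continues when called again. The `Run` type and `Advance` method are hypothetical, and the callback stands in for simulating one full block.

```go
package main

import "fmt"

// Run tracks the progress of a hypothetical simulation.
type Run struct {
	Height int64 // last committed height
	End    int64 // final height of the simulation
}

// Advance simulates blocks until either the breakpoint height has just been
// committed (paused == true) or the run is complete. Calling Advance again
// after a pause resumes from the next height. A breakpoint of 0 never fires.
func (r *Run) Advance(breakpoint int64, simulateBlock func(height int64) error) (paused bool, err error) {
	for r.Height < r.End {
		if err := simulateBlock(r.Height + 1); err != nil {
			return false, err
		}
		r.Height++
		if r.Height == breakpoint {
			return true, nil
		}
	}
	return false, nil
}

func main() {
	run := &Run{End: 10}
	noop := func(int64) error { return nil }

	paused, _ := run.Advance(4, noop)
	fmt.Println(paused, run.Height) // paused at the breakpoint

	paused, _ = run.Advance(4, noop)
	fmt.Println(paused, run.Height) // resumed and ran to completion
}
```

While paused, a caller could inspect state or capture a profile before resuming.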
+
+### Validity Predicates
+
+We propose to allow an application to supply the simulator with a set of
+validity predicates, i.e. invariant checkers, that will be executed before
+and after each block. This will allow the application to assert that certain
+state invariants hold before and after each block. Note, as a consequence of
+this, we propose to remove the existing notion of invariants from module production
+execution paths and deprecate their usage altogether.
+
+```go
+type Manager struct {
+  // ...
+  preBlockValidator   func(sdk.Context) error
+  postBlockValidator  func(sdk.Context) error
+}
+```
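
As an illustration of how several validity predicates might be combined into the single function each hook expects, the sketch below uses a toy `State` in place of `sdk.Context`. The `Predicate` type, the `All` combinator, and both example predicates are assumptions for this example only.

```go
package main

import "fmt"

// State is a toy stand-in for the application state a predicate would read
// through sdk.Context.
type State struct {
	Balances map[string]int64
	Supply   int64
}

// Predicate reports an error when a state invariant does not hold.
type Predicate func(State) error

// All combines predicates into one that returns the first failure, so it can
// be plugged in wherever a single validator function is expected.
func All(preds ...Predicate) Predicate {
	return func(s State) error {
		for _, p := range preds {
			if err := p(s); err != nil {
				return err
			}
		}
		return nil
	}
}

// NonNegativeBalances checks that no account holds a negative balance.
func NonNegativeBalances(s State) error {
	for addr, bal := range s.Balances {
		if bal < 0 {
			return fmt.Errorf("negative balance for %s: %d", addr, bal)
		}
	}
	return nil
}

// SupplyMatchesBalances checks that the total supply equals the sum of all
// account balances.
func SupplyMatchesBalances(s State) error {
	var sum int64
	for _, bal := range s.Balances {
		sum += bal
	}
	if sum != s.Supply {
		return fmt.Errorf("supply %d does not match sum of balances %d", s.Supply, sum)
	}
	return nil
}

func main() {
	check := All(NonNegativeBalances, SupplyMatchesBalances)
	good := State{Balances: map[string]int64{"alice": 60, "bob": 40}, Supply: 100}
	bad := State{Balances: map[string]int64{"alice": 60, "bob": 40}, Supply: 90}
	fmt.Println(check(good))
	fmt.Println(check(bad))
}
```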
+
+## Consequences
+
+### Backwards Compatibility
+
+The new simulator package will naturally not be backwards compatible with the
+existing `x/simulation` module. However, modules will still be responsible for
+providing pseudo-random transactions to the simulator.
+
+### Positive
+
+* Providing more intuitive and cleaner APIs for application developers
+* More closely resembling the true lifecycle of an ABCI application
+
+### Negative
+
+* Breaking current Cosmos SDK module APIs for transaction generation
+
+## References
+
+* [Osmosis Simulation ADR](https://github.com/osmosis-labs/osmosis/blob/main/simulation/ADR.md)