laconicd/docs/PATTERNS.md
Prathamesh Musale eb9aa00816 docs: guide agents/humans toward Naming API for mutable logical identity
Applications built on laconicd repeatedly pick the Record + attribute-
scan pattern for data that has mutable logical identity (game state,
user profiles, inventories). The query cost then grows linearly with
per-entity version count, and the fix reached for (cache, pagination,
compaction) hides the design mistake rather than correcting it. AI
agents hit this failure mode especially reliably: they pattern-match
the existing client's queryRecords usage as canon and never discover
the Naming API.

This commit adds four pieces of documentation that surface the right
primitive *before* a wrong one is committed to:

  * docs/PATTERNS.md — primitive decision tree with concrete
    anti-patterns (queryRecords-as-KV, compaction-for-latency,
    cache-the-slow-query) and a worked example (mutable game state).
  * AGENTS.md — explicit for AI coding agents; names the failure mode,
    lists six rules, and tells agents to read PATTERNS.md before
    writing their first queryRecords call.
  * gql/cerc-io/laconicd/schema.graphql — prescriptive comments on
    queryRecords (warning against mutable-identity usage),
    getRecordsByIds (point-lookup clarification), and the Naming
    API section (USE THIS for current-state lookups).
  * README.md — new 'Designing state that lives on laconicd' section
    between Usage and Tests, linking both new docs.
2026-04-24 16:04:31 +00:00

# Laconicd design patterns
**Read this before designing any state that lives on laconicd.**
Not after. Writing a client library, a service that persists user
data, or a migration from another store — start here.
This document exists because the wrong primitive choice in an
application is expensive to unwind after production traffic
accumulates. The chain is append-only; mistakes persist as
on-chain garbage you cannot delete retroactively. Most of the
architectural regret in the laconicd ecosystem so far has come
from picking `queryRecords` as the default read path when a
different primitive was the right tool.
## The primitives, honestly
Laconicd exposes three distinct storage and lookup primitives. They
look similar at first glance but have VERY different semantics.

| Primitive | Identity | Mutability | Lookup cost | Audit trail |
|---|---|---|---|---|
| **Record** (`setRecord` / `queryRecords` / `getRecordsByIds`) | Content hash (CID) | Immutable. Every change is a NEW record with a new CID. | Attribute-indexed, but returns EVERY version matching | Implicit — all versions persist forever |
| **Name** (`setName` / `lookupNames` / `resolveNames`) | Caller-chosen string (`"mtm/lootboxes/<wallet>/<id>"`) | Pointer. Re-setting updates which CID the name resolves to. | Direct name → latest CID lookup (single record) | Explicit — `NameRecord.history` returns prior bindings with block heights |
| **Authority / Bond / Auction** | Purpose-specific | Varies | Purpose-specific | Varies |

**The trap:** `Record` looks like the universal "put this thing on
chain" primitive because its API surface is biggest and most
obvious. If you treat it as such, every mutable-state use case
(game state, user profiles, inventories, counters) ends up
re-implementing the Naming API badly: appending version after
version, then filtering client-side for "the latest."
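
The difference between the two semantics can be seen in a small in-memory model. This is a sketch only: `ToyRegistry`, its method names, the fake block height, and the SHA-256 hex digest used as a "CID" are illustrative assumptions, not laconicd's client API or its real CID format.

```python
import hashlib
import json


class ToyRegistry:
    """In-memory stand-in for Record and Name (hypothetical, not laconicd's API)."""

    def __init__(self):
        self.records = {}   # cid -> content; existing entries are never changed
        self.names = {}     # name -> list of (height, cid); last entry is current
        self.height = 0     # fake block height, bumped on every write

    def set_record(self, content: dict) -> str:
        # Identity is derived from content, so changed content yields a new id.
        payload = json.dumps(content, sort_keys=True).encode()
        cid = hashlib.sha256(payload).hexdigest()
        self.height += 1
        self.records[cid] = content
        return cid

    def set_name(self, name: str, cid: str) -> None:
        # Re-setting a name appends a binding; older bindings remain as history.
        self.height += 1
        self.names.setdefault(name, []).append((self.height, cid))

    def resolve_name(self, name: str) -> dict:
        # Follow the pointer to the single current record.
        _, cid = self.names[name][-1]
        return self.records[cid]


reg = ToyRegistry()
v1 = reg.set_record({"hp": 10})
reg.set_name("my-app/user-state/alice", v1)
v2 = reg.set_record({"hp": 7})
reg.set_name("my-app/user-state/alice", v2)
```

After the second write both records still exist under different ids, while the name resolves only to the newer one and keeps the older binding in its history.
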
## Decision tree
```
Does the thing you're storing have a mutable logical identity?
(i.e., "the current state of thing X" is a meaningful question)
├── YES → Use the Naming API.
│         setName("mtm/lootboxes/<wallet>/<box_id>", cid) on every write.
│         lookupNames(...) or resolveNames(...) returns the one
│         current record. Audit trail via NameRecord.history.
└── NO  → Is it a point-lookup by CID?
          ├── YES → getRecordsByIds([cid])
          └── NO  → It's an append-only event stream with
                    queryable attributes. Use setRecord +
                    queryRecords. Expect every historical record
                    to come back; plan pagination accordingly.
```
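
The same tree can be written as a tiny helper, which is sometimes handy in design reviews or lint-style checks. The function name and the returned labels are hypothetical; only the API names inside the strings come from this document.

```python
def choose_primitive(has_mutable_identity: bool, lookup_by_cid: bool) -> str:
    """Mirror the decision tree above; returns a label, performs no I/O."""
    if has_mutable_identity:
        # "The current state of thing X" is a meaningful question.
        return "Naming API (setName + lookupNames/resolveNames)"
    if lookup_by_cid:
        # Immutable content fetched by its id.
        return "getRecordsByIds"
    # Append-only event stream with queryable attributes.
    return "setRecord + queryRecords (paginate)"


user_profile = choose_primitive(has_mutable_identity=True, lookup_by_cid=False)
event_log = choose_primitive(has_mutable_identity=False, lookup_by_cid=False)
```

Note that the first question wins: a thing with mutable identity goes to the Naming API even if you also happen to know a CID for one of its versions.
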
## Anti-pattern: "queryRecords-as-key-value-store"
This is the mistake real laconicd codebases keep making. It looks
like this:
```python
# BAD: reimplementing names via attribute scan
async def save_state(user_id: str, state: dict):
    await set_record(
        record_type="UserState",
        attributes={"user_id": user_id},  # logical ID as attribute
        data=state,
    )


async def get_state(user_id: str) -> dict:
    records = await query_records(
        record_type="UserState",
        attributes={"user_id": user_id},
    )
    # Client-side "pick the latest": a design smell
    records.sort(key=lambda r: r["created_at"], reverse=True)
    return records[0]
```
Why it's wrong:
1. Every call to `save_state` writes a NEW content-addressed record.
   Old records stay on chain forever.
2. `query_records` returns all of them every time, so the fetch cost
   of a single read scales linearly with save count.
3. The client-side `sort + [0]` means "I wanted the latest", which is
   exactly what `lookupNames` does at the index level.
4. Latency degrades silently as users accumulate history. The
   query eventually breaches whatever timeout the caller has.
The correct pattern:
```python
# GOOD — names for mutable identity
async def save_state(user_id: str, state: dict):
cid = await set_record(
record_type="UserState",
attributes={"user_id": user_id},
data=state,
)
await set_name(
name=f"my-app/user-state/{user_id}",
cid=cid,
)
async def get_state(user_id: str) -> dict:
record = await resolve_name(f"my-app/user-state/{user_id}")
return record
```
Properties of the correct pattern:
- One name → one current CID. Latency doesn't scale with save count.
- Historical versions still exist on chain (immutable Record CIDs
aren't deleted), and are reachable via `NameRecord.history`.
- Audit trail is explicit rather than implicit — you can ask "what
was the state at block N?" rather than scanning every version.
- Client code has no "pick the latest" step; the chain already
answered that.
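
To make the scaling claim concrete, here is a toy comparison of how many rows each read path touches after 100 saves. It is an in-memory sketch under stated assumptions: the `versions` list stands in for the record index, the `current` dict stands in for name bindings, and none of the function names are laconicd APIs.

```python
versions = []   # every saved version, as an attribute scan would see them
current = {}    # name -> position of the current version (the "pointer")


def save(user_id: str, state: dict) -> None:
    versions.append({"user_id": user_id, "state": state})
    current[f"my-app/user-state/{user_id}"] = len(versions) - 1


def read_via_scan(user_id: str):
    # Attribute scan: every historical version matches and is fetched.
    matches = [v for v in versions if v["user_id"] == user_id]
    return matches[-1]["state"], len(matches)


def read_via_name(user_id: str):
    # Name resolution: one pointer hop, one row, regardless of history.
    idx = current[f"my-app/user-state/{user_id}"]
    return versions[idx]["state"], 1


for hp in range(100, 0, -1):
    save("alice", {"hp": hp})

scan_state, scan_rows = read_via_scan("alice")
name_state, name_rows = read_via_name("alice")
```

Both paths return the same state, but the scan touched 100 rows to get it and the name lookup touched one.
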
## Anti-pattern: "compaction via `delete_records`"
The `delete_records` primitive exists and is tempting to use for
"prune old versions of logical entity X to keep queries fast." Do
not use it for that unless you have signed off on dropping the
audit trail.
- `delete_records` physically removes a CID from the index.
- If your design relies on every version being recoverable (audit
trails, chain-of-custody, dispute resolution), compaction
destroys exactly that property.
- If your design doesn't need every version, you were never in
the Record + attribute-scan pattern to begin with — you wanted
the Naming API from the start.
`delete_records` is appropriate for **user-intent-driven
permanent deletion** (unregistering a device, unfollowing,
revoking a credential). Not for database-compaction-style
version pruning.
## Anti-pattern: "cache the slow query"
Caching a `queryRecords` result (TTL, LRU, whatever) at the client
layer is attractive because it appears to fix latency without
touching storage design. It doesn't — it hides the fact that you
chose the wrong primitive. The first read per cache window still
pays the growing-with-history cost, and the design drifts further
from correct as the caching surface calcifies.
Cache is acceptable for genuinely external-service latency (price
oracles, Solana RPC balance checks). It is not acceptable as a
substitute for using the Naming API.
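
A toy model of the "first read per cache window" cost, assuming a simple TTL cache in front of an attribute scan. `CachedScan` and its fields are hypothetical, and time is passed in explicitly so the behaviour is deterministic.

```python
class CachedScan:
    """TTL cache in front of a toy attribute scan (illustrative names only)."""

    def __init__(self, versions: list, ttl: float):
        self.versions = versions    # shared list of all saved versions
        self.ttl = ttl
        self.cached_at = None
        self.cached_value = None
        self.rows_fetched = 0       # total rows pulled from the backing store

    def latest(self, now: float):
        fresh = self.cached_at is not None and now - self.cached_at < self.ttl
        if not fresh:
            # Cache miss: the scan still returns every version ever written.
            self.rows_fetched += len(self.versions)
            self.cached_value = self.versions[-1]
            self.cached_at = now
        return self.cached_value


versions = [{"hp": n} for n in range(10)]
cache = CachedScan(versions, ttl=30.0)
cache.latest(now=0.0)     # miss: fetches 10 rows
cache.latest(now=5.0)     # hit: fetches nothing
versions.extend({"hp": n} for n in range(10, 1000))
cache.latest(now=60.0)    # miss after expiry: fetches 1000 rows
```

The hit in the middle is cheap, but each miss costs as much as the full history at that moment, so the worst-case read keeps growing.
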
## Pagination (`limit` / `offset`)
`queryRecords` accepts `limit` and `offset`. Use them when you
genuinely want an append-only event stream read in offset-based
pages (e.g., "the last 100 events matching filter X"). Do NOT
use pagination to patch over "my mutable-identity query is slow":
pagination caps how many records one request fetches, but it does
not reduce the number of versions stored per entity.
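
The effect of `limit` / `offset` can be sketched over an in-memory list. The `page` helper and the event shape are assumptions for illustration; the ordering that `queryRecords` applies is not specified here, so the sketch simply uses insertion order.

```python
import math


def page(events: list, limit: int, offset: int = 0) -> list:
    """Return at most `limit` events starting at `offset`."""
    return events[offset : offset + limit]


events = [{"seq": n, "kind": "BoxOpened"} for n in range(250)]

first = page(events, limit=100)
third = page(events, limit=100, offset=200)

# Paging bounds each request, not the total: walking the whole stream
# still takes ceil(total / limit) requests.
requests_needed = math.ceil(len(events) / 100)
```

Each request stays small, but the number of requests needed to see everything still grows with the number of stored versions.
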
## Mutations go through Cosmos tx, not GraphQL
Writes (`setRecord`, `setName`, `deleteRecord`, `reserveAuthority`,
etc.) are Cosmos SDK `Msg`s, submitted via Tendermint RPC. The
GraphQL endpoint is read-only. Client libraries usually wrap this
behind a "registry writer" sidecar or equivalent — check how your
SDK / service is structured before wiring writes.
## Worked example: mutable game state
A game records a user's loot box, which transitions through
`PENDING → ACTIVE → RESOLVED`.
Correct shape:
- **Record writes** (one per state transition; each gets a new CID):
  - `setRecord(type="LootBox", attributes={wallet, id, status}, data={...})`
  - Preserves complete history as immutable records. Needed for
    audit: "what did this box look like when the user picked?"
- **Name write** (on every transition, pointing at the new CID):
  - `setName(name=f"mygame/lootboxes/{wallet}/{box_id}", cid=<new_cid>)`
- **Read: current state of one box**:
  - `resolveNames([f"mygame/lootboxes/{wallet}/{box_id}"])` → one record.
- **Read: all current boxes for a wallet**:
  - Enumerate names by prefix, resolve each. Or maintain an
    index name `mygame/lootboxes/{wallet}/_index` whose target
    record lists the active box ids.
- **Read: audit trail of one box**:
  - `lookupNames([f"mygame/lootboxes/{wallet}/{box_id}"])` returns
    `NameRecord.history`, one entry per state transition, each with
    its block height. Resolve each entry's CID to get the full
    historical state.
This shape works with any number of boxes per wallet. Per-wallet
latency is bounded by active-box count, not lifetime
transition-count.
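
For the audit-trail read, answering "which version was current at block N?" is a search over the name's history. The sketch below assumes the history has already been fetched and flattened into `(block_height, cid)` pairs sorted by ascending height; that shape, the sample heights, and the placeholder CIDs are all hypothetical.

```python
from bisect import bisect_right

# Hypothetical history for one loot box: one binding per state transition.
history = [
    (120, "cid-pending"),
    (134, "cid-active"),
    (201, "cid-resolved"),
]


def cid_at_height(history: list, height: int):
    """Return the CID the name pointed at as of `height`, or None if unbound."""
    heights = [h for h, _ in history]
    # Index of the first binding made strictly after `height`.
    i = bisect_right(heights, height)
    return history[i - 1][1] if i else None


during_play = cid_at_height(history, 150)
before_creation = cid_at_height(history, 119)
```

The returned CID can then be fetched with a point lookup to recover the full historical state.
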
## When you find an existing codebase doing it wrong
Appending a fix on top of the wrong primitive is usually the
wrong move. Options in order of preference:
1. **Migrate to Naming API.** The data already on chain as Records
is still valid — you just need to start writing names and
reading via names. Existing consumers of the attribute-scan
path can keep working until they're moved.
2. **Document the debt.** If the migration is deferred, write an
ADR saying so and naming the specific class of regression
this creates (latency as history accumulates).
3. **Do NOT reach for caches or compaction first.** Those hide
the design mistake and make the eventual migration harder.
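
The first step of option 1, deciding which existing record each new name should point at, can be planned offline. This is a minimal sketch assuming the old records have already been exported as dicts with `cid`, `user_id`, and `created_at` fields; those field names, the `plan_name_backfill` helper, and the sample data are assumptions, and the actual name writes would go through whatever writer your service uses.

```python
def plan_name_backfill(records: list, prefix: str) -> dict:
    """Map each logical id to the CID of its newest existing record."""
    latest = {}
    for rec in records:
        uid = rec["user_id"]
        # Keep only the most recent version seen so far for this id.
        if uid not in latest or rec["created_at"] > latest[uid]["created_at"]:
            latest[uid] = rec
    return {f"{prefix}/{uid}": rec["cid"] for uid, rec in latest.items()}


exported = [
    {"cid": "cid-a1", "user_id": "alice", "created_at": 10},
    {"cid": "cid-b1", "user_id": "bob", "created_at": 11},
    {"cid": "cid-a2", "user_id": "alice", "created_at": 25},
]
plan = plan_name_backfill(exported, prefix="my-app/user-state")
```

Because the plan only reads existing records, attribute-scan consumers keep working while names are written and readers are moved over one at a time.
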
## See also
- `AGENTS.md` (repo root) — why AI agents particularly mis-pick
primitives here, and what context they need.
- `gql/cerc-io/laconicd/schema.graphql` — inline anti-pattern
warnings on `queryRecords` and `getRecordsByIds`.