From eb9aa008161f31819208dbd50506464cc1b4538d Mon Sep 17 00:00:00 2001
From: Prathamesh Musale
Date: Fri, 24 Apr 2026 16:04:31 +0000
Subject: [PATCH] docs: guide agents/humans toward Naming API for mutable
 logical identity
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Applications built on laconicd repeatedly pick the Record +
attribute-scan pattern for data that has mutable logical identity
(game state, user profiles, inventories). The query cost then grows
linearly with per-entity version count, and the fixes reached for
(caches, pagination, compaction) hide the design mistake rather than
correcting it. AI agents hit this failure mode especially reliably:
they pattern-match the existing client's queryRecords usage as canon
and never discover the Naming API.

This commit adds four pieces of documentation that surface the right
primitive *before* a wrong one is committed to:

* docs/PATTERNS.md — primitive decision tree with concrete
  anti-patterns (queryRecords-as-KV, compaction-for-latency,
  cache-the-slow-query) and a worked example (mutable game state).
* AGENTS.md — written explicitly for AI coding agents; names the
  failure mode, lists six rules, and tells agents to read PATTERNS.md
  before writing their first queryRecords call.
* gql/cerc-io/laconicd/schema.graphql — prescriptive comments on
  queryRecords (warning against mutable-identity usage),
  getRecordsByIds (point-lookup clarification), and the Naming API
  section (USE THIS for current-state lookups).
* README.md — new 'Designing state that lives on laconicd' section
  between Usage and Tests, linking both new docs.
--- AGENTS.md | 127 ++++++++++++++++ README.md | 20 +++ docs/PATTERNS.md | 220 ++++++++++++++++++++++++++++ gql/cerc-io/laconicd/schema.graphql | 41 +++++- 4 files changed, 406 insertions(+), 2 deletions(-) create mode 100644 AGENTS.md create mode 100644 docs/PATTERNS.md diff --git a/AGENTS.md b/AGENTS.md new file mode 100644 index 000000000..66aa94673 --- /dev/null +++ b/AGENTS.md @@ -0,0 +1,127 @@ +# Agents: read this before touching laconicd-backed state + +Written for AI coding agents (Claude, Codex, Copilot, etc.) working +on laconicd itself or on applications that persist data through +it. Humans can read it too — the content is the same — but the +tone addresses a specific failure mode that agents exhibit more +consistently than humans. + +## The failure mode this document exists to prevent + +Agents extending an existing laconicd-backed codebase pattern-match +the visible API calls and treat them as canonical. If a codebase +uses `queryRecords` with attribute filtering for everything — +including where it's wrong — an agent will extend that usage +rather than discover the Naming API. + +When the wrong pattern eventually scales badly (latency grows with +accumulated record versions), agents reach for **workarounds +within the chosen pattern** — pagination, caches, compaction — +rather than re-evaluating the primitive choice. That drift +compounds: each workaround entrenches the wrong design. + +This has happened repeatedly on real projects. The architectural +cost is high, because laconicd is append-only: wrong versions +persist on chain as permanent garbage that you can't silently +delete. + +## Concrete rules for agents + +### 1. Before designing any laconicd-backed state, read `docs/PATTERNS.md` + +Not after a first implementation. Not when you hit a scaling +wall. **Before.** The decision tree there is the only reliable +way to pick the right primitive. + +### 2. 
Do not pattern-match from existing client code as if it were canon + +A gateway, SDK, or service that already uses laconicd is an +existing code sample. It is not evidence that the patterns it +uses are correct. Check: + +- Does the client use `setName` / `lookupNames` / `resolveNames` + at all? If not, the chances it's modelling mutable identity + with `queryRecords`+attribute scan are high — that's almost + always a latent latency bug. +- Does `save_*` in the client write only a Record, never a name? + If your new feature has the same shape, you're about to extend + the bug. +- Does the client have "pick the latest" / "sort desc and take + first" logic after a `queryRecords` call? That is exactly what + `resolveName` does at the chain level — the client is + re-implementing it badly. + +### 3. Questions an agent should ask itself before writing `queryRecords` + +- "Am I looking up the current state of a logical entity?" If yes, + you want the Naming API, not `queryRecords`. +- "Will this query return more records as the same entity gets + modified over time?" If yes, you want the Naming API. +- "Am I about to write client-side code that picks the latest of + the returned list?" If yes, you want the Naming API. + +If any answer is yes and you still want `queryRecords`, stop and +write a comment explaining why the Naming API doesn't fit. If you +can't, the Naming API does fit. + +### 4. Do not propose caches, pagination, or compaction as the first fix for slow reads + +Slow `queryRecords` on a mutable-identity use case is a design +smell, not a performance problem. The fixes agents reach for +first are all wrong: + +- **TTL caches** hide the latency rather than fix it. First read + per window still pays the full cost. +- **`limit` / `offset` pagination** caps per-request fetch cost + but doesn't reduce the version-per-entity multiplier. +- **`delete_records` compaction** destroys the audit trail that + the immutable-Record design was supposed to provide. 
+ +If you're reaching for these, you are almost certainly on the +wrong side of the decision tree in `docs/PATTERNS.md`. Go back +and re-pick the primitive. + +### 5. If you find a codebase already on the wrong side + +Do not compound the mistake. Options in order of preference: + +1. Migrate the specific feature you're touching to the Naming + API. You can usually do this incrementally — start writing + names on new writes, keep reading via attribute-scan until + the name index is populated, then flip the read path. +2. Write an ADR documenting the debt and which class of + regression will appear (latency grows with history; first + timeout hit around ~N records). +3. Ship the feature on the Naming API even if the rest of the + codebase doesn't — a single correct example in the codebase + is worth more than consistency with a wrong pattern. + +### 6. Read laconicd's schema and keeper source, not just the client + +Before proposing a fix that involves "laconicd doesn't support +X": + +- `gql/cerc-io/laconicd/schema.graphql` — the actual read API. +- `x/registry/keeper/` — the write and index path. +- `proto/cerc/registry/v1/tx.proto` — the mutation messages. + +Many "laconicd limitations" agents assume are actually client +limitations. `queryRecords` supports `limit` and `offset`; +`setName` and `lookupNames` exist and work; `NameRecord.history` +returns block-height-tagged prior bindings. Verify capabilities +against source before recommending workarounds or upstream +changes. + +## Why this is in its own file + +A `PATTERNS.md` on its own is a reference document humans browse. +An `AGENTS.md` is loaded into context and read top-to-bottom at +the start of agent sessions. Calling out the failure mode +explicitly — "agents tend to overlook the Naming API" — is load- +bearing because the people most likely to hit the failure are +the ones who wouldn't naturally pattern-match to a document +titled "Patterns." 
+ +If you are reading this and you are about to write your first +`queryRecords` call: stop. Read `docs/PATTERNS.md`. Then come +back. diff --git a/README.md b/README.md index 279aca924..ff0ae48c0 100644 --- a/README.md +++ b/README.md @@ -29,6 +29,26 @@ Run with a single node fixture: See [lockup.md](./lockup.md) for lockup account usage. +## Designing state that lives on laconicd + +**Before** writing a client library, service, or migration that +persists data through laconicd, read: + +- [`docs/PATTERNS.md`](./docs/PATTERNS.md) — the primitive + decision tree. Explains when to use the Naming API + (`setName` / `lookupNames` / `resolveNames`) vs. the Record + API (`setRecord` / `queryRecords` / `getRecordsByIds`), with + worked examples and the common anti-patterns. +- [`AGENTS.md`](./AGENTS.md) — written for AI coding agents, but + the failure mode it describes (extending the wrong primitive + rather than re-evaluating it) applies to human contributors + under deadline pressure too. + +Picking the wrong primitive is expensive to unwind: laconicd is +append-only, so mistakes persist as on-chain data you cannot +silently remove. Spend the 10 minutes to read the pattern guide +before the first commit. + ## Tests Run tests: diff --git a/docs/PATTERNS.md b/docs/PATTERNS.md new file mode 100644 index 000000000..6a44e2aa9 --- /dev/null +++ b/docs/PATTERNS.md @@ -0,0 +1,220 @@ +# Laconicd design patterns + +**Read this before designing any state that lives on laconicd.** +Not after. Writing a client library, a service that persists user +data, or a migration from another store — start here. + +This document exists because the wrong primitive choice in an +application is expensive to unwind after production traffic +accumulates. The chain is append-only; mistakes persist as +on-chain garbage you cannot delete retroactively. 
Most of the
+architectural regret in the laconicd ecosystem so far has come
+from picking `queryRecords` as the default read path when a
+different primitive was the right tool.
+
+## The primitives, honestly
+
+Laconicd exposes three distinct storage and lookup primitives. They
+look similar at first glance but have VERY different semantics.
+
+| Primitive | Identity | Mutability | Lookup cost | Audit trail |
+|---|---|---|---|---|
+| **Record** (`setRecord` / `queryRecords` / `getRecordsByIds`) | Content hash (CID) | Immutable. Every change is a NEW record with a new CID. | Attribute-indexed, but returns EVERY version matching | Implicit — all versions persist forever |
+| **Name** (`setName` / `lookupNames` / `resolveNames`) | Caller-chosen string (`"mtm/lootboxes/<wallet>/<box-id>"`) | Pointer. Re-setting updates which CID the name resolves to. | Direct name → latest CID lookup (single record) | Explicit — `NameRecord.history` returns prior bindings with block heights |
+| **Authority / Bond / Auction** | Purpose-specific | Varies | Purpose-specific | Varies |
+
+**The trap:** `Record` looks like the universal "put this thing on
+chain" primitive because its API surface is the biggest and most
+obvious. If you treat it as such, every mutable-state use case
+(game state, user profiles, inventories, counters) ends up
+reimplementing the Naming API badly — appending version after
+version, then filtering client-side for "the latest."
+
+## Decision tree
+
+```
+Does the thing you're storing have a mutable logical identity?
+(i.e., "the current state of thing X" is a meaningful question)
+
+├── YES → Use the Naming API.
+│         setName("mtm/lootboxes/<wallet>/<box-id>", cid) on every write.
+│         lookupNames(...) or resolveNames(...) returns the one
+│         current record. Audit trail via NameRecord.history.
+│
+└── NO → Is it a point-lookup by CID?
+    │
+    ├── YES → getRecordsByIds([cid])
+    │
+    └── NO → It's an append-only event stream with
+             queryable attributes. Use setRecord +
+             queryRecords.
Expect every historical record + to come back; plan pagination accordingly. +``` + +## Anti-pattern: "queryRecords-as-key-value-store" + +This is the mistake real laconicd codebases keep making. It looks +like this: + +```python +# BAD — reimplementing names via attribute scan + +async def save_state(user_id: str, state: dict): + await set_record( + record_type="UserState", + attributes={"user_id": user_id}, # logical ID as attribute + data=state, + ) + +async def get_state(user_id: str) -> dict: + records = await query_records( + record_type="UserState", + attributes={"user_id": user_id}, + ) + # Client-side "pick the latest" — smell + records.sort(key=lambda r: r["created_at"], reverse=True) + return records[0] +``` + +Why it's wrong: + +1. Every call to `save_state` writes a NEW content-addressed record. + Old records stay on chain forever. +2. `query_records` returns all of them every time. Per-record fetch + cost scales linearly with save count. +3. The client-side `sort + [0]` is "I wanted the latest" — exactly + what `lookupNames` does at the index level. +4. Latency degrades silently as users accumulate history. The + query eventually breaches whatever timeout the caller has. + +The correct pattern: + +```python +# GOOD — names for mutable identity + +async def save_state(user_id: str, state: dict): + cid = await set_record( + record_type="UserState", + attributes={"user_id": user_id}, + data=state, + ) + await set_name( + name=f"my-app/user-state/{user_id}", + cid=cid, + ) + +async def get_state(user_id: str) -> dict: + record = await resolve_name(f"my-app/user-state/{user_id}") + return record +``` + +Properties of the correct pattern: + +- One name → one current CID. Latency doesn't scale with save count. +- Historical versions still exist on chain (immutable Record CIDs + aren't deleted), and are reachable via `NameRecord.history`. +- Audit trail is explicit rather than implicit — you can ask "what + was the state at block N?" 
rather than scanning every version. +- Client code has no "pick the latest" step; the chain already + answered that. + +## Anti-pattern: "compaction via `delete_records`" + +The `delete_records` primitive exists and is tempting to use for +"prune old versions of logical entity X to keep queries fast." Do +not use it for that unless you have signed off on dropping the +audit trail. + +- `delete_records` physically removes a CID from the index. +- If your design relies on every version being recoverable (audit + trails, chain-of-custody, dispute resolution), compaction + destroys exactly that property. +- If your design doesn't need every version, you were never in + the Record + attribute-scan pattern to begin with — you wanted + the Naming API from the start. + +`delete_records` is appropriate for **user-intent-driven +permanent deletion** (unregistering a device, unfollowing, +revoking a credential). Not for database-compaction-style +version pruning. + +## Anti-pattern: "cache the slow query" + +Caching a `queryRecords` result (TTL, LRU, whatever) at the client +layer is attractive because it appears to fix latency without +touching storage design. It doesn't — it hides the fact that you +chose the wrong primitive. The first read per cache window still +pays the growing-with-history cost, and the design drifts further +from correct as the caching surface calcifies. + +Cache is acceptable for genuinely external-service latency (price +oracles, Solana RPC balance checks). It is not acceptable as a +substitute for using the Naming API. + +## Pagination (`limit` / `offset`) + +`queryRecords` accepts `limit` and `offset`. Use them when you +genuinely want an append-only event stream with cursor-style +paging (e.g., "the last 100 events matching filter X"). Do NOT +use pagination to patch over "my mutable-identity query is slow" — +pagination doesn't reduce the per-entity version multiplier, +only the per-request fetch budget. 
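Where pagination genuinely fits — an append-only event stream read with a cursor — the loop is mechanical. A minimal runnable sketch; `query_records` below is a hypothetical stand-in for a real client call (stubbed with an in-memory list so the paging logic is self-contained), not laconicd's actual SDK surface:

```python
# Sketch: cursor-style paging over an append-only event stream.
# FAKE_CHAIN and query_records are stand-ins so the loop is runnable;
# a real client would hit the GraphQL queryRecords API instead.

FAKE_CHAIN = [
    {"type": "LootBoxEvent", "attributes": {"wallet": "w1"}, "seq": i}
    for i in range(250)
]

def query_records(record_type, attributes, limit, offset):
    # Stand-in for queryRecords: attribute conditions in a logical
    # AND, plus limit/offset windowing over the matching set.
    matching = [
        r for r in FAKE_CHAIN
        if r["type"] == record_type
        and all(r["attributes"].get(k) == v for k, v in attributes.items())
    ]
    return matching[offset:offset + limit]

def iter_events(record_type, attributes, page_size=100):
    """Yield every matching event, one bounded request at a time."""
    offset = 0
    while True:
        page = query_records(record_type, attributes,
                             limit=page_size, offset=offset)
        yield from page
        if len(page) < page_size:  # short page: no more records
            return
        offset += page_size

events = list(iter_events("LootBoxEvent", {"wallet": "w1"}))
print(len(events))  # all 250 events, fetched in three bounded requests
```

Note what this does and does not buy you: each request is bounded by `page_size`, but the total work still grows with history — which is exactly why this shape is acceptable only for genuine event streams, never as a patch for a mutable-identity lookup.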
+
+## Mutations go through Cosmos tx, not GraphQL
+
+Writes (`setRecord`, `setName`, `deleteRecord`, `reserveAuthority`,
+etc.) are Cosmos SDK `Msg`s, submitted via Tendermint RPC. The
+GraphQL endpoint is read-only. Client libraries usually wrap this
+behind a "registry writer" sidecar or equivalent — check how your
+SDK / service is structured before wiring writes.
+
+## Worked example: mutable game state
+
+The game records a user's loot box that transitions
+`PENDING → ACTIVE → RESOLVED`.
+
+Correct shape:
+
+- **Record writes** (one per state transition; each gets a new CID):
+  - `setRecord(type="LootBox", attributes={wallet, id, status}, data={...})`
+  - Preserves complete history as immutable records. Needed for
+    audit: "what did this box look like when the user picked?"
+- **Name write** (on every transition, pointing at the new CID):
+  - `setName(name=f"mygame/lootboxes/{wallet}/{box_id}", cid=<new-cid>)`
+- **Read: current state of one box**:
+  - `resolveName(f"mygame/lootboxes/{wallet}/{box_id}")` → one record.
+- **Read: all current boxes for a wallet**:
+  - Enumerate names by prefix, resolve each. Or maintain an
+    index name `mygame/lootboxes/{wallet}/_index` whose pointed-to
+    record lists active box ids.
+- **Read: audit trail of one box**:
+  - `lookupNames([f"mygame/lootboxes/{wallet}/{box_id}"])` returns
+    `NameRecord.history`, one entry per state transition, each with
+    its block height. Resolve each entry's CID to get the full
+    historical state.
+
+This shape works with any number of boxes per wallet. Per-wallet
+latency is bounded by active-box count, not lifetime
+transition count.
+
+## When you find an existing codebase doing it wrong
+
+Appending a fix on top of the wrong primitive is usually the
+wrong move. Options in order of preference:
+
+1. **Migrate to the Naming API.** The data already on chain as
+   Records is still valid — you just need to start writing names
+   and reading via names.
Existing consumers of the attribute-scan + path can keep working until they're moved. +2. **Document the debt.** If the migration is deferred, write an + ADR saying so and naming the specific class of regression + this creates (latency as history accumulates). +3. **Do NOT reach for caches or compaction first.** Those hide + the design mistake and make the eventual migration harder. + +## See also + +- `AGENTS.md` (repo root) — why AI agents particularly mis-pick + primitives here, and what context they need. +- `gql/cerc-io/laconicd/schema.graphql` — inline anti-pattern + warnings on `queryRecords` and `getRecordsByIds`. diff --git a/gql/cerc-io/laconicd/schema.graphql b/gql/cerc-io/laconicd/schema.graphql index a58159c4f..6a5987109 100644 --- a/gql/cerc-io/laconicd/schema.graphql +++ b/gql/cerc-io/laconicd/schema.graphql @@ -240,10 +240,30 @@ type Query { # GraphDB API. # - # Get records by IDs. + # Point-lookup of records by content-hash CID. + # + # Returns one record per id. CIDs are immutable — a record's + # content never changes, so this API answers "what was at this + # CID?" If you want "what is the current state of logical + # thing X?", use resolveNames / lookupNames instead (below). getRecordsByIds(ids: [String!]): [Record] - # Query records. + # Attribute-filtered query over records. + # + # WARNING — pick the right primitive before using this. + # Records are content-addressed and immutable; every state + # change writes a NEW record with a new CID. queryRecords + # returns EVERY version matching the filter, not "the latest." + # If you find yourself filtering by a logical id attribute + # (wallet, user_id, entity_id, etc.) and then picking the + # "latest" version client-side, you are reimplementing the + # Naming API (resolveNames / lookupNames) at the client layer. + # That is the correct primitive for mutable logical identity. + # + # Use queryRecords for: append-only event streams, list-all- + # events-matching-filter, cursor-paged audit scans. 
+ # DO NOT use queryRecords for: "get the current state of X". + # See docs/PATTERNS.md and AGENTS.md for the decision tree. queryRecords( # Multiple attribute conditions are in a logical AND. attributes: [KeyValueInput!] @@ -261,6 +281,17 @@ type Query { # # Naming API. # + # USE THIS for any "current state of logical thing X" lookup. + # setName binds a caller-chosen name to a content-addressed + # record CID. Re-calling setName with a new CID updates which + # CID the name resolves to, without losing the prior binding + # (NameRecord.history preserves it with block heights). + # + # Names eliminate the "scan all versions, pick latest" client- + # side pattern that queryRecords invites. They are the + # correct primitive for mutable logical identity. + # See docs/PATTERNS.md for the decision tree and worked examples. + # # Get authorities list. getAuthorities(owner: String): [Authority]! @@ -269,6 +300,12 @@ type Query { lookupAuthorities(names: [String!]): [AuthorityRecord]! # Lookup name to record mapping information. + # + # Returns a NameRecord per queried name, each containing the + # `latest` binding (the current CID) and `history` (all prior + # bindings with their block heights). Use this when you want + # both the current state and its audit trail. If you only want + # the current Record, resolveNames is more direct. lookupNames(names: [String!]): [NameRecord]! # Resolve names to records.