From eb9aa008161f31819208dbd50506464cc1b4538d Mon Sep 17 00:00:00 2001
From: Prathamesh Musale
Date: Fri, 24 Apr 2026 16:04:31 +0000
Subject: [PATCH] docs: guide agents/humans toward Naming API for mutable
 logical identity
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Applications built on laconicd repeatedly pick the Record +
attribute-scan pattern for data that has mutable logical identity
(game state, user profiles, inventories). The query cost then grows
linearly with per-entity version count, and the fixes reached for
(caches, pagination, compaction) hide the design mistake rather than
correcting it. AI agents hit this failure mode especially reliably:
they pattern-match the existing client's queryRecords usage as canon
and never discover the Naming API.

This commit adds four pieces of documentation that surface the right
primitive *before* a wrong one is committed to:

* docs/PATTERNS.md — primitive decision tree with concrete
  anti-patterns (queryRecords-as-KV, compaction-for-latency,
  cache-the-slow-query) and a worked example (mutable game state).
* AGENTS.md — written explicitly for AI coding agents; names the
  failure mode, lists six rules, and tells agents to read PATTERNS.md
  before writing their first queryRecords call.
* gql/cerc-io/laconicd/schema.graphql — prescriptive comments on
  queryRecords (warning against mutable-identity usage),
  getRecordsByIds (point-lookup clarification), and the Naming API
  section (USE THIS for current-state lookups).
* README.md — new 'Designing state that lives on laconicd' section
  between Usage and Tests, linking both new docs.
--- AGENTS.md | 127 ++++++++++++++++ README.md | 20 +++ docs/PATTERNS.md | 220 ++++++++++++++++++++++++++++ gql/cerc-io/laconicd/schema.graphql | 41 +++++- 4 files changed, 406 insertions(+), 2 deletions(-) create mode 100644 AGENTS.md create mode 100644 docs/PATTERNS.md diff --git a/AGENTS.md b/AGENTS.md new file mode 100644 index 000000000..66aa94673 --- /dev/null +++ b/AGENTS.md @@ -0,0 +1,127 @@ +# Agents: read this before touching laconicd-backed state + +Written for AI coding agents (Claude, Codex, Copilot, etc.) working +on laconicd itself or on applications that persist data through +it. Humans can read it too — the content is the same — but the +tone addresses a specific failure mode that agents exhibit more +consistently than humans. + +## The failure mode this document exists to prevent + +Agents extending an existing laconicd-backed codebase pattern-match +the visible API calls and treat them as canonical. If a codebase +uses `queryRecords` with attribute filtering for everything — +including where it's wrong — an agent will extend that usage +rather than discover the Naming API. + +When the wrong pattern eventually scales badly (latency grows with +accumulated record versions), agents reach for **workarounds +within the chosen pattern** — pagination, caches, compaction — +rather than re-evaluating the primitive choice. That drift +compounds: each workaround entrenches the wrong design. + +This has happened repeatedly on real projects. The architectural +cost is high, because laconicd is append-only: wrong versions +persist on chain as permanent garbage that you can't silently +delete. + +## Concrete rules for agents + +### 1. Before designing any laconicd-backed state, read `docs/PATTERNS.md` + +Not after a first implementation. Not when you hit a scaling +wall. **Before.** The decision tree there is the only reliable +way to pick the right primitive. + +### 2. 
Do not pattern-match from existing client code as if it were canon + +A gateway, SDK, or service that already uses laconicd is an +existing code sample. It is not evidence that the patterns it +uses are correct. Check: + +- Does the client use `setName` / `lookupNames` / `resolveNames` + at all? If not, the chances it's modelling mutable identity + with `queryRecords`+attribute scan are high — that's almost + always a latent latency bug. +- Does `save_*` in the client write only a Record, never a name? + If your new feature has the same shape, you're about to extend + the bug. +- Does the client have "pick the latest" / "sort desc and take + first" logic after a `queryRecords` call? That is exactly what + `resolveName` does at the chain level — the client is + re-implementing it badly. + +### 3. Questions an agent should ask itself before writing `queryRecords` + +- "Am I looking up the current state of a logical entity?" If yes, + you want the Naming API, not `queryRecords`. +- "Will this query return more records as the same entity gets + modified over time?" If yes, you want the Naming API. +- "Am I about to write client-side code that picks the latest of + the returned list?" If yes, you want the Naming API. + +If any answer is yes and you still want `queryRecords`, stop and +write a comment explaining why the Naming API doesn't fit. If you +can't, the Naming API does fit. + +### 4. Do not propose caches, pagination, or compaction as the first fix for slow reads + +Slow `queryRecords` on a mutable-identity use case is a design +smell, not a performance problem. The fixes agents reach for +first are all wrong: + +- **TTL caches** hide the latency rather than fix it. First read + per window still pays the full cost. +- **`limit` / `offset` pagination** caps per-request fetch cost + but doesn't reduce the version-per-entity multiplier. +- **`delete_records` compaction** destroys the audit trail that + the immutable-Record design was supposed to provide. 
+ +If you're reaching for these, you are almost certainly on the +wrong side of the decision tree in `docs/PATTERNS.md`. Go back +and re-pick the primitive. + +### 5. If you find a codebase already on the wrong side + +Do not compound the mistake. Options in order of preference: + +1. Migrate the specific feature you're touching to the Naming + API. You can usually do this incrementally — start writing + names on new writes, keep reading via attribute-scan until + the name index is populated, then flip the read path. +2. Write an ADR documenting the debt and which class of + regression will appear (latency grows with history; first + timeout hit around ~N records). +3. Ship the feature on the Naming API even if the rest of the + codebase doesn't — a single correct example in the codebase + is worth more than consistency with a wrong pattern. + +### 6. Read laconicd's schema and keeper source, not just the client + +Before proposing a fix that involves "laconicd doesn't support +X": + +- `gql/cerc-io/laconicd/schema.graphql` — the actual read API. +- `x/registry/keeper/` — the write and index path. +- `proto/cerc/registry/v1/tx.proto` — the mutation messages. + +Many "laconicd limitations" agents assume are actually client +limitations. `queryRecords` supports `limit` and `offset`; +`setName` and `lookupNames` exist and work; `NameRecord.history` +returns block-height-tagged prior bindings. Verify capabilities +against source before recommending workarounds or upstream +changes. + +## Why this is in its own file + +A `PATTERNS.md` on its own is a reference document humans browse. +An `AGENTS.md` is loaded into context and read top-to-bottom at +the start of agent sessions. Calling out the failure mode +explicitly — "agents tend to overlook the Naming API" — is load- +bearing because the people most likely to hit the failure are +the ones who wouldn't naturally pattern-match to a document +titled "Patterns." 
+ +If you are reading this and you are about to write your first +`queryRecords` call: stop. Read `docs/PATTERNS.md`. Then come +back. diff --git a/README.md b/README.md index 279aca924..ff0ae48c0 100644 --- a/README.md +++ b/README.md @@ -29,6 +29,26 @@ Run with a single node fixture: See [lockup.md](./lockup.md) for lockup account usage. +## Designing state that lives on laconicd + +**Before** writing a client library, service, or migration that +persists data through laconicd, read: + +- [`docs/PATTERNS.md`](./docs/PATTERNS.md) — the primitive + decision tree. Explains when to use the Naming API + (`setName` / `lookupNames` / `resolveNames`) vs. the Record + API (`setRecord` / `queryRecords` / `getRecordsByIds`), with + worked examples and the common anti-patterns. +- [`AGENTS.md`](./AGENTS.md) — written for AI coding agents, but + the failure mode it describes (extending the wrong primitive + rather than re-evaluating it) applies to human contributors + under deadline pressure too. + +Picking the wrong primitive is expensive to unwind: laconicd is +append-only, so mistakes persist as on-chain data you cannot +silently remove. Spend the 10 minutes to read the pattern guide +before the first commit. + ## Tests Run tests: diff --git a/docs/PATTERNS.md b/docs/PATTERNS.md new file mode 100644 index 000000000..6a44e2aa9 --- /dev/null +++ b/docs/PATTERNS.md @@ -0,0 +1,220 @@ +# Laconicd design patterns + +**Read this before designing any state that lives on laconicd.** +Not after. Writing a client library, a service that persists user +data, or a migration from another store — start here. + +This document exists because the wrong primitive choice in an +application is expensive to unwind after production traffic +accumulates. The chain is append-only; mistakes persist as +on-chain garbage you cannot delete retroactively. 
Most of the
+architectural regret in the laconicd ecosystem so far has come
+from picking `queryRecords` as the default read path when a
+different primitive was the right tool.
+
+## The primitives, honestly
+
+Laconicd exposes three distinct storage and lookup primitives. They
+look similar at first glance but have VERY different semantics.
+
+| Primitive | Identity | Mutability | Lookup cost | Audit trail |
+|---|---|---|---|---|
+| **Record** (`setRecord` / `queryRecords` / `getRecordsByIds`) | Content hash (CID) | Immutable. Every change is a NEW record with a new CID. | Attribute-indexed, but returns EVERY version matching | Implicit — all versions persist forever |
+| **Name** (`setName` / `lookupNames` / `resolveNames`) | Caller-chosen string (`"mtm/lootboxes/<wallet>/<box-id>"`) | Pointer. Re-setting updates which CID the name resolves to. | Direct name → latest CID lookup (single record) | Explicit — `NameRecord.history` returns prior bindings with block heights |
+| **Authority / Bond / Auction** | Purpose-specific | Varies | Purpose-specific | Varies |
+
+**The trap:** `Record` looks like the universal "put this thing on
+chain" primitive because its API surface is the biggest and most
+obvious. If you treat it as such, every mutable-state use case
+(game state, user profiles, inventories, counters) ends up
+reimplementing the Naming API badly — appending version after
+version, then filtering client-side for "the latest."
+
+## Decision tree
+
+```
+Does the thing you're storing have a mutable logical identity?
+(i.e., "the current state of thing X" is a meaningful question)
+
+├── YES → Use the Naming API.
+│         setName("mtm/lootboxes/<wallet>/<box-id>", cid) on every write.
+│         lookupNames(...) or resolveNames(...) returns the one
+│         current record. Audit trail via NameRecord.history.
+│
+└── NO → Is it a point-lookup by CID?
+    │
+    ├── YES → getRecordsByIds([cid])
+    │
+    └── NO → It's an append-only event stream with
+             queryable attributes. Use setRecord +
+             queryRecords.
Expect every historical record + to come back; plan pagination accordingly. +``` + +## Anti-pattern: "queryRecords-as-key-value-store" + +This is the mistake real laconicd codebases keep making. It looks +like this: + +```python +# BAD — reimplementing names via attribute scan + +async def save_state(user_id: str, state: dict): + await set_record( + record_type="UserState", + attributes={"user_id": user_id}, # logical ID as attribute + data=state, + ) + +async def get_state(user_id: str) -> dict: + records = await query_records( + record_type="UserState", + attributes={"user_id": user_id}, + ) + # Client-side "pick the latest" — smell + records.sort(key=lambda r: r["created_at"], reverse=True) + return records[0] +``` + +Why it's wrong: + +1. Every call to `save_state` writes a NEW content-addressed record. + Old records stay on chain forever. +2. `query_records` returns all of them every time. Per-record fetch + cost scales linearly with save count. +3. The client-side `sort + [0]` is "I wanted the latest" — exactly + what `lookupNames` does at the index level. +4. Latency degrades silently as users accumulate history. The + query eventually breaches whatever timeout the caller has. + +The correct pattern: + +```python +# GOOD — names for mutable identity + +async def save_state(user_id: str, state: dict): + cid = await set_record( + record_type="UserState", + attributes={"user_id": user_id}, + data=state, + ) + await set_name( + name=f"my-app/user-state/{user_id}", + cid=cid, + ) + +async def get_state(user_id: str) -> dict: + record = await resolve_name(f"my-app/user-state/{user_id}") + return record +``` + +Properties of the correct pattern: + +- One name → one current CID. Latency doesn't scale with save count. +- Historical versions still exist on chain (immutable Record CIDs + aren't deleted), and are reachable via `NameRecord.history`. +- Audit trail is explicit rather than implicit — you can ask "what + was the state at block N?" 
rather than scanning every version. +- Client code has no "pick the latest" step; the chain already + answered that. + +## Anti-pattern: "compaction via `delete_records`" + +The `delete_records` primitive exists and is tempting to use for +"prune old versions of logical entity X to keep queries fast." Do +not use it for that unless you have signed off on dropping the +audit trail. + +- `delete_records` physically removes a CID from the index. +- If your design relies on every version being recoverable (audit + trails, chain-of-custody, dispute resolution), compaction + destroys exactly that property. +- If your design doesn't need every version, you were never in + the Record + attribute-scan pattern to begin with — you wanted + the Naming API from the start. + +`delete_records` is appropriate for **user-intent-driven +permanent deletion** (unregistering a device, unfollowing, +revoking a credential). Not for database-compaction-style +version pruning. + +## Anti-pattern: "cache the slow query" + +Caching a `queryRecords` result (TTL, LRU, whatever) at the client +layer is attractive because it appears to fix latency without +touching storage design. It doesn't — it hides the fact that you +chose the wrong primitive. The first read per cache window still +pays the growing-with-history cost, and the design drifts further +from correct as the caching surface calcifies. + +Cache is acceptable for genuinely external-service latency (price +oracles, Solana RPC balance checks). It is not acceptable as a +substitute for using the Naming API. + +## Pagination (`limit` / `offset`) + +`queryRecords` accepts `limit` and `offset`. Use them when you +genuinely want an append-only event stream with cursor-style +paging (e.g., "the last 100 events matching filter X"). Do NOT +use pagination to patch over "my mutable-identity query is slow" — +pagination doesn't reduce the per-entity version multiplier, +only the per-request fetch budget. 
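Where pagination genuinely fits — an append-only event stream read with a cursor — the loop is mechanical. A minimal runnable sketch; `query_records` below is a hypothetical stand-in for a real client call (stubbed with an in-memory list so the paging logic is self-contained), not laconicd's actual SDK surface:

```python
# Sketch: cursor-style paging over an append-only event stream.
# FAKE_CHAIN and query_records are stand-ins so the loop is runnable;
# a real client would hit the GraphQL queryRecords API instead.

FAKE_CHAIN = [
    {"type": "LootBoxEvent", "attributes": {"wallet": "w1"}, "seq": i}
    for i in range(250)
]

def query_records(record_type, attributes, limit, offset):
    # Stand-in for queryRecords: attribute conditions in a logical
    # AND, plus limit/offset windowing over the matching set.
    matching = [
        r for r in FAKE_CHAIN
        if r["type"] == record_type
        and all(r["attributes"].get(k) == v for k, v in attributes.items())
    ]
    return matching[offset:offset + limit]

def iter_events(record_type, attributes, page_size=100):
    """Yield every matching event, one bounded request at a time."""
    offset = 0
    while True:
        page = query_records(record_type, attributes,
                             limit=page_size, offset=offset)
        yield from page
        if len(page) < page_size:  # short page: no more records
            return
        offset += page_size

events = list(iter_events("LootBoxEvent", {"wallet": "w1"}))
print(len(events))  # all 250 events, fetched in three bounded requests
```

Note what this does and does not buy you: each request is bounded by `page_size`, but the total work still grows with history — which is exactly why this shape is acceptable only for genuine event streams, never as a patch for a mutable-identity lookup.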
+
+## Mutations go through Cosmos tx, not GraphQL
+
+Writes (`setRecord`, `setName`, `deleteRecord`, `reserveAuthority`,
+etc.) are Cosmos SDK `Msg`s, submitted via Tendermint RPC. The
+GraphQL endpoint is read-only. Client libraries usually wrap this
+behind a "registry writer" sidecar or equivalent — check how your
+SDK / service is structured before wiring writes.
+
+## Worked example: mutable game state
+
+The game records a user's loot box that transitions
+`PENDING → ACTIVE → RESOLVED`.
+
+Correct shape:
+
+- **Record writes** (one per state transition; each gets a new CID):
+  - `setRecord(type="LootBox", attributes={wallet, id, status}, data={...})`
+  - Preserves complete history as immutable records. Needed for
+    audit: "what did this box look like when the user picked?"
+- **Name write** (on every transition, pointing at the new CID):
+  - `setName(name=f"mygame/lootboxes/{wallet}/{box_id}", cid=<new-cid>)`
+- **Read: current state of one box**:
+  - `resolveName(f"mygame/lootboxes/{wallet}/{box_id}")` → one record.
+- **Read: all current boxes for a wallet**:
+  - Enumerate names by prefix, resolve each. Or maintain an
+    index name `mygame/lootboxes/{wallet}/_index` whose pointed-to
+    record lists active box ids.
+- **Read: audit trail of one box**:
+  - `lookupNames([f"mygame/lootboxes/{wallet}/{box_id}"])` returns
+    `NameRecord.history`, one entry per state transition, each with
+    its block height. Resolve each entry's CID to get the full
+    historical state.
+
+This shape works with any number of boxes per wallet. Per-wallet
+latency is bounded by active-box count, not lifetime
+transition count.
+
+## When you find an existing codebase doing it wrong
+
+Appending a fix on top of the wrong primitive is usually the
+wrong move. Options in order of preference:
+
+1. **Migrate to the Naming API.** The data already on chain as
+   Records is still valid — you just need to start writing names
+   and reading via names.
Existing consumers of the attribute-scan + path can keep working until they're moved. +2. **Document the debt.** If the migration is deferred, write an + ADR saying so and naming the specific class of regression + this creates (latency as history accumulates). +3. **Do NOT reach for caches or compaction first.** Those hide + the design mistake and make the eventual migration harder. + +## See also + +- `AGENTS.md` (repo root) — why AI agents particularly mis-pick + primitives here, and what context they need. +- `gql/cerc-io/laconicd/schema.graphql` — inline anti-pattern + warnings on `queryRecords` and `getRecordsByIds`. diff --git a/gql/cerc-io/laconicd/schema.graphql b/gql/cerc-io/laconicd/schema.graphql index a58159c4f..6a5987109 100644 --- a/gql/cerc-io/laconicd/schema.graphql +++ b/gql/cerc-io/laconicd/schema.graphql @@ -240,10 +240,30 @@ type Query { # GraphDB API. # - # Get records by IDs. + # Point-lookup of records by content-hash CID. + # + # Returns one record per id. CIDs are immutable — a record's + # content never changes, so this API answers "what was at this + # CID?" If you want "what is the current state of logical + # thing X?", use resolveNames / lookupNames instead (below). getRecordsByIds(ids: [String!]): [Record] - # Query records. + # Attribute-filtered query over records. + # + # WARNING — pick the right primitive before using this. + # Records are content-addressed and immutable; every state + # change writes a NEW record with a new CID. queryRecords + # returns EVERY version matching the filter, not "the latest." + # If you find yourself filtering by a logical id attribute + # (wallet, user_id, entity_id, etc.) and then picking the + # "latest" version client-side, you are reimplementing the + # Naming API (resolveNames / lookupNames) at the client layer. + # That is the correct primitive for mutable logical identity. + # + # Use queryRecords for: append-only event streams, list-all- + # events-matching-filter, cursor-paged audit scans. 
+ # DO NOT use queryRecords for: "get the current state of X". + # See docs/PATTERNS.md and AGENTS.md for the decision tree. queryRecords( # Multiple attribute conditions are in a logical AND. attributes: [KeyValueInput!] @@ -261,6 +281,17 @@ type Query { # # Naming API. # + # USE THIS for any "current state of logical thing X" lookup. + # setName binds a caller-chosen name to a content-addressed + # record CID. Re-calling setName with a new CID updates which + # CID the name resolves to, without losing the prior binding + # (NameRecord.history preserves it with block heights). + # + # Names eliminate the "scan all versions, pick latest" client- + # side pattern that queryRecords invites. They are the + # correct primitive for mutable logical identity. + # See docs/PATTERNS.md for the decision tree and worked examples. + # # Get authorities list. getAuthorities(owner: String): [Authority]! @@ -269,6 +300,12 @@ type Query { lookupAuthorities(names: [String!]): [AuthorityRecord]! # Lookup name to record mapping information. + # + # Returns a NameRecord per queried name, each containing the + # `latest` binding (the current CID) and `history` (all prior + # bindings with their block heights). Use this when you want + # both the current state and its audit trail. If you only want + # the current Record, resolveNames is more direct. lookupNames(names: [String!]): [NameRecord]! # Resolve names to records.