Your AI agent needs a workflow, not better prompts

June 9, 2026 · 7 min

AI coding agents are good at writing code and bad at deciding what code to write. Most of the frustration people blame on the model traces back to a missing workflow. This piece makes that claim concrete: what the artifacts actually contain, how they are validated, and why a hash can do a job a prompt cannot.

The failure is architectural, not behavioral

You correct the agent, it agrees, and forty minutes later the correction is gone. That is not the model being sloppy. The agent's only memory is the conversation, and conversations are lossy by design: when the context window fills, the host compacts it into a summary, and your correction is exactly the kind of small, old detail summaries lose. Prompting harder does not fix this. The instruction you reinforce today gets compacted tomorrow.

The fix is to stop storing decisions in the lossy medium. In a Nanostack sprint, every phase writes its conclusions to a JSON artifact on disk, and the next phase reads the file, not the chat. Scope survives because the brief is a file. The plan survives because the plan is a file. Your correction survives the moment it lands in one of them.

What an artifact actually is

Artifacts are not free-form notes. Every one shares a base shape:

{
  "schema_version": "1",
  "phase": "review",
  "timestamp": "2026-06-04T14:27:00Z",
  "project": "/path/to/repo",
  "branch": "feat/stripe-checkout",
  "mode": "standard",
  "summary": { ... },
  "findings": [ ... ],
  "integrity": "<sha256 of the artifact, integrity stripped>"
}

And each phase has a contract on top. A plan must carry summary.planned_files and a plan_approval. A review must carry scope_drift with a status, plus its findings. The save script validates against these contracts before writing anything; missing fields go to stderr and the save exits 1. An agent cannot hand the next phase a vibes-only artifact, because the write path refuses to produce one.
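
To make the validate-before-write idea concrete, here is a minimal sketch in Python. It is not Nanostack's save script: the REQUIRED_FIELDS table, the function names, and the assumption that plan_approval sits at the top level of the artifact are all illustrative. Only the plan and review requirements themselves come from the description above.

```python
import json
import sys

# Hypothetical contract table: phase -> required dotted paths.
# The plan and review requirements come from the article; the table shape and
# the top-level placement of plan_approval are assumptions for this sketch.
REQUIRED_FIELDS = {
    "plan": ["summary.planned_files", "plan_approval"],
    "review": ["scope_drift.status", "findings"],
}


def missing_fields(artifact: dict) -> list[str]:
    """Return the required paths that the artifact's phase contract lacks."""
    phase = artifact.get("phase")
    # This sketch only knows two contracts, so it rejects any other phase
    # rather than guess what that phase requires.
    if not isinstance(phase, str) or phase not in REQUIRED_FIELDS:
        return ["phase"]
    missing = []
    for path in REQUIRED_FIELDS[phase]:
        node = artifact
        for key in path.split("."):
            if not isinstance(node, dict) or key not in node:
                missing.append(path)
                break
            node = node[key]
    return missing


def save_artifact(artifact: dict, path: str) -> int:
    """Validate first, write second. Returns a process exit code."""
    missing = missing_fields(artifact)
    if missing:
        # Nothing is written when the contract is not met.
        print("missing required fields: " + ", ".join(missing), file=sys.stderr)
        return 1
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(artifact, fh, indent=2)
    return 0
```

The design choice worth copying is the ordering: the check runs before the file is opened, so a rejected artifact leaves no partial file behind for the next phase to pick up.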

Trust is a computed property

The integrity field is where this stops being note-taking and becomes infrastructure. The hash is computed over the canonical JSON with the integrity field stripped, and every reader resolves one of four statuses: verified, integrity_missing, integrity_mismatch, or not_found.
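
A rough Python sketch of that hash-and-resolve step follows. The function names are made up for illustration, and the canonical form used here (sorted keys, compact separators, UTF-8) is an assumption: the article does not say which canonicalization Nanostack uses. The four status strings are the ones listed above.

```python
import hashlib
import json
import os


def compute_integrity(artifact: dict) -> str:
    """SHA-256 over the canonical JSON of the artifact, integrity field stripped."""
    body = {k: v for k, v in artifact.items() if k != "integrity"}
    # Assumed canonical form: sorted keys, no insignificant whitespace, UTF-8.
    canonical = json.dumps(
        body, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def resolve_status(path: str) -> str:
    """Resolve one of the four statuses for the artifact at `path`."""
    if not os.path.exists(path):
        return "not_found"
    with open(path, encoding="utf-8") as fh:
        artifact = json.load(fh)
    stored = artifact.get("integrity")
    if not stored:
        return "integrity_missing"
    if stored != compute_integrity(artifact):
        return "integrity_mismatch"
    return "verified"
```

Stripping the field before hashing is what makes the scheme self-contained: the hash can live inside the file it describes without changing the bytes it was computed over.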

The subtle one is integrity_missing. A naive design treats a missing hash as "old file, probably fine". But anyone who can modify the file can also delete the hash, so release gates treat missing exactly like mismatched: untrusted. Consumers that gate releases call the resolver with --require-integrity, which fails closed on both. That one flag is the difference between "the agent says QA passed" and "a QA artifact exists, has the required fields, and has not been touched since it was written".
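
The fail-closed rule fits in a few lines of Python. This is a sketch of the policy, not the resolver's real code: is_trusted is a hypothetical name, and the behavior without the flag is not described above, so the lenient branch here (tolerate a missing hash, still reject a mismatch) is an assumption.

```python
def is_trusted(status: str, require_integrity: bool) -> bool:
    """Decide whether a consumer may rely on an artifact with this status."""
    if status == "verified":
        return True
    if status == "integrity_missing":
        # Assumption: a lenient caller accepts hashless files as legacy.
        # With require_integrity set, missing is treated like mismatched.
        return not require_integrity
    # integrity_mismatch, not_found, and any unknown status are never trusted.
    return False
```

Note the final branch: anything the function does not recognize is rejected, so a new status added later cannot slip through a release gate by accident.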

The gate, concretely

With trusted artifacts in place, the phase gate becomes almost boring. Before a commit lands, a hook checks: does a fresh artifact exist for review, for security, and for QA, and does each one verify? If not:

$ git commit -m "stripe checkout + webhook"
BLOCKED [PHASE-GATE] artifacts required: review, security, qa

Notice the gate never re-reviews the code and never asks the model anything. All the intelligence already ran in the phases. The gate only enforces that the records exist and verify, which is precisely the kind of check that should not involve a probabilistic system. On hosts with hook support (Claude Code today) this physically blocks the commit; on other verified agents the same gate runs as guided instructions, and each adapter file in the repo states which behavior you get.
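
To show how little the gate has to do, here is a minimal Python sketch of such a check. It is not Nanostack's hook: the one-file-per-phase layout (review.json and so on), the function names, and the canonical JSON form are assumptions, and the freshness check mentioned above is left out for brevity. The blocked message mirrors the one shown earlier.

```python
import hashlib
import json
import sys
from pathlib import Path

REQUIRED_PHASES = ("review", "security", "qa")


def verifies(path: Path) -> bool:
    """True only if the file exists, parses, and its stored hash matches."""
    try:
        artifact = json.loads(path.read_text(encoding="utf-8"))
    except (OSError, ValueError):
        return False  # missing, unreadable, or not valid JSON
    if not isinstance(artifact, dict):
        return False
    stored = artifact.pop("integrity", None)
    # Assumed canonical form: sorted keys, compact separators, UTF-8.
    canonical = json.dumps(
        artifact, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return stored == hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def gate(artifact_dir: Path) -> list[str]:
    """Return the phases whose artifact is absent or fails verification."""
    # Hypothetical layout: one <phase>.json file per phase in artifact_dir.
    return [p for p in REQUIRED_PHASES if not verifies(artifact_dir / f"{p}.json")]


def main(artifact_dir: str = ".nanostack") -> int:
    """Exit code for a pre-commit hook: 0 lets the commit through, 1 blocks it."""
    failing = gate(Path(artifact_dir))
    if failing:
        print(
            "BLOCKED [PHASE-GATE] artifacts required: " + ", ".join(failing),
            file=sys.stderr,
        )
        return 1
    return 0
```

Wired into a Git pre-commit hook, a non-zero return from main would stop the commit. There is no model call anywhere in the file, only file reads and a hash comparison.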

Why prompts cannot do this job

A prompt and a workflow fail differently. "Always review before committing" is advice inside the context window: subject to compaction, reinterpretation, and the model's mood under pressure. The review-before-commit invariant in a workflow is not in the context window at all. It lives in a hook and a hash check that run whether or not the model remembers agreeing to them.

That is the entire bet, stated once: move every invariant you cannot afford to lose out of the conversation and into files, validators, and hooks. Let the model be brilliant inside the structure. Never make the structure depend on the model's memory.

Try it on something small

Nanostack is open source, Apache 2.0, fully local, and works with Claude Code, Cursor, OpenAI Codex, OpenCode, and Gemini CLI:

npx create-nanostack

Describe what you want and type /think. The first thing the agent does is ask you a question, and the first artifact lands in .nanostack/ a few minutes later. Open it. It is just JSON. That readability is the point.