Safety
A security audit on every AI change, not once a quarter
Security review used to be scheduled around how fast humans write code: an audit before the big release, a pentest once a year. An AI agent ships a feature an hour. The math stopped working, and the gap is where the incidents live.
What AI-generated code gets wrong
Agent code fails security review in predictable ways, and almost never in the obvious ones. The patterns we see on every audit:
- The happy path is authenticated; the error path leaks.
- Webhooks and callbacks answer before verifying who is calling.
- Secrets work their way into code or logs because that was the shortest path to making the demo run.
- Access checks happen in the UI, where the model saw them in training data, instead of on the server.
None of these are exotic. All of them pass a casual diff read, because the code looks like working code. They are exactly what a structured pass catches and a glance does not.
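The webhook pattern is the easiest of the four to make concrete. Below is a minimal sketch of verifying a caller before answering, assuming an HMAC-SHA256 signature over "timestamp.body" and a five-minute freshness window. The function name, the signing scheme, and the tolerance are illustrative assumptions, not any particular provider's protocol or Nanostack's check.

```python
import hashlib
import hmac
import time
from typing import Optional

# Assumed tolerance for clock skew and delivery delay; real providers differ.
MAX_SKEW_SECONDS = 300


def verify_webhook(
    secret: bytes,
    body: bytes,
    timestamp: str,
    signature_hex: str,
    now: Optional[float] = None,
) -> bool:
    """Return True only if the signature matches and the timestamp is fresh.

    Hypothetical scheme: the sender signs b"<timestamp>.<body>" with
    HMAC-SHA256 and sends the hex digest alongside the request.
    """
    now = time.time() if now is None else now

    # Reject malformed timestamps instead of raising into the error path.
    try:
        sent_at = int(timestamp)
    except ValueError:
        return False

    # Stale or future-dated timestamps are rejected, which limits how long
    # a captured request can be replayed.
    if abs(now - sent_at) > MAX_SKEW_SECONDS:
        return False

    signed_payload = timestamp.encode() + b"." + body
    expected = hmac.new(secret, signed_payload, hashlib.sha256).hexdigest()

    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected.encode(), signature_hex.encode())
```

The handler calls this first and returns an error before doing any work if it fails. A timestamp window alone does not stop a replay inside the window; a stricter version would also record request IDs it has already seen.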
The audit moves into the loop
In Nanostack, /security is a phase of the sprint, not an event on a calendar. Every change gets an OWASP Top 10 and STRIDE pass tailored to the stack under audit: a payments webhook gets signature and replay scrutiny, a login form gets session and enumeration scrutiny. The result is graded A to F and saved as an artifact with the findings, each one tied to a file and line.
Two properties make this work in practice. The findings are fixed in the same sprint, while the agent still has full context, instead of aging in a backlog. And the ship gate reads the artifact: no security pass, no commit. The audit is not a virtue. It is a prerequisite.
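A gate that reads the artifact can be sketched in a few lines. This is an illustration only: the JSON layout (a "grade" field and a "findings" list with "severity", "file", and "line"), the function names, and the rule that critical and high findings block are all assumptions, not Nanostack's actual artifact format or pass criteria.

```python
import json
from pathlib import Path
from typing import Any, Tuple

# Assumption: these severities block a commit. The real threshold may differ.
BLOCKING_SEVERITIES = {"critical", "high"}


def evaluate_artifact(artifact: Any) -> Tuple[bool, str]:
    """Decide pass or fail from a parsed artifact (hypothetical layout)."""
    if not isinstance(artifact, dict):
        return False, "security artifact has an unexpected shape"

    findings = artifact.get("findings")
    if not isinstance(findings, list):
        # Fail closed: an artifact without a findings list is not a pass.
        return False, "security artifact has no findings list"

    blocking = [
        f for f in findings
        if isinstance(f, dict) and f.get("severity") in BLOCKING_SEVERITIES
    ]
    if blocking:
        first = blocking[0]
        where = f"{first.get('file')}:{first.get('line')}"
        return False, f"{len(blocking)} blocking finding(s), first at {where}"

    return True, f"security pass, grade {artifact.get('grade', '?')}"


def ship_gate(artifact_path: Path) -> Tuple[bool, str]:
    """Read the artifact from disk; a missing or unreadable file blocks."""
    if not artifact_path.is_file():
        return False, "no security artifact found; run the audit first"
    try:
        artifact = json.loads(artifact_path.read_text(encoding="utf-8"))
    except json.JSONDecodeError:
        return False, "security artifact is not valid JSON"
    return evaluate_artifact(artifact)
```

The design choice that matters is failing closed: a missing, malformed, or incomplete artifact blocks the commit just as a critical finding does. Wired into a commit hook, the script would exit non-zero whenever the first element of the result is False.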
Graded, so humans can triage
The grade exists for the person who cannot read the diff. "0 critical, 0 high, 1 low, grade A" is a sentence a founder, a PM, or a reviewer in a hurry can act on. The full findings are underneath for whoever wants them. Security reporting that only an expert can interpret does not change anyone's behavior.
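To show how a one-line summary could be derived from the findings, here is a rough sketch. The grading rubric is invented for illustration (the thresholds are assumptions, not Nanostack's scoring), and the finding layout is the same hypothetical "severity" field as any other example here.

```python
from collections import Counter
from typing import Dict, Iterable

SEVERITIES = ("critical", "high", "medium", "low")
ALWAYS_SHOWN = ("critical", "high")  # reported even when the count is zero


def grade(counts: Counter) -> str:
    """Hypothetical rubric: the worst severity present sets the grade."""
    if counts["critical"]:
        return "F"
    if counts["high"]:
        return "D"
    if counts["medium"] > 2:
        return "C"
    if counts["medium"]:
        return "B"
    return "A"


def summarize(findings: Iterable[Dict[str, str]]) -> str:
    """Build the one-line summary a non-expert can act on."""
    # Findings with an unrecognized severity are counted but never shown;
    # a stricter version would reject them instead.
    counts = Counter(f["severity"] for f in findings)
    parts = [
        f"{counts[s]} {s}"
        for s in SEVERITIES
        if counts[s] or s in ALWAYS_SHOWN
    ]
    return ", ".join(parts) + f", grade {grade(counts)}"


print(summarize([{"severity": "low"}]))
# prints: 0 critical, 0 high, 1 low, grade A
```

Critical and high counts are always printed, even at zero, because their absence is the part a hurried reader most needs to see.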
One honest note: the audit is only as strong as the model running it, and it runs locally with your agent. It fills the gap between annual audits, where there used to be nothing. It does not replace a professional assessment for high-stakes systems, and we say so in the docs.