
Agentic security: the attack surface you installed last month

May 13, 2026 · 5 min

Application security has a well-mapped territory: the code, the dependencies, the infrastructure. An AI coding agent fits none of those boxes. It is not code you wrote, not a dependency you pinned, not a server you hardened. It is software that acts, on your machine, with your credentials, thousands of times a day.

What changed

A traditional tool does what you invoke it to do. An agent decides what to do, then does it: runs shell commands, writes files, reads whatever is on disk, calls the network. Every one of those is a capability you granted the moment you installed it, and the decision about how to use them is made by a model, influenced by everything the model reads.

That is a new layer in the stack, the acting layer, and almost nobody's security program covers it. The code review covers the code. The dependency scanner covers the packages. Nothing covers "the agent decided to run this command because a file suggested it".

The failure modes are already familiar

Agentic incidents look like ordinary incidents with a strange cause. A wiped directory, but the operator was an agent cleaning up. Leaked credentials, but the leak was an agent pasting an env file into a log. Rewritten git history, but the force push came from an agent unblocking itself. The damage is classic. The actor is new, tireless, and very fast.

Defense in depth for the acting layer

The controls that work are the boring ones, applied to actions instead of packets. This is how Nanostack layers them:

Action control: Guard evaluates every shell command before execution. Block rules (mass deletion, history destruction, pipe-to-shell) run before the allowlist, so a safe binary with a dangerous argument still stops.
Data control: a separate hook on file writes protects credential files and system secret directories, resolving symlinks first. Artifacts get secret scanning, so evidence files do not become leak vectors.
Process control: read-only phases physically block writes. An agent reviewing code cannot also modify it, which removes a whole class of mid-review tampering.
Audit: every phase saves an artifact with a SHA-256 integrity hash. When you ask what the agent did last Tuesday, there is a record, and you can tell whether it was edited.
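
To make the first two controls concrete, here is a minimal sketch in Python. It is not Nanostack's code: the rule patterns, the allowlist contents, and the names BLOCK_RULES, ALLOWLIST, evaluate and is_protected are all assumptions made for illustration. It shows two ideas from the list: block rules are checked before the allowlist, and a write target is resolved through symlinks before it is compared with protected locations.

```python
import re
import shlex
from pathlib import Path

# Hypothetical block rules. The patterns are illustrative and far from complete.
BLOCK_RULES = [
    ("mass deletion", re.compile(r"\brm\s+(-[a-zA-Z]*[rR][a-zA-Z]*\s+)+(/|~|\.|\*)(\s|$)")),
    ("history destruction", re.compile(r"\bgit\s+push\b.*(--force\b|\s-f\b)")),
    ("pipe-to-shell", re.compile(r"\b(curl|wget)\b[^|]*\|\s*(sudo\s+)?(ba|z)?sh\b")),
]

# Hypothetical allowlist of binaries the agent may run.
ALLOWLIST = {"ls", "cat", "git", "rm", "pytest"}


def evaluate(command: str) -> tuple[bool, str]:
    """Return (allowed, reason) for a shell command string."""
    # Block rules run first, so an allowlisted binary cannot
    # carry a dangerous argument through.
    for name, pattern in BLOCK_RULES:
        if pattern.search(command):
            return False, f"blocked: {name}"
    try:
        tokens = shlex.split(command)
    except ValueError:
        return False, "blocked: unparseable command"
    if not tokens:
        return False, "blocked: empty command"
    # Simplification: only the first token is checked against the allowlist.
    if tokens[0] not in ALLOWLIST:
        return False, f"blocked: {tokens[0]} is not allowlisted"
    return True, "allowed"


def is_protected(target: str, protected: list[Path]) -> bool:
    """Return True if a write target lands on or inside a protected path."""
    # Resolve symlinks first, so a harmless-looking link that points
    # into a secrets directory is still caught.
    resolved = Path(target).expanduser().resolve()
    for entry in protected:
        root = entry.expanduser().resolve()
        if resolved == root or root in resolved.parents:
            return True
    return False
```

With these assumed rules, `rm notes.txt` is allowed and `rm -rf /` is blocked, even though `rm` is on the allowlist. The sketch checks only the first token of a command, so a real guard would also have to deal with chained commands, subshells and quoting tricks.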

No single layer is the answer. The stack is the answer: an instruction that slips past the model still meets the action control, and an action that slips past everything still leaves the audit trail.
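
The audit layer is the easiest one to picture in code. The sketch below is an assumption about how an artifact with a SHA-256 integrity hash could be written and checked, not a description of Nanostack's artifact format: the sidecar `.sha256` file, the JSON layout, and the function names save_artifact and verify_artifact are all hypothetical.

```python
import hashlib
import hmac
import json
from pathlib import Path


def save_artifact(path: Path, record: dict) -> str:
    """Write a record as JSON and store its SHA-256 digest in a sidecar file."""
    body = json.dumps(record, sort_keys=True, indent=2).encode("utf-8")
    digest = hashlib.sha256(body).hexdigest()
    path.write_bytes(body)
    # Hypothetical convention: the digest lives next to the artifact.
    path.with_name(path.name + ".sha256").write_text(digest)
    return digest


def verify_artifact(path: Path) -> bool:
    """Return True if the artifact still matches its recorded digest."""
    expected = path.with_name(path.name + ".sha256").read_text().strip()
    actual = hashlib.sha256(path.read_bytes()).hexdigest()
    return hmac.compare_digest(expected, actual)
```

A digest stored beside the artifact only detects edits by someone who does not also rewrite the digest. For the record to be trustworthy, the hash would need to live somewhere the agent cannot write.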

Why this matters now and not eventually

Agent adoption did not wait for agent security. Teams that would never run an unreviewed cron job as themselves are running agents with broader access and no controls at all, because the agents arrived inside developer tools, pre-trusted. The window where this is cheap to fix is while your agent fleet is one or two, not twenty. The controls above are open source and take a minute to install. The incident report takes longer.
