All Insights

If Your Team Runs CrewAI Agents, You Have a Remote Code Execution Problem

CivSafe Team·April 4, 2026·6 min read

Four days ago, CERT published VU#221883. If you've been building AI agent workflows with CrewAI — or if anyone on your team has been experimenting with it — this is worth stopping for.

The short version: a researcher found four vulnerabilities in CrewAI that can be chained together. The result is remote code execution on the machine running the agent. Not a theoretical risk. A working attack path. No complete patch has shipped yet.

What CrewAI is and why it matters for small teams

CrewAI is one of the most popular open-source AI agent frameworks. It's the tool a lot of small teams reach for when they want to go beyond a simple chatbot — when you want multiple AI agents working together on a task, one agent researching, another summarizing, a third writing a report. It's well-documented, has a big community, and is genuinely useful for the kind of workflow automation that saves a 10-person team hours per week.

If your org has done any AI agent work in the past year, CrewAI has probably come up. GitHub puts it at well over a million downloads. It's not a fringe tool.

What was actually found

CERT's disclosure covers four CVEs:

CVE-2026-2275 — CrewAI's Code Interpreter tool falls back to an insecure Python sandbox (SandboxPython) when it can't reach Docker. That fallback mode enables arbitrary code execution through C function calls. If your agent config has allow_code_execution=True, or if a developer added the Code Interpreter tool manually, this fallback is live.

CVE-2026-2287 — Even if Docker is available at startup, CrewAI doesn't continuously verify it's still running. If Docker goes offline mid-session, the system silently switches to the insecure sandbox — no warning, no error. The agent keeps running. Now in dangerous mode.

CVE-2026-2286 — A server-side request forgery (SSRF) vulnerability in the RAG search tools. The tools don't validate URLs at runtime, so a maliciously crafted URL can pull content from internal services or cloud metadata endpoints that the machine has access to. AWS credential endpoints are a common target here.

CVE-2026-2285 — The JSON loader tool reads files without path validation. An attacker can point it at any file on the host system.

The attack chain: an attacker embeds malicious instructions in a document, web page, or data source your agent reads. The agent processes it (prompt injection). Those instructions direct the agent to use the Code Interpreter tool, trigger the Docker fallback, and execute arbitrary code on your machine. From there: your files, your credentials, your cloud access.

SecurityWeek summarized the core problem clearly: the attack chain takes milliseconds once the malicious content is processed.

Why this hits small orgs harder

Large orgs running CrewAI in production typically have it isolated — containerized, with restricted network access, behind authentication. The machine running the agent doesn't have production credentials on it.

Small teams doing AI automation don't usually have that setup. The CrewAI workflow is running on a developer's laptop. Or a shared VM that someone uses for other things. Or a cloud instance that also has AWS keys in the environment. That's the realistic picture for a 5-15 person team experimenting with agents.

When the Code Interpreter tool is running on that machine and an attacker can trigger it via a document your agent reads, the blast radius is your whole environment — not a sandboxed container with nothing interesting in it.

The other piece: a lot of small-team AI agent work involves processing external content. Your agent summarizes emails. It reads uploaded PDFs. It scrapes competitor websites. It processes form submissions. Every one of those is an input that could carry a prompt injection payload. If you're not sanitizing those inputs — and most teams aren't, because it's not obvious that you need to — you're exposed.

What to do right now

No complete patch is available yet. CrewAI's maintainers are working on mitigations. Until a proper fix ships, here's the practical short list:

Remove or disable the Code Interpreter tool if you're not actively using it. This is the highest-risk component. If your agent workflow doesn't specifically need to execute Python code, take it out of the agent's tool list. The attack chain requires it.

Don't use allow_code_execution=True unless you have a specific need and a sandboxed environment. This flag was added as a convenience. Right now it's a liability.

Keep Docker running if you use the Code Interpreter. CVE-2026-2287 exploits the Docker offline fallback. An agent that can verify Docker is running at all times is safer than one that silently degrades. Monitor your Docker process. Don't let it go down mid-session.

Treat every external input as potentially hostile. If your agent reads documents, web pages, or user-submitted content, that content can carry instructions. Add a step before processing: strip instructions from untrusted content, or run it through a separate sanitization agent that can't invoke tools. It's not a perfect solution but it significantly raises the bar.

Audit what your agents have access to. If your CrewAI instance is running with cloud credentials in the environment, database passwords visible, or SSH keys accessible — move those out. The principle is the same as any server: don't run processes with more access than they need.

The broader shift happening here

This isn't just a CrewAI problem. It's a preview of what happens when AI agents become general-purpose enough to be genuinely useful: they need to read things from the outside world, and the things they read can carry instructions.

The phrase researchers use is "indirect prompt injection" — instead of you giving the AI a malicious instruction, an attacker plants instructions in content the AI will eventually process. It's a new attack surface that didn't exist before AI started reading documents and browsing web pages on your behalf.

OpenClaw had similar issues hit at massive scale in March. CrewAI is now. This vector is going to show up in every popular agent framework as researchers look for it. The ones that survive will be the ones that build sanitization in, not on.

For small teams, the action item isn't to stop using AI agents — they're too useful. It's to get your agent environments sandboxed properly and build the habit of treating external inputs as hostile before you connect agents to anything sensitive.


We've been setting up and auditing AI agent workflows for small orgs for over a year now. This stuff moves fast and the security surface changes every few weeks. If you want a second set of eyes on your setup before something like this becomes a problem, reach out.

CivSafe — Strategic Innovation. Community Impact.