CivSafe — Strategic Innovation. Community Impact.

Last week, a repository called Open-OSS/privacy-filter hit the number one spot on Hugging Face's trending list. It racked up 244,000 downloads and 667 likes in under 18 hours. It looked like an official OpenAI release. It was meticulously presented — the model card was copied word-for-word from the real OpenAI repo it was impersonating.

It was malware.

Security researchers at HiddenLayer found the malicious code on May 7th. By the time Hugging Face pulled it down, the download counter already showed roughly a quarter million. If you or anyone on your team downloaded an AI model from Hugging Face last week without checking the publisher, this affects you.

Here's exactly how it worked

The real repository is openai/privacy-filter. The fake was Open-OSS/privacy-filter. Only the namespace differed. Same repo name. Same model card. Same description. Same apparent legitimacy.

When you cloned the repo and ran loader.py — or even start.bat, which some readme instructions suggested — you triggered a chain reaction:

A decoy script ran first, looking like a legitimate model loader
A hidden function called _verify_checksum_integrity() disabled SSL certificate verification, then decoded a base64-encoded URL
That URL fetched a JSON document from a throwaway hosting service
The cmd field in that JSON was passed to PowerShell
PowerShell executed a Rust-compiled infostealer

The whole thing was wrapped in a bare except block so any errors failed silently. You'd never know it ran.

What the malware actually grabbed

The Rust-based infostealer went for:

Saved passwords and session cookies from Chrome, Chromium forks, and Firefox
Discord local storage (which includes auth tokens — effectively Discord account access)
Cryptocurrency wallet files
FileZilla FTP credentials
Host system info

It also ran anti-analysis checks before doing anything — looking for VirtualBox, VMware, QEMU, and Xen to detect sandboxes, checking for debuggers, and disabling Windows Antimalware Scan Interface (AMSI) to avoid behavioral detection. This wasn't a script-kiddie job. Whoever built this knew what they were doing.

Why this is specifically a small org problem

Here's the thing. Large enterprises with proper procurement controls — where every software dependency goes through a security review before touching a dev machine — are actually somewhat protected from this class of attack. Not because they're smarter, but because their process slows things down.

Small orgs don't have that friction. And right now, small orgs are moving fast with open-source AI. The economics are compelling: instead of paying API costs at scale, you grab a capable open-weight model, host it yourself, and you're off to the races. Hugging Face is where you go to do that. It's the npm registry of AI models.

And just like npm has had its poisoned package moments, Hugging Face now has this.

The attack exploited something very specific: the way small teams evaluate whether to trust a resource. The heuristics are: Is it popular? Does it look official? Does the description match what I expected? All three were gamed here. 244K downloads in 18 hours, almost certainly artificially inflated, made the repo look like the obvious legitimate choice. The description was copy-pasted verbatim. The repo name was identical to the real one; only the namespace differed.

This is social engineering, but for developer workflows instead of executive inboxes.

If you were one of the 244K

If someone on your team cloned Open-OSS/privacy-filter on a Windows machine and ran anything from it, treat that machine as fully compromised. That means:

Rotate every password stored in the browser on that machine
Revoke and reissue any API keys that were accessible from it
Change Discord credentials if Discord was open
Check your crypto wallets for unauthorized transactions
Audit what that machine could access on your internal network

Don't just run antivirus and call it clean. The infostealer was specifically built to evade static and behavioral detection, and by the time you notice anything, the data has already been exfiltrated.

What to do going forward

Verify the publisher namespace, every time. openai/ is not the same as Open-OSS/. Hugging Face namespaces are how you identify the actual source. Get in the habit of checking the full repository path, not just the repo name.
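
If you want to turn that habit into a check, here is a minimal sketch that extracts the namespace from a repo ID or a huggingface.co URL and compares it against an allowlist. TRUSTED_NAMESPACES, parse_repo_id, and is_trusted are hypothetical names I made up for illustration; the allowlist contents are whatever your team decides to trust. It only handles model-style paths (namespace/name), not other URL layouts.

```python
from urllib.parse import urlparse

# Hypothetical allowlist: namespaces your team has explicitly decided to trust.
TRUSTED_NAMESPACES = {"openai"}

def parse_repo_id(ref: str) -> tuple:
    """Return (namespace, name) from 'namespace/name' or a huggingface.co URL."""
    if "://" in ref:
        parsed = urlparse(ref)
        if parsed.hostname != "huggingface.co":
            raise ValueError(f"unexpected host: {parsed.hostname!r}")
        ref = parsed.path
    parts = [p for p in ref.strip("/").split("/") if p]
    if len(parts) < 2:
        raise ValueError(f"no namespace in {ref!r}")
    return parts[0], parts[1]

def is_trusted(ref: str) -> bool:
    """True only when the namespace exactly matches an allowlisted one."""
    namespace, _name = parse_repo_id(ref)
    # Deliberately strict, exact comparison: look-alikes must not pass.
    return namespace in TRUSTED_NAMESPACES
```

With this allowlist, is_trusted("openai/privacy-filter") returns True and is_trusted("Open-OSS/privacy-filter") returns False, no matter how convincing the model card looks.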

Never run .bat files or loader.py from a model repo without reading them first. Most legitimate models don't need a custom loader script to download weights. If a readme is telling you to run a Python script or a shell script as part of setup, that's a red flag.

Pin to a specific commit hash. Instead of tracking whatever main currently points to, check out the exact commit you reviewed: git clone https://huggingface.co/openai/privacy-filter && cd privacy-filter && git checkout <commit-hash>. This doesn't protect you from a repo that was malicious from the start, but it protects you from a legitimate repo getting quietly backdoored after you first used it.
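
The same idea can be enforced in tooling. This sketch refuses anything that isn't a full 40-character commit hash (so "main" or a tag is rejected) and builds a download URL using Hugging Face's /resolve/<revision>/<filename> path layout. is_pinned and pinned_file_url are hypothetical helper names, and the hash and filename in the usage note are placeholders, not real values.

```python
import re

# A full Git SHA-1 commit hash: exactly 40 lowercase hex characters.
FULL_SHA = re.compile(r"[0-9a-f]{40}")

def is_pinned(revision: str) -> bool:
    """True for a full commit hash; False for branch names, tags, or short hashes."""
    return FULL_SHA.fullmatch(revision) is not None

def pinned_file_url(repo_id: str, revision: str, filename: str) -> str:
    """Build a file URL that can only ever point at one specific commit."""
    if not is_pinned(revision):
        raise ValueError(f"refusing unpinned revision: {revision!r}")
    return f"https://huggingface.co/{repo_id}/resolve/{revision}/{filename}"
```

For example, pinned_file_url("openai/privacy-filter", "0123456789abcdef0123456789abcdef01234567", "config.json") returns a URL tied to that placeholder commit, while passing "main" raises ValueError. If you use the huggingface_hub library instead, its download functions accept a revision argument that serves the same purpose.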

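Use verified or checksum-validated sources. If a model has a published SHA-256 checksum from a trustworthy source, verify it. (MD5 can catch accidental corruption, but it is not collision-resistant, so don't rely on it against tampering.) If there is no checksum, at least cross-reference the download count and date history against what you'd expect, keeping in mind that download counts can be inflated.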
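
Checking a checksum takes a few lines with Python's standard library. This is a minimal sketch, assuming you already have the expected SHA-256 hex string from a source you trust; the function names are mine. It reads the file in chunks so multi-gigabyte weight files don't need to fit in memory.

```python
import hashlib
import hmac

def sha256_of_file(path, chunk_size=1 << 20):
    """Hash a file in 1 MiB chunks and return the hex digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def matches_published_checksum(path, expected_hex):
    """Compare the file's SHA-256 against a published hex digest."""
    actual = sha256_of_file(path)
    # Normalize whitespace and case so a copy-pasted digest still compares cleanly.
    return hmac.compare_digest(actual, expected_hex.strip().lower())
```

A match only tells you the file is the one the checksum's publisher hashed. If the checksum comes from the same repo as the file, an attacker controls both, so get it from an independent channel.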

Check the file list before running anything. The presence of loader.py, start.bat, or any executable that isn't weights or config should make you pause. Open it and read it before running it.
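
A rough way to triage a file list before opening anything is to sort it by extension. The sketch below is an assumption-heavy starting point: the suffix sets are my own guesses at "expected" versus "executable" files, and triage is a hypothetical helper name. Anything it can't classify lands in "unknown", which should also get a look.

```python
from pathlib import PurePosixPath

# Assumed categories; adjust these sets to match what your team expects.
EXPECTED_SUFFIXES = {".safetensors", ".json", ".txt", ".md"}
EXPECTED_NAMES = {".gitattributes", "LICENSE"}
EXECUTABLE_SUFFIXES = {".py", ".bat", ".cmd", ".ps1", ".sh", ".exe", ".dll", ".vbs", ".js"}

def triage(paths):
    """Sort repo file paths into 'expected', 'review', and 'unknown' buckets."""
    buckets = {"expected": [], "review": [], "unknown": []}
    for path in paths:
        p = PurePosixPath(path)
        suffix = p.suffix.lower()
        if suffix in EXECUTABLE_SUFFIXES:
            # Scripts and binaries: read them before running anything.
            buckets["review"].append(path)
        elif suffix in EXPECTED_SUFFIXES or p.name in EXPECTED_NAMES:
            buckets["expected"].append(path)
        else:
            buckets["unknown"].append(path)
    return buckets
```

Given a listing that includes config.json, model.safetensors, loader.py, and start.bat, the "review" bucket would contain loader.py and start.bat. The extension check is only a prompt to go read the file; it says nothing about what the file actually does.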

The pattern to watch for

This is going to get worse before it gets better. The AI tooling ecosystem is growing faster than security practices can catch up. Hugging Face has moderation, but its trending algorithm can be gamed with inflated downloads the same way YouTube trending used to get gamed with bot views.

The attack surface is: your developer or ops person is excited about a new open-source model, they search for it, they find what looks like the right thing, and they run it. That's the whole attack. It requires nothing more than that.

Big tech companies are starting to implement internal model vetting pipelines. Most small orgs don't have that. If you're running AI workloads and pulling models from public registries without a verification step, that gap is exactly what attackers are targeting.