Everyone's Mad at Mistral's New Model. They're Comparing the Wrong Thing.

CivSafe Team·May 1, 2026·6 min read

The reaction to Mistral's new model was pretty predictable. Mistral AI dropped Medium 3.5 — a 128-billion-parameter, fully open-weights model — yesterday, and within hours the AI community was doing the thing it always does: comparing it to the cheapest available alternative and declaring it overpriced.

The critics aren't wrong about the math. At $1.50 per million input tokens and $7.50 per million output via API, Medium 3.5 costs roughly 10x more than DeepSeek V4, which sits at $0.14 per million input tokens. On raw benchmarks, DeepSeek either matches or edges out Medium 3.5 on most standard evaluations. The Hacker News thread is not subtle about the disappointment.
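The gap is easy to make concrete. Here is a back-of-envelope comparison using the input prices quoted above; the monthly volume of 50 million input tokens is purely an illustrative assumption, and output pricing is left out because DeepSeek's output rate isn't quoted here:

```python
# Back-of-envelope API cost comparison using the input prices quoted above.
# The 50M-token monthly volume is an illustrative assumption.

MISTRAL_INPUT_PER_M = 1.50   # USD per million input tokens (Medium 3.5)
DEEPSEEK_INPUT_PER_M = 0.14  # USD per million input tokens (DeepSeek V4)

monthly_input_tokens_m = 50  # millions of input tokens per month (assumed)

mistral_cost = monthly_input_tokens_m * MISTRAL_INPUT_PER_M
deepseek_cost = monthly_input_tokens_m * DEEPSEEK_INPUT_PER_M

print(f"Mistral Medium 3.5: ${mistral_cost:.2f}/month")   # $75.00/month
print(f"DeepSeek V4:        ${deepseek_cost:.2f}/month")  # $7.00/month
print(f"Input-price ratio: {MISTRAL_INPUT_PER_M / DEEPSEEK_INPUT_PER_M:.1f}x")  # 10.7x
```

At these volumes both bills are small in absolute terms, which is part of why the "10x" framing dominates the conversation: the ratio is dramatic even when the dollars aren't.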

But the Hacker News crowd is optimizing for a different problem than most of CivSafe's clients.

What Mistral Actually Built

Medium 3.5 is a dense 128B model — not a mixture-of-experts architecture — with a 256K context window. Dense means more predictable inference behavior and easier deployment. It handles instruction-following, reasoning, and code in a single set of weights. The model card shows it beating Claude Sonnet 4.5 on several coding and reasoning benchmarks. Not the top of the leaderboard, but solidly in the tier just below the frontier.

The weights ship under a Modified MIT License. Free to download, free to self-host, free to use commercially for most organizations. The exceptions kick in at revenue thresholds that don't apply to NGOs, public sector departments, or SMBs under roughly 100M€ in annual revenue.

And Mistral says you can run it on "as few as four GPUs." In practice, that means four high-VRAM data center GPUs — think NVIDIA H100 80GB or equivalent. Significant hardware investment, but it's a one-time cost. After that, the model lives inside your infrastructure, handles as much inference as you throw at it, and never sends data anywhere external.
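A rough memory estimate shows why four high-VRAM GPUs is a plausible floor for a dense 128B model. The precision and headroom figures below are our illustrative assumptions, not Mistral's deployment guidance:

```python
# Rough VRAM estimate for serving a dense 128B-parameter model.
# Precision and per-GPU capacity are illustrative assumptions.

params_b = 128        # model size, in billions of parameters
bytes_per_param = 2   # BF16/FP16 weights: 2 bytes per parameter

weights_gb = params_b * bytes_per_param   # 256 GB for the weights alone
per_gpu_vram_gb = 80                      # e.g. one H100 80GB
total_vram_gb = 4 * per_gpu_vram_gb       # 320 GB across four GPUs

headroom_gb = total_vram_gb - weights_gb  # left over for KV cache, activations
print(f"Weights: {weights_gb} GB, cluster VRAM: {total_vram_gb} GB, "
      f"headroom: {headroom_gb} GB")
```

At 16-bit precision that leaves only 64 GB for KV cache and activations, which gets tight with a 256K context window; 8-bit quantization would halve the weights to 128 GB and roughly quadruple the headroom, which is likely what "as few as four GPUs" assumes in practice.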

The Argument Nobody Is Having Online

Here's what the "just use DeepSeek" take glosses over: DeepSeek is a Chinese company with infrastructure in China. GLM-5.1, which briefly topped SWE-bench Pro last month, was trained on Huawei Ascend chips. Both are impressive engineering achievements. And both are non-starters for a significant chunk of the organizations we actually work with.

Canadian federal departments operate under the Treasury Board Directive on Service and Digital, which has real data residency implications for sensitive workloads. Quebec organizations deal with Law 25, with strict rules on personal information consent, storage, and cross-border transfer. Any organization handling EU-resident data lives under GDPR. Health data in Canada flows through PIPEDA and provincial health privacy legislation.

This is not a niche edge case. It's most of the public sector, most health-adjacent NGOs, and a large portion of financial services SMBs. For these organizations, routing sensitive workloads through DeepSeek's API — or even through American hyperscalers who may store inference logs in jurisdictions outside Canadian control — creates legal exposure that IT and legal will not sign off on.

Mistral is a French company, EU-headquartered, GDPR-native by design. When Medium 3.5 weights run on your server in your datacenter, your data never leaves your infrastructure. No hyperscaler intermediary. No "where does your AI vendor store inference logs" conversation with your lawyers. No data residency question to answer at the next audit.

That's the thing the community isn't impressed by. That's also the thing that matters most to the orgs that have been waiting for exactly this.

What You Can Do With It Now

Alongside the model weights, Mistral shipped Mistral Vibe — an async cloud coding agent powered by Medium 3.5. The idea is straightforward: hand it a well-defined task (refactor this module, write tests for this function, investigate this CI failure, update these dependencies), and it runs in an isolated cloud sandbox while you do something else. When it finishes, it opens a pull request on GitHub and pings you in Slack or Teams.

You review results, not process.

For a 5-person team — the kind we work with regularly — this is genuinely useful. It integrates with GitHub, Linear, Jira, Sentry, Slack, and Teams out of the box. The overnight workflow is straightforward: point Vibe at your current sprint backlog, let it work while your team sleeps, and review the resulting PRs in the morning instead of starting from scratch.

Mistral also released Work Mode in Le Chat, their chat interface, aimed at coordination-heavy roles. Team leads and operations managers who spend their days triaging between email, Slack, and project management tools can describe a workflow in plain language and let Medium 3.5 execute it across connected tools — catching up across email and messages, diving into internal docs, triaging inboxes into Jira tickets, pushing summaries to the team.

None of this is science fiction. It's working today.

The Decision Framework for Small Orgs

If your organization has no meaningful data compliance constraints, the math is simple: DeepSeek V4 at $0.14/million tokens is exceptional value and you should probably be using it for most things. We do.

But if your organization handles sensitive personal data, serves EU clients, operates in a regulated sector, or has any Canadian government contract that touches protected information — Mistral Medium 3.5 is not expensive. It is the cheapest viable option that you can actually deploy legally.

The choice now looks like this:

No compliance constraints: DeepSeek V4, GLM-5.1, Qwen3 — excellent open models, cheap, powerful. Use them.

Compliance-first, prefer cloud: Mistral's API at $1.50/M input — EU infrastructure, GDPR-native, auditable, no self-hosting overhead.

Compliance-first, self-hosted: Medium 3.5 weights on your own hardware — zero external data exposure, one-time hardware cost, runs indefinitely. The only external dependency is Mistral's license terms, which are transparent.

That third option didn't exist at 128B scale with a Western-provenance open model until yesterday.
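The framework above reduces to a tiny routing function. The two boolean inputs are deliberate simplifications of a real compliance review, which would also cover jurisdiction, data classification, and contract terms:

```python
def recommend_deployment(has_compliance_constraints: bool,
                         can_self_host: bool) -> str:
    """Toy encoding of the three-way decision framework above.

    The inputs are simplifications: a real assessment covers
    jurisdiction, data classification, and contract terms.
    """
    if not has_compliance_constraints:
        return "cheap open models via API (e.g. DeepSeek V4, GLM-5.1, Qwen3)"
    if can_self_host:
        return "self-host Medium 3.5 weights (zero external data exposure)"
    return "Mistral API (EU infrastructure, GDPR-native)"

print(recommend_deployment(False, False))
print(recommend_deployment(True, False))
print(recommend_deployment(True, True))
```

The point of writing it out is how short it is: once the compliance question is answered honestly, the model choice mostly falls out on its own.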

The Pattern Worth Watching

Mistral has been doing this longer than anyone in the European AI space. They open-sourced Mistral 7B in 2023 when that was genuinely surprising. They've consistently released weights rather than API-only access, which is a meaningful trust signal for regulated sector deployments.

Medium 3.5 continues that pattern at a scale where it actually competes with frontier models on real tasks. Not at the top of the benchmark leaderboard — but well inside the range where it handles production workloads, document processing, internal knowledge retrieval, and code generation competently.

The organizations that can't use Chinese models for policy reasons and won't accept American hyperscaler lock-in now have a legitimate self-hostable alternative at frontier-adjacent quality. That's not a small thing, even if the pricing discourse is drowning it out.


Figuring out which AI infrastructure model actually fits your data requirements is exactly the kind of problem we work through with teams. We've run these evaluations for NGOs, public sector organizations, and mid-sized businesses across Canada. Get in touch.

CivSafe — Strategic Innovation. Community Impact.