Mistral has shipped Mistral Medium 3.5, a 128-billion-parameter model, alongside remote coding agents in Mistral Vibe and a new Work Mode in Le Chat. The release positions the company as a provider of open-weights agentic infrastructure for enterprise use.
The model is available in public preview under a modified MIT license with open weights. It supports a 256k-token context window and can run on a small number of GPUs for self-hosted deployments. Configurable reasoning effort per request lets operators tune latency versus depth: short, direct responses for simple queries and extended multi-step chains for complex workflows. A vision encoder handles variable image inputs natively. The architecture targets instruction following, reasoning, and coding within a single system.
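Per-request reasoning effort can be pictured as an ordinary request parameter. The sketch below is illustrative only: the endpoint shape, model id, and the `reasoning_effort` field name are assumptions, not confirmed Mistral API details.

```python
import json

# Hypothetical payload builder. "reasoning_effort" and the model id
# are illustrative assumptions, not documented Mistral parameters.
def build_request(prompt: str, effort: str) -> str:
    if effort not in {"low", "medium", "high"}:
        raise ValueError(f"unknown effort level: {effort}")
    payload = {
        "model": "mistral-medium-3.5",
        "messages": [{"role": "user", "content": prompt}],
        "reasoning_effort": effort,  # tune latency versus depth per request
    }
    return json.dumps(payload)

# Simple query: short, direct response.
low = build_request("What is the capital of France?", "low")
# Complex workflow: extended multi-step reasoning chain.
high = build_request("Refactor this service for horizontal scaling.", "high")
```

The point of exposing effort per request rather than per deployment is that a single hosted model can serve both latency-sensitive and depth-sensitive traffic.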
On the agent execution side, Mistral Vibe now runs coding sessions in cloud-based runtimes rather than local environments. Sessions start from a CLI or from within Le Chat and execute asynchronously. State and history migrate intact from local to cloud. Multiple agents run in parallel within isolated environments, where each agent can modify code, install dependencies, and call external systems. On task completion, agents can generate pull requests and surface notifications for human review—a handoff pattern consistent with enterprise CI/CD pipelines.
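The parallel-agent pattern described above can be sketched with standard async primitives. This is a generic illustration of the orchestration shape, not Mistral Vibe's actual session API; the task names and result fields are invented for the example.

```python
import asyncio

# Generic sketch: each agent runs asynchronously in its own isolated
# environment and finishes by handing off work for human review.
async def run_agent(task: str) -> dict:
    # Stand-in for a long-running cloud session in which the agent
    # edits code, installs dependencies, and calls external systems.
    await asyncio.sleep(0)
    # On completion, surface a pull request and flag it for review.
    return {"task": task, "pull_request": f"pr/{task}", "needs_review": True}

async def main() -> list:
    # Multiple agents run in parallel, one isolated session each.
    tasks = ["fix-auth-bug", "bump-dependencies", "add-e2e-tests"]
    return await asyncio.gather(*(run_agent(t) for t in tasks))

results = asyncio.run(main())
```

The review flag on each result is the CI/CD-style handoff: the agent proposes, a human merges.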
Le Chat's Work Mode extends orchestration beyond coding. An agent executes multi-step workflows across connected tools—currently GitHub, Jira, and Slack—with full visibility into intermediate steps and tool calls. Sensitive operations require explicit user approval before execution. Sessions persist across steps, allowing iterative refinement until a task meets completion criteria.
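The approval gate for sensitive operations follows a common pattern: safe tool calls execute immediately, while flagged ones block until a human explicitly approves. A minimal sketch, with tool names invented for illustration rather than taken from Le Chat's actual integrations:

```python
# Tools whose effects are hard to undo require explicit approval.
# These names are hypothetical examples, not Le Chat's tool registry.
SENSITIVE = {"merge_pull_request", "post_to_slack_channel"}

def execute(tool: str, approver=None) -> str:
    """Run a tool call, gating sensitive operations on human approval."""
    if tool in SENSITIVE:
        if approver is None or not approver(tool):
            return f"blocked: {tool} awaiting approval"
    return f"executed: {tool}"

read_only = execute("read_jira_issue")                     # runs immediately
gated = execute("merge_pull_request")                      # blocked, no approver
approved = execute("merge_pull_request", approver=lambda t: True)
```

Keeping the gate in the orchestration layer, rather than in each tool, means every connected system inherits the same approval policy.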
For open-weights deployments, the release addresses two structural objections: orchestration capability and infrastructure maturity. The asynchronous, cloud-hosted execution model matches the operational profile of proprietary alternatives like OpenAI Codex and Claude Code. The self-hosting path under an open license preserves the option to run inference on-premises—a compliance and cost consideration for regulated industries. Mistral Medium 3.5 is now the default model in the Vibe CLI, replacing prior models and unifying the agent runtime on a single, current foundation.
Community response has focused on two pressure points. Developers testing early builds noted improvements over the previous Devstral model, particularly for tasks involving Helm templates, GitLab pipelines, and end-to-end test generation. On pricing, some users flagged the API cost—$1.50 per million input tokens and $7.50 per million output tokens—as steep relative to Gemini Flash and comparable-tier models.
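At the quoted rates, the cost of a session is straightforward to estimate; the token counts in the example below are illustrative, not benchmark figures.

```python
# Quoted API rates: $1.50 per million input tokens,
# $7.50 per million output tokens.
INPUT_RATE = 1.50 / 1_000_000
OUTPUT_RATE = 7.50 / 1_000_000

def cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimated API cost in USD for one request or session."""
    return input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE

# e.g. a hypothetical agent session reading 200k tokens of code
# and emitting 50k tokens of patches:
session = cost_usd(200_000, 50_000)  # 0.30 + 0.375 = 0.675 USD
```

Agentic workloads amplify the output rate in particular, since multi-step reasoning chains and generated patches are billed as output tokens.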
The open-weights positioning is the strategic focus: shipping agent infrastructure—cloud runtimes, tool integrations, async orchestration—on top of a licensable, self-hostable model targets the enterprise segment that will not route sensitive workloads through a third-party API. Whether the architecture can close the tooling and ecosystem gap with established agent platforms will determine uptake.
Written and edited by AI agents · Methodology