Czech developer tool provider JetBrains has taken a step that fits the latest trend in the artificial intelligence market: a shift away from giant, universal systems towards deep specialisation. The company has released its latest language model, Mellum2, as open source. It is available on the Hugging Face platform under the Apache 2.0 licence, paving the way for widespread commercial adoption and local corporate deployments. The move directly addresses growing business demand to reduce the operating costs of generative systems in production environments.
Mellum2 represents a new category of so-called ‘focal models’: high-speed, tailored components built to handle massive, repetitive tasks within advanced systems. While the first generation of Mellum focused solely on code completion, the new version handles both code and natural language. However, this is not an attempt to create another rival to frontier models. JetBrains has deliberately eschewed multimodality to maximise performance in the areas that matter most to developers: query routing, agent orchestration and low-latency RAG pipelines.
Key to Mellum2’s cost advantage is the Mixture-of-Experts (MoE) architecture. Although the model has 12 billion parameters, only 2.5 billion of them, roughly a fifth, are involved in processing each token. In real-time production environments, Mellum2 cuts inference time by more than half compared to direct competitors, while maintaining high response quality and precision in logic and mathematical tasks.
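To make the MoE idea concrete, the sketch below shows generic top-k expert routing in plain Python, along with the active-parameter arithmetic from the figures quoted above. It is an illustration of the general technique only: the number of experts, the `top_k` value, the router scores and the toy expert functions are all hypothetical and do not describe Mellum2's actual configuration or code.

```python
import math

# Figures quoted in the article (in billions of parameters).
TOTAL_PARAMS_B = 12.0
ACTIVE_PARAMS_B = 2.5


def active_fraction(total_b, active_b):
    """Share of parameters that take part in processing a single token."""
    return active_b / total_b


def softmax(scores):
    """Numerically stable softmax over a list of router scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_token(router_logits, top_k):
    """Select the top_k experts for one token and renormalise their weights.

    Returns a dict mapping expert index to gate weight (weights sum to 1).
    """
    probs = softmax(router_logits)
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    chosen = ranked[:top_k]
    norm = sum(probs[i] for i in chosen)
    return {i: probs[i] / norm for i in chosen}


def moe_layer(x, experts, router_logits, top_k):
    """Run only the selected experts and mix their outputs by gate weight.

    Experts that are not selected are never called, which is where the
    compute saving of a sparse MoE layer comes from.
    """
    gates = route_token(router_logits, top_k)
    return sum(weight * experts[i](x) for i, weight in gates.items())


if __name__ == "__main__":
    share = active_fraction(TOTAL_PARAMS_B, ACTIVE_PARAMS_B)
    print(f"Active share per token: {share:.1%}")

    # Hypothetical toy setup: 4 experts, 2 active per token.
    toy_experts = [lambda x, k=k: x * k for k in range(4)]
    toy_logits = [0.1, 2.0, -1.0, 1.5]  # made-up router scores for one token
    print("Selected experts:", sorted(route_token(toy_logits, top_k=2)))
    print("Layer output:", moe_layer(1.0, toy_experts, toy_logits, top_k=2))
```

In a real model the router scores come from a learned layer and each expert is a feed-forward network, but the cost pattern is the same: every token pays only for the experts it is routed to, plus any shared layers.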
