Changelog
Release 2.2
Released April 23, 2026
Gemma 4 support, stronger imports and downloads, smarter retrieval, and chat workflow polish
Release 2.2 improves model compatibility, import reliability, document retrieval, and in-chat transparency. This update adds Gemma 4 support through a refreshed llama.cpp core, strengthens GGUF import detection, makes CML downloads easier to follow, improves how large-context models use PDFs and long documents, expands memory and system prompt customization, and ships a broad round of chat and settings fixes.
Highlights
- llama.cpp has been updated for Gemma 4 support, including fixes for previously known Gemma 4 issues.
- GGUF import is more reliable, with better detection for chat templates, JSON configs, and multimodal projector files.
- CML model downloads now show clearer progress so it is easier to understand what is happening while a model is being fetched.
- Smart retrieval makes better use of available context for PDFs and long documents, especially on large-context models.
- A new Prompt Processing card shows live progress feedback in chat, and fixes address progress getting stuck at 0% as well as incorrect card placement after tool calls.
- Memory and system prompt customization are now supported, giving you more control over how Noema behaves.
- Curated models have been refreshed with Gemma 4 and Qwen 3 1.7B support.
- Model Settings scrolling has been fixed, and VRAM estimates and maximum context recommendations have been corrected to better reflect memory-fit and KV cache quantization behavior.
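The memory-fit behavior behind the last highlight follows a simple relationship: KV cache memory grows linearly with context length, and quantizing the cache shrinks it in proportion to the element size. A minimal sketch of that arithmetic (the function and the model shapes below are illustrative assumptions, not Noema's actual implementation):

```python
# Illustrative sketch: why max-context recommendations depend on memory fit
# and on KV cache quantization. Not Noema's actual code.

def kv_cache_bytes(n_layers: int, n_kv_heads: int, head_dim: int,
                   context_len: int, bytes_per_element: float) -> int:
    # Keys and values are each stored per layer, per KV head, per position.
    return int(2 * n_layers * n_kv_heads * head_dim * context_len * bytes_per_element)

# Hypothetical 8B-class model shapes, for illustration only.
layers, kv_heads, head_dim = 32, 8, 128

f16 = kv_cache_bytes(layers, kv_heads, head_dim, 32_768, 2.0)  # FP16 cache
q8 = kv_cache_bytes(layers, kv_heads, head_dim, 32_768, 1.0)   # ~8-bit cache

print(f"FP16 KV cache at 32k context: {f16 / 2**30:.1f} GiB")   # 4.0 GiB
print(f"8-bit KV cache at 32k context: {q8 / 2**30:.1f} GiB")   # 2.0 GiB
```

Under these assumed shapes, halving the cache element size halves the KV footprint, which is why a quantized cache can roughly double the context that fits in the same memory budget.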
Past releases
Catch up on earlier Noema updates and explore the releases that paved the way for today's improvements.