Changelog

    Release 2.2

    Released April 23, 2026

    Gemma 4 support, stronger imports and downloads, smarter retrieval, and chat workflow polish

    Release 2.2 improves model compatibility, import reliability, document retrieval, and in-chat transparency. This update adds Gemma 4 support through a refreshed llama.cpp core, strengthens GGUF import detection, makes CML downloads easier to follow, improves how large-context models use PDFs and long documents, expands memory and system prompt customization, and ships a broad round of chat and settings fixes.

    Highlights

    • llama.cpp has been updated for Gemma 4 support, including fixes for previously known Gemma 4 issues.
    • GGUF import is more reliable, with better detection for chat templates, JSON configs, and multimodal projector files.
    • CML model downloads now show clearer progress, making it easier to follow what is happening while a model is fetched.
    • Smart retrieval makes better use of available context for PDFs and long documents, especially on large-context models.
    • A new Prompt Processing card adds live progress feedback in chat, and issues with progress stuck at 0% and with card placement after tool calls have been fixed.
    • Memory and system prompt customization are now supported, giving you more control over how Noema behaves.
    • Curated models have been refreshed with Gemma 4 and Qwen 3 1.7B support.
    • Model Settings scrolling has been fixed, and VRAM estimates and maximum context recommendations now better reflect current memory-fit and KV cache quantization behavior.

    Past releases

    Catch up on earlier Noema updates and explore the releases that paved the way for today's improvements.