Noema

    Model Settings

    Tailor how Noema responds by tweaking generation parameters, context behavior, and performance controls. Start simple and layer in advanced tuning as you learn what each model prefers.

    Getting started

    Pick the right model for the job

    • Switch models per conversation without losing chat history.
    • Review specs such as parameter count, context window, and quantization before activating.
    • Set a default model for everyday chats and keep alternates bookmarked for specialty tasks.
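    Specs like parameter count and quantization translate directly into memory needs, which is why they are worth checking before activating a model. A rough back-of-the-envelope sketch (the helper name and the ~20% overhead factor are illustrative assumptions, not Noema's actual accounting):

```python
def model_memory_gb(params_billions, bits_per_weight, overhead=1.2):
    # Rough footprint: parameters x bits per weight, plus ~20% headroom
    # for the KV cache and runtime buffers (assumed, not exact).
    bytes_total = params_billions * 1e9 * bits_per_weight / 8
    return bytes_total * overhead / 1e9

# A 7B model at 4-bit quantization needs on the order of 4 GB:
print(round(model_memory_gb(7, 4), 1))
```

    The same model at 8-bit would need roughly twice that, which is why quantization matters so much on phones and older devices.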

    Generation controls

    Temperature, top-p, and more

    Temperature governs creativity. Keep it at 0.2–0.4 for focused answers, 0.5–0.7 for balanced ideation, and push it higher when you want playful brainstorming.

    Combine temperature with Top-p or Top-k to further filter the token sampling space. Noema surfaces inline descriptions so you can adjust with confidence.
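    The interplay between temperature and Top-p can be sketched in a few lines. This is a minimal, self-contained sampler for illustration only, not Noema's implementation:

```python
import math
import random

def sample(logits, temperature=0.7, top_p=0.9):
    # Temperature scales the logits: lower values sharpen the
    # distribution (focused), higher values flatten it (creative).
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(l - m) for l in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Top-p (nucleus) filtering: keep the smallest set of tokens
    # whose cumulative probability reaches top_p.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, mass = [], 0.0
    for i in order:
        kept.append(i)
        mass += probs[i]
        if mass >= top_p:
            break
    # Renormalize over the kept tokens and draw one.
    norm = sum(probs[i] for i in kept)
    r = random.random() * norm
    for i in kept:
        r -= probs[i]
        if r <= 0:
            return i
    return kept[-1]
```

    At very low temperature the nucleus collapses to the single most likely token, which is why low-temperature output feels deterministic.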

    Context handling

    Stay within the window

    • Set maximum tokens to balance depth with response speed.
    • Enable automatic conversation summarization when you approach the context limit.
    • Allow Noema to trim older turns first for longer work sessions.
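    Trimming older turns first can be sketched like this. The helper name is hypothetical, and the whitespace-based token count stands in for a real tokenizer:

```python
def trim_history(turns, max_tokens, count_tokens=lambda t: len(t.split())):
    # Drop the oldest turns first until the remaining history
    # fits the token budget; the newest turns survive longest.
    kept = list(turns)
    while kept and sum(count_tokens(t) for t in kept) > max_tokens:
        kept.pop(0)
    return kept
```

    Summarization-based approaches replace the dropped turns with a condensed recap instead of discarding them outright, trading a little fidelity for a much longer effective session.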

    Performance

    Dial in resource usage

    Adjust thread counts and GPU offload to match your device’s capabilities.

    Lower batch sizes on older hardware to keep inference stable, and monitor memory from the status sheet.
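    One way those knobs could be chosen together is sketched below. The helper, the one-layer-per-150 MB offload rule of thumb, and the batch sizes are illustrative assumptions, not Noema's actual logic:

```python
import os

def perf_settings(total_layers, vram_gb, low_memory=False):
    # Leave one core free so the UI stays responsive.
    threads = max(1, (os.cpu_count() or 4) - 1)
    # Offload roughly one layer per ~150 MB of VRAM (rule of thumb).
    gpu_layers = min(total_layers, int(vram_gb * 1024 // 150))
    # Smaller batches are gentler on older hardware.
    batch = 128 if low_memory else 512
    return {"threads": threads, "gpu_layers": gpu_layers, "batch_size": batch}
```

    The key design point is the cap at `total_layers`: offloading cannot exceed the model's depth, so extra VRAM beyond that is simply headroom.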

    Tips for great results

    • Start with presets, then tweak one parameter at a time to isolate impact.
    • Save custom profiles for workflows like summarization, creative writing, or coding.
    • Drop temperature and lower Top-p when you need factual accuracy.
    • Switch to lighter quantizations if responses feel sluggish on older devices.
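    The preset-then-tweak workflow above might look like this in code. The preset names and parameter values are illustrative, not Noema's actual defaults:

```python
PRESETS = {
    # Hypothetical starting points for common workflows.
    "summarization":    {"temperature": 0.3, "top_p": 0.8},
    "creative_writing": {"temperature": 0.9, "top_p": 0.95},
    "coding":           {"temperature": 0.2, "top_p": 0.7},
}

def make_profile(base, **overrides):
    # Start from a preset, then change one parameter at a time
    # so the effect of each tweak stays easy to attribute.
    profile = dict(PRESETS[base])
    profile.update(overrides)
    return profile
```

    Copying the preset before applying overrides keeps the shared defaults untouched, so a tweak for one saved profile never leaks into another.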