Noema

    Model Settings

    Fine-tune your AI model behavior with comprehensive settings and customization options.

    Basic Model Configuration

    Model Selection

    • Choose from downloaded models
    • Switch models per conversation
    • View model specifications and capabilities
    • Set default models for different tasks

    Devices with chips older than the A13 Bionic have limited GPU acceleration and cannot run MLX, so GGUF model inference runs more slowly. On these devices, choose small language models (SLMs) and conservative settings.

    Generation Parameters

    Temperature

    Controls randomness and creativity in responses.

    • 0.1-0.3: Very focused, deterministic responses
    • 0.4-0.7: Balanced creativity and consistency
    • 0.8-1.0: High creativity, more varied responses
    • 1.0+: Maximum creativity, chaotic responses
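    The effect of temperature can be sketched with a temperature-scaled softmax over next-token logits. This is a minimal illustration of the mechanism, not Noema's actual sampler, and the logit values are made up:

```python
import math

def temperature_softmax(logits, temperature):
    """Scale logits by 1/temperature, then softmax.
    Lower temperature sharpens the distribution (more deterministic);
    higher temperature flattens it (more varied sampling)."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]                 # hypothetical next-token logits
low = temperature_softmax(logits, 0.2)   # near one-hot: top token dominates
high = temperature_softmax(logits, 1.5)  # flatter: more varied choices
```

    At temperature 0.2 the top token takes almost all of the probability mass, which is why low values feel deterministic; at 1.5 the alternatives stay competitive, which is why high values feel creative or chaotic.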

    Advanced Settings

    Context Management

    • Context window size
    • Context overflow handling
    • Memory optimization
    • Conversation summarization
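    One common overflow strategy, dropping the oldest turns until the conversation fits the context window, can be sketched as follows. This is a simplified illustration using word counts as a stand-in for tokens; Noema's actual overflow handling may differ:

```python
def trim_to_context(messages, max_tokens, count_tokens=lambda m: len(m.split())):
    """Drop the oldest messages until the total fits the context window.
    `count_tokens` is a placeholder; a real app would use the model's tokenizer."""
    kept = list(messages)
    while kept and sum(count_tokens(m) for m in kept) > max_tokens:
        kept.pop(0)  # discard the oldest turn first
    return kept

history = ["first question about setup",
           "a long detailed answer " * 5,
           "follow-up question",
           "short answer"]
trimmed = trim_to_context(history, max_tokens=10)  # only recent turns survive
```

    Summarization is the gentler alternative: instead of discarding old turns outright, they are condensed into a short summary that stays in context.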

    Performance Tuning

    • Thread count adjustment
    • GPU acceleration settings
    • Memory usage limits
    • Batch size optimization
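    As a sketch of how these settings might be chosen conservatively on constrained hardware, the helper below picks values from a device's core count, RAM, and GPU capability. The thresholds and field names are illustrative assumptions, not Noema's real defaults:

```python
def tuning_profile(performance_cores, ram_gb, has_fast_gpu):
    """Pick conservative performance settings for a given device.
    All values below are illustrative, not Noema's actual defaults."""
    return {
        # one worker thread per performance core is a common starting point
        "threads": max(1, performance_cores),
        # offload layers to the GPU only when acceleration is available
        "gpu_layers": 99 if has_fast_gpu else 0,
        # smaller batches reduce peak memory on low-RAM devices
        "batch_size": 512 if ram_gb >= 6 else 128,
        # leave headroom for the OS and other apps
        "max_memory_gb": max(1, ram_gb - 2),
    }

older = tuning_profile(performance_cores=2, ram_gb=4, has_fast_gpu=False)
newer = tuning_profile(performance_cores=4, ram_gb=8, has_fast_gpu=True)
```

    The pattern matters more than the exact numbers: fewer threads, smaller batches, and no GPU offload on older devices trade some speed for stability.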

    💡 Pro Tips

    • Start with default settings and adjust incrementally
    • Lower temperature for factual queries, higher for creative tasks
    • Save custom presets for different use cases
    • Monitor performance impact of advanced settings
    • Use shorter max tokens for faster responses on slower devices
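    The preset advice above might look like this in practice. The preset names and values are hypothetical, chosen to follow the temperature guidance on this page:

```python
# Hypothetical presets following the tips above: low temperature for factual
# work, higher for creative tasks, shorter max tokens for faster responses.
PRESETS = {
    "factual":  {"temperature": 0.2, "max_tokens": 512},
    "balanced": {"temperature": 0.6, "max_tokens": 1024},
    "creative": {"temperature": 0.9, "max_tokens": 1024},
    "fast":     {"temperature": 0.6, "max_tokens": 256},
}

def apply_preset(name, overrides=None):
    """Merge a named preset with optional per-conversation overrides."""
    settings = dict(PRESETS[name])
    settings.update(overrides or {})
    return settings

s = apply_preset("factual", {"max_tokens": 256})  # factual, but kept short
```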