Model Settings
Fine-tune your AI model behavior with comprehensive settings and customization options.
Basic Model Configuration
Model Selection
- Choose from downloaded models
- Switch models per conversation
- View model specifications and capabilities
- Set default models for different tasks
Devices with chips older than the A13 Bionic cannot run MLX and offer limited GPU acceleration, so GGUF model inference runs more slowly. On these devices, choose small language models (SLMs) and conservative settings.
Generation Parameters
Temperature
Controls randomness and creativity in responses.
- 0.1-0.3: Very focused, deterministic responses
- 0.4-0.7: Balanced creativity and consistency
- 0.8-1.0: High creativity, more varied responses
- 1.0+: Maximum randomness; responses can become incoherent
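To see why lower temperatures give more deterministic output, here is a minimal sketch of temperature-scaled softmax sampling. The function name and the greedy fallback at temperature 0 are our illustration, not the app's actual sampler.

```python
import math
import random

def sample_with_temperature(logits, temperature=0.7):
    """Pick a token index from raw scores, scaled by temperature.

    Illustrative only: real inference engines combine this with
    top-p/top-k filtering and run on tensors, not Python lists.
    """
    if temperature <= 0:
        # Treat 0 as greedy decoding: always take the top-scoring token.
        return max(range(len(logits)), key=lambda i: logits[i])
    # Dividing by a small temperature sharpens the distribution
    # (more deterministic); a large temperature flattens it.
    scaled = [score / temperature for score in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]  # subtract max for stability
    total = sum(exps)
    probs = [e / total for e in exps]
    # Draw one index according to the resulting probabilities.
    r = random.random()
    cumulative = 0.0
    for i, p in enumerate(probs):
        cumulative += p
        if r < cumulative:
            return i
    return len(probs) - 1
```

At temperature 0.1 the top logit dominates almost completely, while at 1.0+ even low-scoring tokens get meaningful probability, which is where the varied, sometimes incoherent output comes from.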
Advanced Settings
Context Management
- Context window size
- Context overflow handling
- Memory optimization
- Conversation summarization
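A common overflow strategy is to drop the oldest messages until the conversation fits the context window again. The sketch below assumes a simple whitespace token count for illustration; a real app would use the model's own tokenizer and usually pin the system prompt.

```python
def trim_to_context(messages, max_tokens,
                    count_tokens=lambda m: len(m.split())):
    """Drop oldest messages until the total fits the context window.

    Simplified sketch: 'count_tokens' defaults to whitespace word
    counting, which only approximates a real tokenizer.
    """
    kept = list(messages)
    while kept and sum(count_tokens(m) for m in kept) > max_tokens:
        kept.pop(0)  # discard the oldest message first
    return kept
```

Summarization-based handling works the same way structurally, except the dropped messages are first condensed into a short summary message that stays at the front of the list.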
Performance Tuning
- Thread count adjustment
- GPU acceleration settings
- Memory usage limits
- Batch size optimization
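A reasonable starting point for thread count is the number of CPU cores minus a small reserve for the UI and OS. The heuristic below, including the reserve of 2, is our assumption, not an app default.

```python
import os

def suggest_thread_count(reserved=2):
    """Heuristic starting point: total cores minus a reserve.

    Assumption for illustration: reserving 2 cores keeps the UI
    responsive during inference. Tune per device.
    """
    cores = os.cpu_count() or 1
    return max(1, cores - reserved)
```

From this starting point, increase threads only while tokens-per-second keeps improving; past the point of contention, more threads slow generation down.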
💡 Pro Tips
- Start with default settings and adjust incrementally
- Lower temperature for factual queries, higher for creative tasks
- Save custom presets for different use cases
- Monitor performance impact of advanced settings
- Use shorter max tokens for faster responses on slower devices
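Presets are just named bundles of the settings above. A minimal sketch, with hypothetical preset names and keys (not the app's real settings schema):

```python
# Hypothetical presets; keys mirror the settings discussed above.
PRESETS = {
    "factual":  {"temperature": 0.2, "max_tokens": 256},
    "creative": {"temperature": 0.9, "max_tokens": 512},
}

def apply_preset(settings, name):
    """Return a copy of 'settings' with the named preset applied.

    Preset values override current ones; untouched keys survive,
    so device-specific tuning (threads, memory limits) is preserved.
    """
    merged = dict(settings)
    merged.update(PRESETS[name])
    return merged
```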