    Noema

    Offline Operation and Privacy FAQ

    Does Noema require internet to work?

    No. After installing the app and downloading at least one model, Noema's core features (chatting with the model, using local datasets) do not require an internet connection. You can use it completely offline. Internet is only needed for optional actions like searching the web, downloading new models, or fetching datasets. Many users run Noema in airplane mode to verify that nothing goes out, and it works perfectly.

    What data does Noema send over the internet?

    By default, none of your conversation data is sent out. When you do use internet features:

    • Model downloads fetch data from model hubs (Hugging Face): the app connects to huggingface.co to download model files or to search the model list.
    • Dataset downloads fetch from their sources (Open Textbook Library links or local imports).
    • Web search sends your query to the search API and gets results. This is done securely and without personal info attached.

    Noema does not upload your prompts, chat history, or any private data during these operations – it only pulls content down. There is no telemetry that sends your usage or keystrokes anywhere. For update checks, the desktop app only pings a server to see whether a newer version is available; on iOS, the App Store handles updates.

    How is my privacy protected?

    In multiple ways:

    Local Processing

    All AI inference is on-device, so your prompts and the model's output never leave your phone/tablet. This is unlike cloud AI services that send your data to servers.

    No Account Required

    Noema doesn't ask you to log in or provide personal details. It doesn't even know who you are. There is no usage tracking tied to an account.

    No Analytics/Tracking

    The app does not contain third-party analytics SDKs.

    Open Source Core

    Parts of Noema's engine are built on open-source libraries (llama.cpp and others), so the inference code that handles your data can be inspected.

    On-Device Storage

    All your model files, chats, and datasets are stored on your device in the app sandbox. iOS sandboxing means no other app can read Noema's data. Locking your device with a passcode also helps protect the data at rest.

    What are the limits of running locally?

    The main limitations are speed and memory compared to cloud AI. A phone is not as powerful as a datacenter GPU. This means generation will be slower (a few tokens per second on big models) and very large models (like 70B parameters) are not feasible on mobile yet. However, within those constraints, Noema can do a lot:

    • With 4B to 7B models, you can get decent interactive speeds (1-4 words per second).
    • There are no hard usage limits – since it's offline, you're not rate-limited or restricted by tokens. You can run as much as your battery and patience allow.
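    The claim that 70B-parameter models are not feasible on mobile is easy to sanity-check with back-of-envelope arithmetic. The sketch below is a rough estimate (not a figure from Noema's documentation): it approximates a quantized model's in-memory size as parameters × bits-per-weight ÷ 8, ignoring KV-cache and runtime overhead.

```python
def model_size_gb(params_billion: float, bits_per_weight: float) -> float:
    """Rough size of a quantized model: parameters * bits / 8, in decimal GB.

    Ignores KV-cache and runtime overhead, so real memory use is somewhat higher.
    """
    return params_billion * 1e9 * bits_per_weight / 8 / 1e9

# At 4-bit quantization:
print(round(model_size_gb(7, 4), 1))   # 3.5  -> fits in phone RAM
print(round(model_size_gb(13, 4), 1))  # 6.5  -> borderline on an 8GB device
print(round(model_size_gb(70, 4), 1))  # 35.0 -> far beyond any current mobile device
```

    The same arithmetic explains why a more aggressive quantization (4-bit instead of 5- or 6-bit) can make an otherwise too-large model loadable.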

    Is Noema free?

    Noema is free forever for all core features including local AI chat, model downloads, and dataset integration. There is an optional subscription available that removes web search limits and unlocks unlimited online features for power users who need extensive web integration.

    Model XYZ isn't working / won't load – why?

    A few troubleshooting tips:

    • Check device memory. If a model fails to load, the most common cause is insufficient free RAM. On iPhone and iPad, closing background apps can free memory. If you try a 13B model on an 8GB device, it will crash – in that case use a 7B, or a more heavily quantized 13B (like 5-bit or 4-bit).
    • Some models have specific quirks (e.g., requires a special tokenizer). Noema handles this for most models, but very unusual architectures are not supported.
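    To decide in advance whether a given model file will load, you can compare its size with the RAM the system will realistically grant the app. The helper below is a hypothetical sketch: the 60% usable-RAM fraction is an assumption for illustration, not a documented iOS limit.

```python
def will_fit(model_file_gb: float, device_ram_gb: float,
             usable_fraction: float = 0.6) -> bool:
    """Guess whether a model plausibly fits in memory.

    usable_fraction is an assumed share of total RAM the OS will let a
    single app use; iOS does not publish an exact figure.
    """
    return model_file_gb <= device_ram_gb * usable_fraction

print(will_fit(3.5, 8))  # True  - a 4-bit 7B model on an 8GB device
print(will_fit(6.5, 8))  # False - a 4-bit 13B model will likely fail to load
```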

    For any model downloaded within Noema, these issues do not occur because the app filters to supported models. It's mainly when sideloading that you must be careful.

    The answers are wrong or nonsensical. Is something broken?

    Not broken; remember these models have limitations. Always verify critical information. If accuracy is low:

    • Try a better model (larger, or fine-tuned for your task).
    • If it was supposed to use your dataset but didn't, ensure the dataset was selected and actually contains the information (the query may not have matched any chunk well, or the embedding may have missed the relevant passage).
    • Try rephrasing the question or providing more context.
    • Check whether the model hallucinated – if so, consider enabling web search or using a dataset with the factual info.