By itself, an LLM is just a block of data. The software required to perform inference with that data is typically called a “Backend”.
Some interfaces include a backend (e.g. LocalTavern); others are strictly a “Frontend” and require a separate “Backend” to perform inference (e.g. SillyTavern).
Finally, most interfaces can connect to 3rd-party APIs that provide the backend for you. This is incredibly useful when you want to run models that require more hardware than you have available locally.
These backends run locally, so their capabilities are directly limited by the quality of the hardware you have available. Additionally, each of them requires a separate model file to operate.
Each of the following performs inference without needing additional software. User-friendliness is not the first priority with these utilities.
These tools provide a user-friendly layer that handles both the backend and engine management. If you're not sure what to pick, this is a good place to start.
| Name | Notes |
|---|---|
| **Oft-Recommended Managers** | |
| text-generation-webui (Oobabooga) | Offers all other engines here and more. |
| koboldcpp | Good UI, koboldcpp engine only. |
| **Other Managers** | |
| LocalAI | Provides OpenAI-compatible API. |
| ollama | Wraps llama.cpp. |
| tabbyAPI | Official API server for ExLlama engines. |
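Several of these managers (LocalAI and ollama among them) expose an OpenAI-compatible HTTP API, so any frontend that speaks that format can use them interchangeably. As a minimal sketch, here is how a chat-completions request to such a local backend might be built; the port (ollama's documented default, 11434) and the model name are assumptions — adjust them to whatever your manager actually serves.

```python
import json
import urllib.request

# Assumed local endpoint: ollama's OpenAI-compatible API defaults to
# port 11434; LocalAI uses a different port. Check your manager's docs.
BASE_URL = "http://localhost:11434/v1"

def build_chat_request(model: str, prompt: str) -> urllib.request.Request:
    """Build an OpenAI-compatible chat-completions request for a local backend."""
    payload = {
        "model": model,  # name of the model your backend has loaded
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 128,
    }
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_chat_request("llama3", "Hello!")
# Actually sending the request requires a running backend:
# with urllib.request.urlopen(req) as resp:
#     reply = json.load(resp)["choices"][0]["message"]["content"]
```

Because the request shape is standardized, pointing a frontend at a different manager is usually just a matter of changing the base URL.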
These are essentially remote backends. Everything you send and receive is, at minimum, available to the provider(s). Censorship is often encountered to varying degrees.
| Name | Notes |
|---|---|
| AI Horde | Free, with limited performance and models. |
| OpenRouter | Large model selection. Low(er) cost. |
| mancer | Low/no censorship. Free tier available. |
| NovelAI | Low/no censorship. |
| Pollinations | Free tier available with ads. |
Additionally, most commercial APIs, such as ChatGPT, Claude, and Perplexity, can also be used.
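Switching from a local backend to a remote provider is typically just a change of base URL plus an API key. As an illustrative sketch (the base URL and model name below are assumptions — consult the provider's documentation for current values), a request to an OpenAI-compatible remote API such as OpenRouter might look like:

```python
import json
import urllib.request

# Assumed values for illustration only: OpenRouter advertises an
# OpenAI-compatible API, but verify the URL and model names yourself.
BASE_URL = "https://openrouter.ai/api/v1"
API_KEY = "sk-or-..."  # placeholder; never hard-code a real key

payload = {
    "model": "meta-llama/llama-3-8b-instruct",  # hypothetical model id
    "messages": [{"role": "user", "content": "Hello!"}],
}
req = urllib.request.Request(
    f"{BASE_URL}/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Content-Type": "application/json",
        # Unlike a local backend, remote providers require authentication:
        "Authorization": f"Bearer {API_KEY}",
    },
    method="POST",
)
# Sending requires a valid key and network access:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```

Remember that, as noted above, everything in such a request is visible to the provider.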
See also: SillyTavern's page on API Connections.