Z AI Character Card Wiki

The Last Letter in Personalized Artificial Intelligence



Backends

By itself, an LLM is just a block of data. The software required to perform inference with that data is typically called a “Backend”.

Sometimes an interface includes its own backend (e.g. LocalTavern); other times it is strictly a “Frontend” that requires a separate “Backend” to perform inference (e.g. SillyTavern).

Finally, most interfaces can also connect to 3rd-party APIs, which provide the backend for you. This is incredibly useful when you want to run models that require more hardware than you have locally available.

Local Backends

These backends run locally. As a result, their capabilities are directly related to the quality of the hardware you have available.

Additionally, each of these requires a separate model to operate.

Inference Engines

Each of the following performs inference without needing additional software. User-friendliness is not the first priority for these utilities.

| Name      | Desktop Support | Notes                                              |
| llama.cpp | x               | Reference backend; invented the GGUF format.       |
| koboldcpp | x               | Based on llama.cpp; good UI, focused on RP.        |
| ExLlamaV3 | x               | Created the exl3 format; focused on GPU performance. |
| ik_llama  | x               | Improved CPU performance.                          |
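Most of these engines can expose an OpenAI-compatible HTTP API (llama.cpp via its `llama-server` binary, koboldcpp via its built-in server), which is how frontends connect to them. As a rough sketch, assuming a server is already running locally (the port 8080 and model name below are placeholders, not guarantees; check your engine's docs):

```python
import json
import urllib.request

def build_chat_request(base_url, prompt, model="local-model"):
    """Build an OpenAI-style chat completion request for a local backend.

    The endpoint path and payload shape follow the OpenAI chat API that
    llama.cpp's llama-server emulates; base_url and model are placeholders
    for your own setup.
    """
    payload = {
        "model": model,  # many local servers ignore this field
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 128,
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# Example (requires a running backend, e.g. `llama-server -m model.gguf`):
# req = build_chat_request("http://127.0.0.1:8080", "Hello!")
# with urllib.request.urlopen(req) as resp:
#     reply = json.loads(resp.read())["choices"][0]["message"]["content"]
```

Because the request shape is the same across engines, switching backends is usually just a matter of changing the base URL.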

Engine Managers

These tools are designed to provide a user-friendly layer which handles backend needs and engine management simultaneously. If you're not sure what to pick, this is a good place to start.

| Name      | Desktop Support       | Notes |
| llama.cpp | macOS, Linux, Windows |       |
| koboldcpp | macOS, Linux, Windows |       |

Note: This list does not include backends that can directly accept Character Cards; those are considered interfaces.

3rd-Party API Providers

These are essentially remote backends. Everything you send and receive is, at minimum, available to the provider(s).

  • OpenRouter
  • AI Horde
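Connecting to one of these usually means pointing your frontend at an OpenAI-compatible endpoint plus an API key. As an illustrative sketch (the base URL follows OpenRouter's OpenAI-compatible API, but the model slug and key below are placeholders; verify both against the provider's documentation):

```python
import json
import urllib.request

def build_provider_request(api_key, prompt,
                           base_url="https://openrouter.ai/api/v1",
                           model="mistralai/mistral-7b-instruct"):
    """Build a chat request for an OpenAI-compatible remote provider.

    The model slug is an example and may not be current -- pick one from
    the provider's model list. Remember: everything in `prompt` (including
    your character card) is visible to the provider.
    """
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",  # your real key goes here
        },
        method="POST",
    )
```

Note that this is the same request shape a local backend accepts; the only differences are the base URL and the API key.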

More information to come.
