AI Provider for llama.cpp
This plugin provides llama.cpp integration for the WordPress AI Client. It lets WordPress sites use large language models served by a llama.cpp server for text generation and other AI capabilities.
llama.cpp exposes an OpenAI-compatible API, and this provider uses that API to communicate with any GGUF model loaded into your llama.cpp server.
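Because the API is OpenAI-compatible, you can exercise it directly with curl before involving WordPress at all. This is only a sketch against the default local address; the model name is a placeholder for whatever your server actually has loaded.
```sh
# Send a minimal chat completion request to the llama.cpp server's
# OpenAI-compatible endpoint (default local address; adjust host/port as needed).
# "tinyllama" is a placeholder model name -- use one your server reports.
curl http://127.0.0.1:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "tinyllama",
    "messages": [
      {"role": "user", "content": "Say hello from WordPress."}
    ]
  }'
```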
Features:
- Text generation with any llama.cpp-loaded model
- Automatic model discovery from your llama.cpp server (see the sketch below this list)
- Function calling support
- Structured output (JSON mode) support
- Settings page to configure the server URL (default: http://127.0.0.1:8080)
- Works without an API key for local instances
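The model discovery mentioned above maps onto the server's standard model-listing endpoint. A minimal sketch, assuming the default local address (the provider presumably reads this same endpoint):
```sh
# Ask the llama.cpp server which models it currently exposes -- the
# OpenAI-compatible listing endpoint that model discovery is based on.
curl http://127.0.0.1:8080/v1/models
```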
Requirements:
- PHP 7.4 or higher
- The WordPress AI Client plugin, installed and activated
- A llama.cpp server running locally or on a remote host
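A quick way to check the PHP and server requirements from a shell, assuming the server runs locally on the default port:
```sh
# PHP 7.4 or higher is required.
php -v

# llama-server exposes a /health endpoint; it typically returns a small
# JSON status once the server is up and a model is loaded.
curl http://127.0.0.1:8080/health
```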
Getting Started
What do I need before using this plugin?
You need a running llama.cpp server (local or remote) and the WordPress AI Client plugin installed and activated.
How do I install llama.cpp?
On macOS, run brew install llama.cpp. For other platforms, see the official docs at https://llama-cpp.com/download/.
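For the Homebrew route, the install plus a quick check that the server binary is on your PATH looks like this:
```sh
# Install llama.cpp (includes the llama-server binary) via Homebrew.
brew install llama.cpp

# Confirm the server binary is available; this prints usage and exits.
llama-server --help
```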
Where do I get a model?
Download a GGUF model from Hugging Face. TinyLlama 1.1B (Q4_K_M, ~636 MB) is a good starting point for testing.
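One way to fetch such a model from the command line is shown below. The Hugging Face repository and filename are an assumption here (one commonly used TinyLlama 1.1B GGUF upload); substitute whichever GGUF model you choose.
```sh
# Create a models directory and download a small GGUF into it.
# Repository and filename below are illustrative -- use any GGUF you prefer.
mkdir -p ~/models
curl -L -o ~/models/tinyllama-1.1b-chat-v1.0.Q4_K_M.gguf \
  https://huggingface.co/TheBloke/TinyLlama-1.1B-Chat-v1.0-GGUF/resolve/main/tinyllama-1.1b-chat-v1.0.Q4_K_M.gguf
```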
How do I start the server?
Run llama-server --models-dir ~/models. The server listens on http://127.0.0.1:8080 by default.
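Putting the two previous steps together (the --models-dir flag is taken from the command above; check llama-server --help if your build expects different options):
```sh
# Start the server against the models directory from the previous step.
llama-server --models-dir ~/models

# In another terminal, confirm it is up on the default address.
curl http://127.0.0.1:8080/v1/models
```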
How do I connect WordPress to my llama.cpp server?
Go to Settings > llama.cpp and enter your server URL. Leave it blank to use the default (http://127.0.0.1:8080).
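If the server runs on a different machine, it can help to confirm that the URL you are about to enter is reachable from the WordPress host before saving the setting. The hostname below is only an example:
```sh
# Run this on the WordPress host, replacing the example hostname and port
# with the URL you plan to enter under Settings > llama.cpp.
curl http://llama.example.internal:8080/v1/models
```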
For full step-by-step setup instructions, see the Getting Started Guide.
