Date Created: 2025-05-05
By: 16BitMiker
OpenWebUI offers a powerful and intuitive interface for interacting with large language models (LLMs), whether you're running them locally via Ollama or accessing cloud-based services like OpenAI. But what really sets it apart is its extensibility: you're not limited to the browser UI. OpenWebUI provides a robust API for customizing assistants, integrating tools, and enabling Retrieval-Augmented Generation (RAG).
In this post, we'll walk through how to build a RAG-enabled assistant in OpenWebUI using both local and cloud models. Then, we'll show how to access it programmatically using the OpenAI-compatible API.
Let's dive in. 🚀
In OpenWebUI, a "custom assistant" is a user-defined configuration that wraps around a base model. Think of it as an intelligent persona with memory, tools, and specific knowledge sources.
Each assistant can include:
Base Model: Local (e.g., LLaMA 3 via Ollama) or remote (e.g., OpenAI's GPT-4).
System Prompt: Defines tone, behavior, and role.
Knowledge Bases: Enable RAG by adding relevant documents.
Tool Integrations: Access web search, code execution, or custom plugins.
Model Parameters: Control creativity, response length, and repetition.
Prompt Suggestions: Offer pre-defined queries to streamline UX.
Once defined, the assistant can be used through the UI or via API.
OpenWebUI is backend-agnostic โ you can connect it to local or cloud-hosted models.
Ollama makes it easy to run models like LLaMA 3, Mistral, or Qwen entirely on your machine.
✅ To get started:
# Install Ollama from https://ollama.com/download
ollama run llama3
OpenWebUI will auto-detect your local Ollama instance (typically available at http://localhost:11434) and make it available as a model backend.
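If auto-detection doesn't pick it up, you can confirm Ollama is reachable by querying its tags endpoint, which lists the models you've pulled locally. A minimal sketch in Python, assuming Ollama's default port:

import requests

# Ollama's default local endpoint; adjust if you changed the port
OLLAMA_URL = "http://localhost:11434"

# /api/tags lists the models Ollama currently has available
response = requests.get(f"{OLLAMA_URL}/api/tags")
response.raise_for_status()

for model in response.json().get("models", []):
    print(model["name"])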
🏠️ Ideal for: Air-gapped environments, privacy-conscious deployments, and tinkering.
If you want access to high-performance models like GPT-4 or Claude, you can connect OpenWebUI to any OpenAI-compatible API.
✅ Example: Set up OpenAI
Go to Workspace > Models
Click + Create Model
Choose "OpenAI-Compatible"
Enter your endpoint (e.g., https://api.openai.com/v1) and your API key
OpenWebUI will treat this like any other backend, standardizing interactions.
📦 This same approach works with self-hosted APIs like vLLM, LM Studio, or FastChat.
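Whichever backend you choose, it can help to confirm the endpoint and key actually respond before wiring them into OpenWebUI. A minimal sketch that lists models from any OpenAI-compatible API (the URL and key below are placeholders):

import requests

# Any OpenAI-compatible base URL works here (OpenAI, vLLM, LM Studio, etc.)
BASE_URL = "https://api.openai.com/v1"  # placeholder
API_KEY = "your-api-key"                # placeholder

response = requests.get(
    f"{BASE_URL}/models",
    headers={"Authorization": f"Bearer {API_KEY}"},
)
response.raise_for_status()

for model in response.json()["data"]:
    print(model["id"])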
RAG enhances your assistant by grounding its answers in your documents. OpenWebUI embeds these documents and indexes them for semantic retrieval.
✅ To add a knowledge base:
Go to Workspace > Knowledge
Click + Create Knowledge Base
Give it a name and description
Upload PDFs, Markdown, or plain text files
OpenWebUI will process them into embeddings using its vector store backend
📚 These documents will be automatically referenced by the assistant during conversations.
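You can also populate a knowledge base programmatically. The sketch below assumes your OpenWebUI version exposes the documented file-upload and knowledge routes (/api/v1/files/ and /api/v1/knowledge/{id}/file/add) and uses the API key we'll generate later in this post; the knowledge ID and file name are placeholders:

import requests

BASE_URL = "http://localhost:3000"
API_KEY = "your-api-key"            # placeholder
KNOWLEDGE_ID = "your-knowledge-id"  # placeholder
headers = {"Authorization": f"Bearer {API_KEY}"}

# 1) Upload the document so OpenWebUI can embed it
with open("vpn-policy.pdf", "rb") as f:  # hypothetical document
    upload = requests.post(f"{BASE_URL}/api/v1/files/", headers=headers, files={"file": f})
upload.raise_for_status()
file_id = upload.json()["id"]

# 2) Attach the uploaded file to the knowledge base
attach = requests.post(
    f"{BASE_URL}/api/v1/knowledge/{KNOWLEDGE_ID}/file/add",
    headers=headers,
    json={"file_id": file_id},
)
attach.raise_for_status()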
Now that your model backend and knowledge base are ready, let's create a custom assistant.
Navigate to Workspace > Models
Click + Create a model
Model Name: Descriptive name for the assistant (e.g., HR-Bot, Legal Counsel)
Base Model: Choose the model backend (e.g., llama3, gpt-4)
System Prompt: Define role and behavior (e.g., "You are a helpful HR assistant...")
Avatar Image: Optional image for UI
Prompt Suggestions: Preload useful queries like "What is our remote work policy?"
Under the "Knowledge" tab, link the knowledge base you created earlier.
Add tools like:
Web search
Code execution
Custom API calls
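Custom tools are small Python scripts registered under Workspace > Tools: OpenWebUI exposes the methods of a Tools class to the model, using their type hints and docstrings as the tool description. A minimal sketch (the exact conventions may vary slightly between versions):

from datetime import datetime, timezone

class Tools:
    def get_utc_time(self) -> str:
        """Return the current UTC time as an ISO-8601 string."""
        # The type hint and docstring describe this tool to the model
        return datetime.now(timezone.utc).isoformat()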
Temperature: Controls creativity (0.0 = deterministic, 1.0 = creative)
Max Tokens: Limit response length
Frequency Penalty: Discourage repetition
✅ Save your assistant. It's now available in OpenWebUI and via API.
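When you call the assistant through the API (covered next), these model parameters map onto the standard OpenAI-style request fields, so you can also override them per request. A sketch of such a payload; the values are illustrative and support for each field depends on the backend:

payload = {
    "model": "your-custom-model-id",
    "messages": [{"role": "user", "content": "Summarize our leave policy."}],
    "temperature": 0.2,        # lower = more deterministic
    "max_tokens": 512,         # cap the response length
    "frequency_penalty": 0.5,  # discourage repetition
}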
To call your assistant from code, you'll need an access token.
Go to Settings > Account > API Keys
Click ➕ to generate a key
Copy and store it securely
🔑 This API key will be used as a Bearer token in your HTTP requests.
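A quick way to confirm the key works, and to find the model ID you'll need in the examples below, is to list the models the server exposes. A minimal sketch, assuming the standard /api/models route returns an OpenAI-style list:

import requests

API_KEY = "your-api-key"  # the key generated above
headers = {"Authorization": f"Bearer {API_KEY}"}

response = requests.get("http://localhost:3000/api/models", headers=headers)
response.raise_for_status()

# Custom assistants appear alongside the base models they wrap
for model in response.json()["data"]:
    print(model["id"])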
OpenWebUI exposes an OpenAI-style endpoint:
curl -X POST http://localhost:3000/api/chat/completions \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "your-custom-model-id",
"messages": [
{
"role": "user",
"content": "What is our VPN access policy?"
}
]
}'
📌 Notes:
Replace "your-custom-model-id"
with the assistantโs ID
This endpoint is compatible with OpenAI clients and SDKs
Assistant settings (prompt, tools, knowledge) are applied automatically
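Because the endpoint mirrors OpenAI's, you can also point the official openai Python SDK at OpenWebUI instead of hand-rolling HTTP requests. A minimal sketch, assuming the base URL http://localhost:3000/api:

from openai import OpenAI

# Point the SDK at OpenWebUI instead of api.openai.com
client = OpenAI(
    base_url="http://localhost:3000/api",  # OpenWebUI's OpenAI-compatible prefix
    api_key="your-api-key",                # the key from Settings > Account > API Keys
)

completion = client.chat.completions.create(
    model="your-custom-model-id",
    messages=[{"role": "user", "content": "What is our VPN access policy?"}],
)
print(completion.choices[0].message.content)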
For real-time interaction (e.g., live chat):
{
"model": "your-custom-model-id",
"messages": [ ],
"stream": true
}
Use a client that supports Server-Sent Events (SSE) to process streamed tokens.
📌 Great for dashboards, bots, and typing animations.
Here's a simple way to query your assistant from Python:
import requests
# Your OpenWebUI API key and assistant model ID
API_KEY = "your-api-key"
MODEL_ID = "your-custom-model-id"
# HTTP headers including authorization
headers = {
"Authorization": f"Bearer {API_KEY}",
"Content-Type": "application/json"
}
# The payload to send to the assistant
payload = {
"model": MODEL_ID,
"messages": [
{"role": "user", "content": "Explain our password policy."}
]
}
# Make the POST request to OpenWebUI's chat completions endpoint
response = requests.post("http://localhost:3000/api/chat/completions", headers=headers, json=payload)
response.raise_for_status()

# The endpoint returns an OpenAI-style chat completion object
data = response.json()
print(data["choices"][0]["message"]["content"])
📌 This works whether your assistant is backed by a local model (via Ollama) or a cloud model (like OpenAI's GPT).
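To consume the streaming variant shown earlier ("stream": true), read the response line by line. A minimal sketch, assuming OpenWebUI follows the OpenAI SSE convention of data: JSON chunks terminated by data: [DONE]; the key and model ID are the same placeholders as before:

import json
import requests

API_KEY = "your-api-key"
MODEL_ID = "your-custom-model-id"
headers = {
    "Authorization": f"Bearer {API_KEY}",
    "Content-Type": "application/json"
}

payload = {
    "model": MODEL_ID,
    "messages": [{"role": "user", "content": "Explain our password policy."}],
    "stream": True,
}

with requests.post(
    "http://localhost:3000/api/chat/completions",
    headers=headers, json=payload, stream=True,
) as response:
    response.raise_for_status()
    for line in response.iter_lines():
        # SSE frames look like: data: {...}
        if not line or not line.startswith(b"data: "):
            continue
        chunk = line[len(b"data: "):]
        if chunk == b"[DONE]":
            break
        delta = json.loads(chunk)["choices"][0]["delta"]
        # Each chunk carries a small piece of the reply
        print(delta.get("content", ""), end="", flush=True)
    print()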
OpenWebUI + API opens up a wide range of possibilities:
👥 Internal Tools:
HR assistant with policy documents
Legal bot with contract templates
IT helpdesk with workflow guides
🔌 System Integrations:
Slack bots
Web dashboards
CLI tools
🏠️ Offline-First:
Fully air-gapped deployments using Ollama
No cloud dependency
OpenWebUI transforms how you interact with LLMs, whether through its sleek UI or its programmable API. By combining local and cloud models, integrating RAG, and enabling tool use, it becomes a powerful development environment for intelligent assistants.
With just a few steps, you can create domain-specific assistants that are fast, private, and deeply integrated into your workflows.
✅ Whether you're building a chatbot, automating support, or querying internal documents, OpenWebUI provides the flexibility and control you need.
Happy building! 🧠💻