FigJam Diagram: Open WebUI — Strommen AI Chat Gateway (expires 2026-04-13)
Self-hosted AI chat interface ("Strommen AI") backed by LiteLLM proxy routing to AWS Bedrock, Anthropic direct, and OpenRouter. Deployed via Helm in the open-webui namespace.
| Key | Value |
|---|---|
| Internal URL | https://chat.k3s.internal.strommen.systems |
| Public URL | https://chat.k3s.strommen.systems |
| Namespace | open-webui |
| Chart | open-webui/open-webui v12.3.0 (app v0.8.3) |
| App Name | Strommen AI |
| Default Model | claude-sonnet |
| Storage | 5Gi longhorn-encrypted PVC |
Auth: The public URL is protected by Authentik forwardAuth (openclaw-chat-public IngressRoute in the public-ingress namespace, migrated from OAuth2 Proxy on 2026-04-04). Users authenticate via Google OAuth2 through Authentik. Open WebUI itself has signup and password login disabled — Authentik controls all access.
| Route | Auth Method |
|---|---|
| chat.k3s.internal.strommen.systems | None (cluster-internal only) |
| chat.k3s.strommen.systems | Authentik forwardAuth (openclaw-chat-public IngressRoute) — migrated 2026-04-04 |
Access is controlled via Authentik's app-openclaw group. Google account must be in the group to authenticate.
- ENABLE_SIGNUP: "false" — no self-registration
- ENABLE_LOGIN_FORM: "false" — no username/password login
- ENABLE_ADMIN_CHAT_ACCESS: "false" — admin cannot read user chats

Legacy note: Prior to 2026-04-04, chat.k3s.strommen.systems used OAuth2 Proxy with trusted-header SSO (WEBUI_AUTH_TRUSTED_EMAIL_HEADER: X-Auth-Request-Email). The OAuth2 Proxy deployment is scheduled for removal on 2026-04-11 after a 7-day soak. After removal, update this page.
LiteLLM runs at litellm.open-webui.svc.cluster.local:4000 with an OpenAI-compatible API. All services calling LiteLLM must use the alias names below — never raw model IDs (e.g. anthropic.claude-sonnet-4-5 is not valid in manifests; use claude-sonnet). Per-token prices are set at 110% of actual cost (a 10% margin) for spend tracking.
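The alias rule can be enforced in calling services with a small guard; a minimal sketch, where the helper is illustrative and the alias set mirrors the tables below (the authoritative list lives in the LiteLLM ConfigMap):

```python
# Sketch: reject raw provider model IDs before a request reaches LiteLLM.
# The alias set mirrors the model tables on this page; the helper name
# is hypothetical, not part of the deployment.
ALLOWED_ALIASES = {
    "llama-3.1-8b", "qwen2.5-coder-7b",
    "claude-sonnet", "claude-haiku", "claude-opus",
    "nova-micro", "nova-lite", "nova-canvas", "nova-reel", "titan-embed",
    "or-deepseek-r1", "or-deepseek-v3", "or-gemini-flash", "or-gemini-lite",
    "or-llama-maverick", "or-qwen-flash", "or-devstral", "or-free",
}

def validate_model(model: str) -> str:
    """Return the model name if it is an approved alias, else raise."""
    if model not in ALLOWED_ALIASES:
        raise ValueError(
            f"{model!r} is not a LiteLLM alias; raw provider IDs "
            "(e.g. anthropic.claude-sonnet-4-5) are rejected"
        )
    return model
```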
| Model ID | Backend | Description |
|---|---|---|
| llama-3.1-8b | Ollama (LAN) | General reasoning, Q&A, summarization |
| qwen2.5-coder-7b | Ollama (LAN) | Code generation, refactoring |
| Model ID | Description |
|---|---|
| claude-sonnet | Claude Sonnet 4.6 — complex reasoning, code review |
| claude-haiku | Claude Haiku 4.5 — fast, cost-efficient tasks |
| claude-opus | Claude Opus 4.6 — highest capability, most expensive |
| Model ID | Description |
|---|---|
| nova-micro | Amazon Nova Micro — ultra-cheap classification |
| nova-lite | Amazon Nova Lite — balanced cost/quality |
| nova-canvas | Amazon Nova Canvas — image generation ($0.04/image) |
| nova-reel | Amazon Nova Reel — video generation |
| titan-embed | Titan Embed Text v1 — 1536-dim embeddings |
| Model ID | Description |
|---|---|
| or-deepseek-r1 | DeepSeek R1 0528 — reasoning fallback |
| or-deepseek-v3 | DeepSeek V3.2 — budget general-purpose |
| or-gemini-flash | Gemini 3 Flash — fast, mid-cost |
| or-gemini-lite | Gemini 3.1 Flash Lite — ultra-budget, 1M context |
| or-llama-maverick | Llama 4 Maverick — open-weight, 1M context |
| or-qwen-flash | Qwen3.5 Flash — ultra-budget, 1M context |
| or-devstral | Devstral — code-focused, 262K context |
| or-free | Zero-cost for background jobs |
Image generation is enabled via Nova Canvas through LiteLLM. Requests route through http://litellm.open-webui.svc.cluster.local:4000/v1 with model nova-canvas and 1024x1024 output.
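Since LiteLLM exposes an OpenAI-compatible API, the image request follows the OpenAI Images API shape; a minimal sketch of the payload, assuming the standard /v1/images/generations endpoint and field names (not confirmed against the manifests):

```python
# Sketch: build the image-generation payload a client would POST to
# LiteLLM at {LITELLM_BASE}/images/generations. Field names follow the
# OpenAI Images API; the helper itself is illustrative.
import json

LITELLM_BASE = "http://litellm.open-webui.svc.cluster.local:4000/v1"

def image_request(prompt: str, size: str = "1024x1024") -> dict:
    """Payload for an OpenAI-style image generation call."""
    return {
        "model": "nova-canvas",  # alias, never the raw Bedrock model ID
        "prompt": prompt,
        "size": size,
        "n": 1,
    }

payload = image_request("a watercolor lighthouse")
print(json.dumps(payload))
```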
| Env Var | Value | Purpose |
|---|---|---|
| WEBUI_AUTH | "true" | Auth enabled |
| ENABLE_SIGNUP | "false" | No self-registration |
| ADMIN_EMAIL | mstrommen@gmail.com | First-user admin |
| DEFAULT_MODELS | claude-sonnet | Default model on load |
| WEBUI_AUTH_TRUSTED_EMAIL_HEADER | X-authentik-email | Authentik email passthrough (changed from X-Auth-Request-Email on 2026-04-05) |
| ENABLE_FORWARD_USER_INFO_HEADERS | "true" | Per-user cost tracking via LiteLLM |
| ENABLE_IMAGE_GENERATION | "true" | Image gen via Nova Canvas |
| RAG_EMBEDDING_ENGINE | openai (empty model) | RAG disabled — prevents OOM from local embedding model download. Set RAG_EMBEDDING_MODEL to nomic-embed-text if RAG is needed later. |
| ENABLE_ADMIN_CHAT_ACCESS | "false" | Privacy: admin cannot read user chat history |
| ENABLE_ADMIN_EXPORT | "false" | Privacy: admin cannot export user data |
| ENABLE_COMMUNITY_SHARING | "false" | No external sharing |
Full values: kubernetes/apps/open-webui/open-webui-values.yaml
| Secret | Keys | Purpose |
|---|---|---|
| litellm-secrets | anthropic-api-key, openrouter-api-key, AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY | LiteLLM model credentials |
Bootstrap:
kubectl create secret generic litellm-secrets -n open-webui \
--from-literal=anthropic-api-key=<sk-ant-...> \
--from-literal=openrouter-api-key=<sk-or-...> \
--from-literal=AWS_ACCESS_KEY_ID=<bedrock-iam-key> \
--from-literal=AWS_SECRET_ACCESS_KEY=<bedrock-iam-secret>
| Job | Schedule (UTC) | Purpose |
|---|---|---|
| morning-mood-boost | Daily 11:00 | Good news brief via NewsAPI → Slack/chat |
| aws-daily-brief | Daily 13:00 | AWS cost report → Slack |
| nightly-wiki-review | Daily 04:30 | Wiki.js documentation audit |
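The table's daily times map directly to Kubernetes cron expressions; a sketch of the implied schedule fields (expressions inferred from the stated times, not copied from the CronJob manifests):

```python
# Sketch: the `schedule` fields implied by the CronJob table above,
# in standard cron syntax (minute hour day-of-month month day-of-week),
# all UTC. Values are assumptions derived from the stated times.
SCHEDULES = {
    "morning-mood-boost": "0 11 * * *",   # daily 11:00 UTC
    "aws-daily-brief": "0 13 * * *",      # daily 13:00 UTC
    "nightly-wiki-review": "30 4 * * *",  # daily 04:30 UTC
}

def cron_hour_minute(expr: str) -> tuple[int, int]:
    """Extract (hour, minute) from a simple daily cron expression."""
    minute, hour, *_ = expr.split()
    return int(hour), int(minute)
```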
Observability:
- /health on port 8080 via pod annotations
- /metrics on port 4000 (spend tracking, token counts, per-model latency)
- Grafana dashboard: kubernetes/apps/open-webui/grafana-dashboard.yaml

Files (kubernetes/apps/open-webui/):
- open-webui-values.yaml — Helm values (chart, auth, models, env vars)
- litellm.yaml — LiteLLM Deployment, ConfigMap (model config), Service
- longhorn-encrypted.yaml — StorageClass for encrypted PVC
- grafana-dashboard.yaml — LiteLLM spend/latency Grafana dashboard

Related CronJob manifests:
- kubernetes/apps/morning-mood-boost/cronjob.yaml
- kubernetes/apps/mat-claw-cost-reporter/cronjobs.yaml
- kubernetes/apps/nightly-wiki-review/cronjob.yaml