FigJam Diagram: RAG Platform — Document Ingestion & Vector Search (expires 2026-04-13)
Qdrant vector database with automated document ingestion pipelines. Provides semantic search and retrieval-augmented generation (RAG) capabilities for OpenClaw and Polymarket Lab.
| Setting | Value |
|---|---|
| Namespace | rag |
| Qdrant endpoint | qdrant.rag.svc.cluster.local:6333 (cluster-internal) |
| Internal URL | https://qdrant.k3s.internal.strommen.systems |
| Embedding model | nomic-embed-text via Ollama at 192.168.1.214:11434 (rag-ingester + rag-bridge) |
| Vector collections | documents, polymarket-markets |
| Storage | Longhorn PVC, 50Gi RWO |
| Qdrant image | qdrant/qdrant:v1.14.1 |
Note: The Ollama instance at 192.168.1.214:11434 is an external host on the LAN (not in the k3s cluster). It runs nomic-embed-text to generate embeddings. The repo-ingester uses a different path; see below.
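
As a rough sketch of what an embedding request to that host might look like, the snippet below builds the JSON body for Ollama's /api/embeddings endpoint and posts it with curl. The OLLAMA_URL variable and the embed_payload and embed helper names are hypothetical and not taken from the ingester code; the host, port, and model name come from the table above.

```shell
# Hypothetical helpers; not part of rag-ingester or rag-bridge.
# The default host and port are the ones listed for the Ollama instance.
OLLAMA_URL="${OLLAMA_URL:-http://192.168.1.214:11434}"

# Build the request body for POST /api/embeddings.
# Escapes backslashes and double quotes only, so pass single-line text.
embed_payload() {
  printf '{"model":"nomic-embed-text","prompt":"%s"}' \
    "$(printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g')"
}

# Send the request; Ollama answers with a JSON object holding an "embedding" array.
embed() {
  curl -s "$OLLAMA_URL/api/embeddings" -d "$(embed_payload "$1")"
}

# Example, run from a machine that can reach the LAN host:
#   embed "How is Qdrant backed up?" | jq '.embedding | length'
```

If the model matches the 768 dimensions listed for the Qdrant collections, the jq expression in the example should report 768.
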
| Resource | Type | Schedule / Port | Purpose |
|---|---|---|---|
| qdrant | StatefulSet (1 replica) | :6333 HTTP, :6334 gRPC | Vector database |
| rag-ingester | CronJob | */10 * * * * | Ingest NFS docs → documents Qdrant collection |
| repo-ingester | CronJob | 0 4 * * * (daily 04:00 UTC) | Ingest git repo → pgvector in openclaw-memory-db |
| rag-bridge | CronJob (polymarket-lab ns) | */15 * * * * | Ingest market data → polymarket-markets Qdrant collection |
repo-ingester does NOT write to Qdrant. It clones the home_k3s_cluster GitHub repo, chunks files, embeds via AWS Bedrock Titan Embed Text V1 (1536-dim), and stores vectors in the rag_documents table in openclaw-memory-db (pgvector). Qdrant's documents collection is only fed by rag-ingester.
- documents: General-purpose document collection. Ingested by rag-ingester (every 10 min) from NFS staging dir.
  - Ingester: rag-ingester
  - Embedding: nomic-embed-text (768 dimensions, Ollama)
- polymarket-markets: Prediction market data for research and strategy. Ingested by the rag-bridge CronJob in the polymarket-lab namespace.
  - Ingester: rag-bridge (every 15 min, from TimescaleDB LATERAL join for latest snapshot per market, top 500 by 24h volume)
  - Embedding: nomic-embed-text (768 dimensions, Ollama)
- rag_documents in openclaw-memory-db (open-webui namespace)
  - Source: home_k3s_cluster GitHub repo (YAML, Markdown, Python, Terraform, shell scripts)

| Secret | Keys | Purpose |
|---|---|---|
| rag-ingester-db-credentials | DATABASE_URL | pgvector connection string to openclaw-memory-db |
| rag-ingester-aws-credentials | AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY | Bedrock Titan Embed access |
| rag-ingester-github-credentials | GITHUB_TOKEN | Optional; only if repo goes private |
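
To confirm which database the DATABASE_URL key points at without printing the password, the value can be decoded and masked before display. This is a minimal sketch: the redact_url helper is hypothetical, and the rag namespace for the Secret is an assumption rather than something stated in the table.

```shell
# Hypothetical helper: mask the password part of a connection URL read from stdin.
redact_url() {
  sed -e 's#\(://[^:/@]*:\)[^@]*@#\1***@#'
}

# Example (assumes the Secret lives in the rag namespace):
#   kubectl get secret -n rag rag-ingester-db-credentials \
#     -o jsonpath='{.data.DATABASE_URL}' | base64 -d | redact_url
```

Only the text between the user name and the @ sign is replaced, so the host, port, and database name stay visible for troubleshooting.
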
| Secret | Keys | Purpose |
|---|---|---|
| polymarket-lab-secrets | POSTGRES_PASSWORD | TimescaleDB connection |
Qdrant uses a Longhorn PVC (storageClassName: longhorn, 50Gi RWO). StatefulSet runs on amd64 nodes.
# Check collection status
kubectl exec -n rag qdrant-0 -- curl -s http://localhost:6333/collections | jq '.result.collections[].name'
# Collection vector counts
kubectl exec -n rag qdrant-0 -- curl -s http://localhost:6333/collections/documents | jq '.result.vectors_count'
kubectl exec -n rag qdrant-0 -- curl -s http://localhost:6333/collections/polymarket-markets | jq '.result.vectors_count'
# Check CronJob status
kubectl get cronjob -n rag
kubectl get pods -n rag --sort-by=.metadata.creationTimestamp | tail -10
# Port-forward to Qdrant dashboard
kubectl port-forward -n rag svc/qdrant 6333:6333
# Then open: http://localhost:6333/dashboard
# Check rag-bridge (polymarket-lab namespace)
kubectl get cronjob -n polymarket-lab rag-bridge
kubectl get pods -n polymarket-lab -l job-name --sort-by=.metadata.creationTimestamp | tail -5
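
With the port-forward to svc/qdrant running, a similarity search against a collection can be sketched as follows. The search_payload and search helpers, the QDRANT_URL variable, and the query-vector.json file are hypothetical names. The query vector must come from the same embedding model as the collection (nomic-embed-text, 768 dimensions), and the request uses Qdrant's points/search REST endpoint.

```shell
# Hypothetical helpers for ad-hoc queries through the port-forward.
QDRANT_URL="${QDRANT_URL:-http://localhost:6333}"

# Build a search body from a JSON array of floats and a result limit (default 5).
search_payload() {
  printf '{"vector":%s,"limit":%d,"with_payload":true}' "$1" "${2:-5}"
}

# POST the body to /collections/<name>/points/search.
# Usage: search <collection> <vector-json> [limit]
search() {
  curl -s -X POST "$QDRANT_URL/collections/$1/points/search" \
    -H 'Content-Type: application/json' \
    -d "$(search_payload "$2" "${3:-5}")"
}

# Example with a query vector stored in a file (placeholder name):
#   search documents "$(cat query-vector.json)" 3 | jq '.result[] | {score, payload}'
```
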
Qdrant is scraped by Prometheus via a ServiceMonitor on :6333/metrics (every 30s, label release: prometheus).
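
The same endpoint can be read by hand when checking what Prometheus sees. The sketch below pulls a single unlabelled sample out of the Prometheus text format; metric_value is a hypothetical helper, and collections_total is used only as an example metric name that should be verified against the raw /metrics output.

```shell
# Hypothetical helper: print the value of the unlabelled sample named $1
# from Prometheus text-format input on stdin. Comment lines have "#" as
# their first field, so they never match.
metric_value() {
  awk -v name="$1" '$1 == name { print $2 }'
}

# Example, with a port-forward to svc/qdrant on 6333 active:
#   curl -s http://localhost:6333/metrics | metric_value collections_total
```
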
CronJob success and failure are tracked via Kubernetes job status. Failed ingestion jobs will:
- Fire the KubeJobFailed Prometheus alert

kubernetes/apps/rag/
00-namespace.yaml — Namespace definition
qdrant.yaml — Qdrant StatefulSet, Services, 50Gi Longhorn PVC, Ingress, ServiceMonitor
nfs-pv.yaml — NFS PersistentVolume for staging docs (NAS /volume1/rag-staging)
ingester.yaml — rag-ingester CronJob (*/10 * * * *)
repo-ingester.yaml — repo-ingester CronJob (Bedrock → pgvector, daily 04:00 UTC)
kubernetes/apps/polymarket-lab/rag-bridge.yaml
— rag-bridge CronJob (*/15 * * * *, polymarket-markets collection)
[NOT in polymarket-lab/kustomization.yaml — apply manually]
polymarket-markets collection; hosts rag-bridge CronJob
documents collection for RAG context