Services Reference#

All services deployed by ArgoCD, with their chart sources, versions, and access methods.

Service catalogue#

Service

Chart / Source

Version

Namespace

Ingress URL

Auth

Purpose

cert-manager

jetstack/cert-manager

v1.19.4

cert-manager

TLS certificate management

cloudflared

Plain manifests

2026.2.0

cloudflared

Cloudflare tunnel connector

echo

Plain manifests

0.9.2

echo

echo.<domain>

None

HTTP echo test service

Grafana + Prometheus

prometheus-community/kube-prometheus-stack

82.4.1

monitoring

grafana.<domain>

OAuth

Monitoring and dashboards

Headlamp

headlamp/headlamp

0.40.0

headlamp

headlamp.<domain>

OAuth

Kubernetes dashboard

ingress-nginx

ingress-nginx/ingress-nginx

4.14.3

ingress-nginx

Ingress controller

kernel-settings

Inline DaemonSet

kube-system

Sysctl tuning for performance

Longhorn

longhorn/longhorn

1.11.0

longhorn

longhorn.<domain>

OAuth

Distributed block storage

oauth2-proxy

oauth2-proxy/oauth2-proxy

7.12.10

oauth2-proxy

oauth2.<domain>

GitHub

OAuth authentication proxy

RKLlama

Helm chart (local)

0.0.4

rkllama

rkllama.<domain>

None

NPU-accelerated LLM server (Rockchip RK1)

llama.cpp

Helm chart (local)

llamacpp

llamacpp.<domain>

CUDA-accelerated LLM server (NVIDIA GPU)

NVIDIA device plugin

nvidia/nvidia-device-plugin

0.18.2

nvidia-device-plugin

Advertises nvidia.com/gpu resources to the scheduler

Open WebUI

open-webui/open-webui

12.5.0

open-webui

open-webui.<domain>

OAuth

ChatGPT-style UI backed by RKLLama and/or llama.cpp

Open Brain MCP

Helm chart (local)

open-brain-mcp

brain.<domain>

OAuth 2.1 (GitHub)

Standalone MCP server for AI memory

Sealed Secrets

bitnami-labs/sealed-secrets

2.18.3

kube-system

Encrypted secrets in Git

Supabase

supabase-community/supabase-kubernetes

supabase

supabase.<domain>, supabase-api.<domain>

OAuth (Studio), x-brain-key (API)

Self-hosted backend-as-a-service platform

Service details#

cert-manager#

Manages TLS certificates via Let’s Encrypt. Uses DNS-01 validation through the Cloudflare API. Includes a ClusterIssuer (letsencrypt-prod) and a SealedSecret for the Cloudflare API token. Resource limits: 50m/128Mi request, 200m/256Mi limit.

Additional manifests: additions/cert-manager/templates/

  • cloudflare-api-token-secret.yaml — SealedSecret for DNS API token

  • issuer-letsencrypt-prod.yaml — ClusterIssuer for production Let’s Encrypt (uses domain_email for ACME notifications)

cloudflared#

Outbound Cloudflare tunnel connector. Runs 2 replicas for availability. Reads the tunnel token from a SealedSecret. Non-root, read-only rootfs. Image pinned to 2026.2.0.

Additional manifests: additions/cloudflared/

  • deployment.yaml — 2-replica Deployment with hardened security context

  • tunnel-secret.yaml — SealedSecret for tunnel token

echo#

Simple HTTP echo service (ealen/echo-server) for testing ingress, TLS, and headers. Exposed publicly via Cloudflare tunnel with ssl-redirect: false. Runs as non-root (65534) with read-only root filesystem. Image pinned to 0.9.2.

Additional manifests: additions/echo/

  • manifests.yaml — Deployment, Service, and Ingress

Grafana + Prometheus (kube-prometheus-stack)#

Full monitoring stack: Prometheus for metrics collection, Grafana for dashboards, Alertmanager for alerts. Protected by OAuth via oauth2-proxy. Longhorn persistent volumes for data (30Gi Grafana, 40Gi Prometheus). Grafana resource limits: 100m/256Mi request, 500m/512Mi limit.

Uses ServerSideApply=true sync option due to large CRDs.

Headlamp#

Modern Kubernetes dashboard. Protected by OAuth via oauth2-proxy. Uses a ServiceAccount with cluster-admin binding and a long-lived token Secret. Resource limits: 50m/128Mi request, 200m/256Mi limit.

Additional manifests: additions/dashboard/

  • rbac.yaml — ServiceAccount, ClusterRoleBinding, long-lived token Secret

ingress-nginx#

NGINX ingress controller. Admission webhooks are disabled. Resource limits: 100m/256Mi request, 500m/512Mi limit. PodDisruptionBudget: minAvailable 1.

kernel-settings#

DaemonSet that applies system tuning on all nodes:

  • Sets rmem_max and wmem_max to 7500000 (network buffers)

  • Blacklists Longhorn iSCSI devices from multipathd

All busybox images pinned to 1.37.

Longhorn#

Distributed block storage with replication, snapshots, and backup support. Includes a VolumeSnapshotClass for Kubernetes volume snapshots. Web UI protected with OAuth via oauth2-proxy. ServiceMonitor enabled for Prometheus metrics.

Additional manifests: additions/longhorn/

  • volume-snapshot-class.yaml — VolumeSnapshotClass (default, Delete policy)

oauth2-proxy#

Lightweight OAuth authentication proxy. Redirects unauthenticated users to GitHub for login. Integrated with nginx ingress annotations. Resource limits: 10m/64Mi request, 100m/128Mi limit.

Additional manifests: additions/oauth2-proxy/

  • oauth2-proxy-secret.yaml — SealedSecret for GitHub OAuth credentials

See Set Up OAuth Authentication for configuration details.

RKLlama#

NPU-accelerated LLM inference server for Rockchip RK1 nodes. Runs as a DaemonSet on nodes labelled node-type: rk1. Requires privileged access to /dev/rknpu. Models are stored on an NFS PersistentVolume so they are shared across all RK1 nodes and persist outside the cluster. Image pinned to 0.0.4. Startup probe allows up to 5 minutes for model loading.

Helm chart: additions/rkllama/ (local chart, no external registry)

  • templates/configmap.yaml — CPU/NPU governor tuning script and nginx reverse-proxy config

  • templates/daemonset.yaml — Main workload (init + rkllama + nginx sidecar containers)

  • templates/ingress.yaml — Ingress for rkllama.<domain>

  • templates/service.yaml — ClusterIP Service round-robining across DaemonSet pods

  • templates/pv.yaml — NFS PersistentVolume (server/path from values.yaml)

  • templates/pvc.yaml — PersistentVolumeClaim bound to the NFS PV

NFS configuration — edit kubernetes-services/values.yaml (the single source of truth):

rkllama:
  nfs:
    server: 192.168.1.3   # your NAS IP
    path: /bigdisk/LMModels  # your NFS export path

ArgoCD injects these values directly into the rkllama Helm chart. No other file needs changing (see Variables Reference).

llama.cpp (CUDA)#

OpenAI-compatible LLM inference server using llama.cpp with CUDA acceleration. Runs as a single-replica Deployment scheduled exclusively on nodes labelled nvidia.com/gpu.present=true. Requires an NVIDIA GPU node in extra_nodes with nvidia_gpu_node: true in the inventory. Image pinned to server-cuda-b8172. Startup probe allows up to 10 minutes for model loading.

Security context drops all capabilities while retaining GPU access.

Models are stored as GGUF files on an NFS PersistentVolume (a separate subdirectory from RKLLama — the two formats are incompatible). Exposes an OpenAI-compatible /v1 API on port 8080, consumed by Open WebUI.

NFS and model configuration — edit kubernetes-services/values.yaml:

llamacpp:
  nfs:
    server: 192.168.1.3          # your NFS server IP
    path: /bigdisk/LMModels/cuda # separate from rkllama — GGUF files only
  model:
    file: "mistral-7b-instruct-v0.2.Q4_K_M.gguf"
    gpuLayers: 99        # offload all layers to GPU
    contextSize: 8192
    parallel: 4
    memoryLimit: "24Gi"

See llama.cpp CUDA Models for how to download models to the NFS share.

NVIDIA device plugin#

DaemonSet that detects NVIDIA GPUs and advertises nvidia.com/gpu resources to the Kubernetes scheduler. Schedules on nodes with label nvidia.com/gpu.present=true (applied by the k3s Ansible role for nvidia_gpu_node hosts). Once running, it also sets the nvidia.com/gpu.present label and the nvidia.com/gpu allocatable resource on the node.

Requires the NVIDIA container runtime to be configured in k3s’s containerd. The update_packages role writes a config.toml.tmpl that sets the NVIDIA runtime as default and survives k3s-agent restarts.

Open WebUI#

ChatGPT-style web interface for interacting with LLMs. Protected by OAuth via oauth2-proxy. Connects to both:

  • RKLLama (Ollama-compatible API) on the RK1 NPU — via ollamaUrls

  • llama.cpp (OpenAI-compatible API) on an NVIDIA GPU — via openaiBaseApiUrl

Models from both backends appear merged in the model dropdown. Stores chat history and user accounts in a Longhorn-backed 5Gi volume. Resource limits: 100m/256Mi request, 500m/1Gi limit.

Note

Either backend is optional. The service works with just RKLLama (RK1 cluster), just llama.cpp (NVIDIA GPU node), or both simultaneously.

The first user to register becomes the admin. RK1 models appear after being pulled via rkllama-pull (see Download RKLLama Models). CUDA models appear as soon as the GGUF file is present on the NFS share and llamacpp has loaded it (see llama.cpp CUDA Models).

Open Brain MCP#

Standalone MCP (Model Context Protocol) server providing persistent AI memory for Claude.ai and Claude Code. Authenticates via OAuth 2.1 with GitHub as identity provider. Connects directly to the Supabase PostgreSQL database for thought storage and to Supabase Storage (MinIO) for file attachments.

Five tools: capture_thought (text-only), search_thoughts, list_thoughts, thought_stats, get_attachment (base64 file retrieval from MinIO).

A local stdio MCP server (open-brain-cli/) extends this with upload_attachment and download_attachment for Claude Code use.

Additional manifests: additions/open-brain-mcp/

  • templates/open-brain-mcp-secret.yaml — SealedSecret for GitHub OAuth credentials, DB URL, and JWT secret

See Open Brain (AI Memory) for deployment and Connect Claude.ai to Cluster Services via MCP for connecting Claude.ai.

Sealed Secrets#

Bitnami Sealed Secrets controller. Installed in kube-system namespace. Decrypts SealedSecret resources into regular Kubernetes Secrets.

Supabase#

Self-hosted Supabase platform providing PostgreSQL, authentication, REST API (PostgREST), realtime subscriptions, edge functions, storage, and an admin dashboard (Studio). Used as the backend for Open Brain AI memory.

Components: db, auth, rest, realtime, storage, functions, studio, kong, meta, minio (10 pods total). All pods scheduled on x86/amd64 nodes.

MinIO provides S3-compatible object storage for file attachments, backed by a Longhorn PVC. Studio is protected by OAuth via oauth2-proxy.

Additional manifests: additions/supabase/

  • templates/supabase-secret.yaml — SealedSecret for all Supabase credentials (JWT, DB password, API keys, MinIO credentials)

See Open Brain (AI Memory) for deployment.

ArgoCD itself#

ArgoCD is not in the kubernetes-services/templates/ directory — it is installed directly by the Ansible cluster role using the OCI Helm chart (v7.8.3). It is the foundation that all other services depend on.

Access: argocd.<domain> (SSL passthrough) or kubectl port-forward svc/argocd-server -n argo-cd 8080:443.