Commit graph

3 commits

Author SHA1 Message Date
Felix Wolf 88fa8c4df3 fix(traefik): increase read-timeout to avoid crashing ocis for large uploads
Traefik's default readTimeout of 60s killing the upload connection. The cascade was:

  1. Large upload exceeds 60s → Traefik kills connection
  2. storageusers floods with NetworkTimeoutError
  3. Aborted uploads generate tons of NATS events
  4. NATS gets overwhelmed → no response from stream
  5. Proxy can't resolve user roles → login returns 500
2026-04-12 18:49:02 +02:00
Felix Wolf a92c5d8dc2 feat: Add VictoriaMetrics monitoring stack
Adds victoria-metrics-single, grafana, kube-state-metrics, and
node-exporter to the cluster. Enables metrics endpoints on traefik,
argocd, and cert-manager for scraping. Grafana available at
grafana.tr1ceracop.de with VictoriaMetrics as default datasource.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 00:20:23 +02:00
Felix Wolf 6f717a602f feat: Initial setup of GitOps-managed Kubernetes cluster
Configures `myks` for Helm chart rendering with `ytt` overlays to manage cluster applications.
Defines prototypes and environment-specific configurations for core applications including ArgoCD, Traefik, Cert-Manager, and Forgejo.
Adds comprehensive documentation covering cluster setup, GitOps structure, and development environment.
Integrates `direnv` for environment variable management, `gitignore` for file exclusion, and `sops` for secret encryption.
Includes rendered Kubernetes manifests and ArgoCD application resources for initial deployment.
2026-03-30 18:21:05 +02:00