Commit graph

4 commits

Author SHA1 Message Date
Felix Wolf 9112153e8a fix(ocis): resolve large file upload timeouts and enable stale upload cleanup
Increase Traefik readTimeout from 600s to 3600s to prevent connection drops during large uploads, and enable the suspended cleanUpExpiredUploads CronJob so stale TUS sessions are automatically purged.
2026-04-24 20:12:24 +02:00
Felix Wolf 88fa8c4df3 fix(traefik): increase read-timeout to avoid crashing ocis for large uploads
Traefik's default readTimeout of 60s killing the upload connection. The cascade was:

  1. Large upload exceeds 60s → Traefik kills connection
  2. storageusers floods with NetworkTimeoutError
  3. Aborted uploads generate tons of NATS events
  4. NATS gets overwhelmed → no response from stream
  5. Proxy can't resolve user roles → login returns 500
2026-04-12 18:49:02 +02:00
Felix Wolf a92c5d8dc2 feat: Add VictoriaMetrics monitoring stack
Adds victoria-metrics-single, grafana, kube-state-metrics, and
node-exporter to the cluster. Enables metrics endpoints on traefik,
argocd, and cert-manager for scraping. Grafana available at
grafana.tr1ceracop.de with VictoriaMetrics as default datasource.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 00:20:23 +02:00
Felix Wolf 6f717a602f feat: Initial setup of GitOps-managed Kubernetes cluster
Configures `myks` for Helm chart rendering with `ytt` overlays to manage cluster applications.
Defines prototypes and environment-specific configurations for core applications including ArgoCD, Traefik, Cert-Manager, and Forgejo.
Adds comprehensive documentation covering cluster setup, GitOps structure, and development environment.
Integrates `direnv` for environment variable management, `gitignore` for file exclusion, and `sops` for secret encryption.
Includes rendered Kubernetes manifests and ArgoCD application resources for initial deployment.
2026-03-30 18:21:05 +02:00