Migrate ~180 LOC of openssl/kubectl init Jobs to declarative Secret manifests reconciled by mittwald/kubernetes-secret-generator (random strings, SSH keypair) and cert-manager Certificates (RSA private key + self-signed CA chain). mittwald only fills empty fields, so existing populated Secrets keep their current values across the migration.

Changes:

- New prototype kubernetes-secret-generator (chart 3.4.1, mittwald helm repo). Cluster-wide informer reconciler, no webhook -> cold-bootstrap safe via ArgoCD retries.
- New cert-manager selfsigned ClusterIssuer (in-cluster trust root). letsencrypt remains for public-DNS endpoints.
- forgejo: admin-secret Job replaced with a mittwald-annotated Secret (hex-encoded 24-char password). Deploy-key Job split: mittwald ssh-keypair Secret + slim Job that uploads pubkey to Forgejo and copies privkey into the argocd repo Secret.
- ocis: 13 Secrets / 16 random fields now mittwald-managed (UUIDs replaced with opaque random hex; ocis treats user-id as opaque). IDP RSA signing key, LDAP self-signed CA, and LDAP server cert produced by cert-manager. Per-Deployment ytt overlay remaps volume key paths (tls.crt -> ldap-ca.crt, tls.key -> private-key.pem, etc.) since the ocis chart mounts Secrets raw without items support. Old multi-secret s3-secret-job replaced with a slim external-secret precheck Job that only validates pre-created Hetzner S3/Storage Box credentials.
- Application sync-wave -10 on cert-manager and kubernetes-secret-generator so they install before consumers. ArgoCD selfHeal handles any residual races.

CLAUDE.md: remove the "all namespaces use privileged PodSecurity" convention. Existing namespaces still carry the label and will be audited separately.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

# k8s-and-chill

## Project Overview

GitOps-managed Kubernetes cluster on Hetzner Cloud running Talos Linux. Uses [myks](https://github.com/mykso/myks) for Helm chart rendering with ytt overlays, targeting ArgoCD for continuous deployment.

## Cluster

- **3 Talos control-plane nodes** (CAX11 ARM64, Hetzner Cloud Nuremberg)
- Node IPs: `195.201.219.111`, `195.201.140.75`, `195.201.219.17`
- `allowSchedulingOnControlPlanes: true` (no dedicated workers)

## Domain & DNS

- **Domain**: `tr1ceracop.de` (registered at INWX)
- **DNS**: Managed at INWX with wildcard A record `*.tr1ceracop.de` pointing to node IPs
- **Forgejo**: `https://git.tr1ceracop.de`
- **ArgoCD**: `https://argocd.tr1ceracop.de`

## Deployed Applications

| App | Namespace | Notes |
|-----|-----------|-------|
| traefik | traefik | Ingress controller, DaemonSet with hostPort 80/443 |
| cert-manager | cert-manager | Let's Encrypt HTTP-01 via ClusterIssuer `letsencrypt` |
| forgejo | forgejo | Git server, SQLite, local-path PVC |
| argocd | argocd | GitOps controller |
| local-path-provisioner | local-path-storage | Default StorageClass, installed via upstream manifest |

## myks Structure

```
prototypes/                          # Application templates (helm values + ytt overlays)
  argocd/
  traefik/
  cert-manager/
  forgejo/
envs/
  env-data.ytt.yaml                  # Global ArgoCD config
  _env/                              # Shared overlays (annotations, secrets)
  production/
    env-data.ytt.yaml                # App list for production
    _apps/{app}/app-data.ytt.yaml    # Per-app overrides
rendered/
  envs/production/{app}/             # kubectl-ready manifests
  argocd/production/                 # ArgoCD Application resources
talos/
  controlplane.yaml                  # Talos machine config
  talosconfig                        # Talos client config
  kubeconfig                         # Cluster kubeconfig
```

### Prototype Pattern

Each prototype follows this structure:

- `app-data.ytt.yaml`: namespace declaration
- `vendir/vendir-data.ytt.yaml`: chart name, version, repository URL
- `vendir/base.ytt.yaml`: vendir config template (identical across all)
- `helm/{chart}.yaml`: Helm values overrides
- `ytt/ns.ytt.yaml`: Namespace resource + namespace overlay on all resources

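The file list above can be checked mechanically before rendering. The following is a minimal sketch, not part of myks: `check_prototype` is a hypothetical helper name, and it assumes every prototype ships at least one `helm/*.yaml` values file, which may not hold for apps installed from a plain upstream manifest.

```bash
# check_prototype DIR: print every file of the prototype pattern that is
# missing from DIR and return non-zero if anything is missing.
# Hypothetical helper for illustration, not part of myks.
check_prototype() {
  local dir="$1" missing=0 f
  for f in app-data.ytt.yaml vendir/vendir-data.ytt.yaml vendir/base.ytt.yaml ytt/ns.ytt.yaml; do
    if [ ! -f "$dir/$f" ]; then
      echo "missing: $f"
      missing=1
    fi
  done
  # helm/{chart}.yaml is named after the chart, so accept any helm/*.yaml file.
  if ! ls "$dir"/helm/*.yaml >/dev/null 2>&1; then
    echo "missing: helm/{chart}.yaml"
    missing=1
  fi
  return "$missing"
}
```

It could be called as `check_prototype prototypes/forgejo` from the repository root; a silent run with exit status 0 means all expected files exist.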
### Key Commands

```bash
myks render                                                       # Render all apps
myks render production <app>                                      # Render single app
kubectl apply -f rendered/envs/production/<app>/ --server-side   # Deploy
```

## Kubeconfig & Talos

- `KUBECONFIG` and `TALOSCONFIG` are already set in the user's shell environment. Do not set them in commands.

## Known Issues / TODOs

- **Namespace race condition**: The first `kubectl apply` of a new app often fails because the namespace isn't ready yet. Re-apply once.
- **Traefik DaemonSet updates**: Require `updateStrategy.rollingUpdate.maxSurge: 0` because hostPort conflicts prevent surge.
- **Forgejo Ingress API version**: The chart renders `extensions/v1beta1`; the `ytt/ingress-fix.ytt.yaml` overlay rewrites it to `networking.k8s.io/v1`.
- **ArgoCD**: Fully wired to Forgejo via App of Apps. The root Application in the `default` project syncs `rendered/argocd/production/`. The deploy key is provisioned automatically by the `argocd-deploy-key-init` Job in the `forgejo` namespace.

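The "re-apply once" workaround for the namespace race can be wrapped in a small helper. This is a sketch only: `retry_once` and the `RETRY_DELAY` variable are hypothetical names introduced here, and the 5-second default is an arbitrary assumption rather than a measured value.

```bash
# retry_once CMD...: run CMD; if it fails, wait briefly and run it exactly
# one more time. The exit status is that of the last attempt.
# RETRY_DELAY (seconds) is a hypothetical knob for this sketch.
retry_once() {
  if "$@"; then
    return 0
  fi
  echo "first attempt failed, retrying: $*" >&2
  sleep "${RETRY_DELAY:-5}"
  "$@"
}

# Usage (<app> is a placeholder, as in the Key Commands section):
#   retry_once kubectl apply -f rendered/envs/production/<app>/ --server-side
```

A single retry is deliberate: if the second apply also fails, the problem is probably not the namespace race and should surface as an error.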
## Container Images

- **Never use bitnami images.** Use `alpine/k8s` or plain `alpine` for utility Jobs instead.

## Secrets

- **Never commit secrets to git.** This is a public repository.
- **All secrets must be generated in-cluster** using init Jobs (ArgoCD PreSync hooks) that create secrets if they don't already exist. See `prototypes/ocis/ytt/s3-secret-job.ytt.yaml` for the pattern.
- **External secrets** (e.g. S3 credentials) that cannot be generated must be created manually in the cluster before deploying. The init Job should validate their existence and fail fast if missing.
- When adding a new application that uses a Helm chart generating secrets, configure all `secretRefs` to point to pre-created secret names and use an init Job to generate them.
- Known external secrets (not in git, created manually):
  - `ocis/ocis-s3-credentials`: Hetzner S3 access key and secret key
  - `ocis/ocis-storagebox-credentials`: Hetzner Storage Box host, user, and SSH private key (for S3 backup to Helsinki)
  - `cert-manager/letsencrypt-account-key`: ACME account key (auto-generated by cert-manager)
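
The fail-fast validation of external secrets described above could look roughly like the following inside an init Job script. This is a hedged sketch, not the contents of the repository's Job: `require_secrets` is a hypothetical function name, and it assumes `kubectl` is on the PATH with credentials that may read Secrets in the listed namespaces.

```bash
# require_secrets NS/NAME...: check that every listed Secret exists.
# Prints one line per missing Secret to stderr and returns non-zero if any
# is missing, so a Job running it fails before the app is deployed.
# Hypothetical helper for illustration.
require_secrets() {
  local ref ns name rc=0
  for ref in "$@"; do
    ns="${ref%%/*}"     # part before the first slash
    name="${ref#*/}"    # part after the first slash
    if ! kubectl get secret "$name" --namespace "$ns" >/dev/null 2>&1; then
      echo "missing external secret: $ref" >&2
      rc=1
    fi
  done
  return "$rc"
}

# Usage, with the manually created Secrets listed above:
#   require_secrets ocis/ocis-s3-credentials ocis/ocis-storagebox-credentials
```

All references are checked before returning, so one run reports every missing Secret instead of stopping at the first.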