fix(sovereign-tls): cilium-gateway propagates Hetzner LB annotations via spec.infrastructure

Closes #1885 (TBD-A31).

Problem (t28 evidence — A98 + A107 reports, 2026-05-19 00:30Z):
`console.t28.omani.works:443` accepts the TCP connection but resets
during the TLS handshake. Inspection:
`kubectl get svc -n kube-system cilium-gateway-cilium-gateway` shows
type=ClusterIP with no Hetzner LB. The tofu-provisioned
`hcloud_load_balancer.main` (infra/hetzner/main.tf:955) carries the
443→30443 service port at the infra layer, but it is invisible to the
cluster-side hcloud-CCM, which therefore has no signal to materialise a
Service-level LB for the auto-generated gateway Service. Operators
inspecting kubectl see a non-LoadBalancer Service and conclude the LB
chain is broken.

Fix:
Add `spec.infrastructure.annotations` to the Gateway resource. The
Gateway-API spec says implementations SHOULD apply these annotations to
any resources they create in response to the Gateway; Cilium 1.16+ does
so for the auto-generated `cilium-gateway-cilium-gateway` Service in
kube-system. hcloud-cloud-controller-manager (bp-hcloud-ccm slot 55)
then picks the annotations up at Service reconcile time and provisions
a Hetzner LB.

Annotations (mirrors clustermesh-apiserver block in 01-cilium.yaml):
  - load-balancer.hetzner.cloud/name = <slug>-<region>-gateway
  - load-balancer.hetzner.cloud/location = <Hetzner DC>
  - load-balancer.hetzner.cloud/type = lb11
  - load-balancer.hetzner.cloud/use-private-ip = "false"  (DoD A2 — public IPs always)
  - load-balancer.hetzner.cloud/disable-private-ingress = "true"
  - load-balancer.hetzner.cloud/health-check-protocol = tcp
  - load-balancer.hetzner.cloud/health-check-port = "30443"
  - load-balancer.hetzner.cloud/health-check-interval = 15s
  - load-balancer.hetzner.cloud/health-check-timeout = 10s
  - load-balancer.hetzner.cloud/health-check-retries = "3"

Per-region segmentation: the LB name embeds SOVEREIGN_FQDN_SLUG and
SOVEREIGN_REGION_KEY so each multi-region peer's cilium-gateway gets its
own public LB (Hetzner LBs are unique by name; duplicate-name
allocations collapse to the first-created instance, hiding the LB for
every subsequent region).
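
The naming scheme above can be sketched in plain shell, because the manifest's `${VAR:=default}` placeholders use bash-style default expansion. This is only an illustration: `render_lb_name` is a hypothetical helper, the `omani`/`secondary` values are made up, and it assumes Flux resolves the defaults the same way bash does.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the repo): resolves the LB name the way
# the `${VAR:=default}` placeholders in the annotation would.
render_lb_name() (
  # The function body is a subshell, so := assignments do not leak out.
  printf '%s-%s-gateway\n' \
    "${SOVEREIGN_FQDN_SLUG:=catalyst}" \
    "${SOVEREIGN_REGION_KEY:=primary}"
)

# Both vars unset: falls back to the manifest defaults.
(unset SOVEREIGN_FQDN_SLUG SOVEREIGN_REGION_KEY; render_lb_name)
# prints: catalyst-primary-gateway

# Hypothetical slug and region key for a secondary peer.
(SOVEREIGN_FQDN_SLUG=omani; SOVEREIGN_REGION_KEY=secondary; render_lb_name)
# prints: omani-secondary-gateway
```

Two peers that resolve to the same slug and region key would render the same name, which is the collision the per-region suffix is meant to avoid.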

Wiring: three substitution vars (SOVEREIGN_FQDN_SLUG, SOVEREIGN_REGION_KEY,
HCLOUD_LB_LOCATION) are threaded into the sovereign-tls Kustomization's
postBuild.substitute block. The same vars are already passed to
bootstrap-kit's Kustomization for the clustermesh-apiserver LB
(apiserver.service.annotations in 01-cilium.yaml), so the gateway LB and
the clustermesh LB are configured through the same boundary.

Memory rules respected:
  - A2 (PUBLIC IPs for inter-region) — use-private-ip=false
  - feedback_overlap_provs_dont_serialize_wait (no provisioning gate)
  - feedback_subagents_inherit_design_system (no new architectural seam,
    reuses existing Gateway-API + hcloud-CCM contracts)

Validation:
  $ kubectl kustomize clusters/_template/sovereign-tls/ | grep -A 30 'kind: Gateway'
  → renders all 10 Hetzner LB annotations under spec.infrastructure
  → ${SOVEREIGN_FQDN_SLUG}, ${SOVEREIGN_REGION_KEY} and ${HCLOUD_LB_LOCATION}
    stay as literal placeholders in this output; Flux substitutes them at
    apply time
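
The "all 10 annotations" check can also be done by counting instead of reading grep output. A minimal sketch, assuming no other resource in the rendered output carries `load-balancer.hetzner.cloud/` keys; `count_hcloud_lb_annotations` is a hypothetical helper and the YAML fragment is a trimmed, made-up sample.

```shell
#!/usr/bin/env bash
# Hypothetical helper: counts Hetzner LB annotation keys in YAML read from stdin.
count_hcloud_lb_annotations() {
  # grep -c exits 1 on zero matches but still prints 0; keep the exit status clean.
  grep -c '^[[:space:]]*load-balancer\.hetzner\.cloud/' || true
}

# Self-contained demo on a trimmed, made-up fragment (two of the ten keys).
count_hcloud_lb_annotations <<'EOF'
kind: Gateway
spec:
  infrastructure:
    annotations:
      load-balancer.hetzner.cloud/type: "lb11"
      load-balancer.hetzner.cloud/use-private-ip: "false"
EOF
# prints: 2
```

Piping `kubectl kustomize clusters/_template/sovereign-tls/` into the helper should then print 10 under the assumption stated above.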

Acceptance criteria (per issue):
  - kubectl get svc -n kube-system cilium-gateway-cilium-gateway shows
    type=LoadBalancer with external IP (after fresh prov + handover)
  - curl -skI https://console.<fqdn>/ returns HTTP 200
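
The first acceptance criterion can be expressed as a small predicate. This is a sketch, not the project's actual check: `gateway_svc_ready` is a hypothetical helper, and `203.0.113.10` is a documentation-range placeholder IP. The commented `kubectl` and `curl` lines show how the inputs could be gathered from a live cluster.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds only for a LoadBalancer Service that has
# been assigned an external IP.
gateway_svc_ready() {
  local svc_type=$1
  local external_ip=$2
  [ "$svc_type" = "LoadBalancer" ] && [ -n "$external_ip" ]
}

# Demo with placeholder values.
gateway_svc_ready LoadBalancer 203.0.113.10 && echo "ready"
# prints: ready
gateway_svc_ready ClusterIP "" || echo "not ready"
# prints: not ready

# Against a live cluster (not run here):
# svc_type=$(kubectl get svc -n kube-system cilium-gateway-cilium-gateway \
#   -o jsonpath='{.spec.type}')
# external_ip=$(kubectl get svc -n kube-system cilium-gateway-cilium-gateway \
#   -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
# gateway_svc_ready "$svc_type" "$external_ip"
# curl -sk -o /dev/null -w '%{http_code}\n' "https://console.<fqdn>/"
```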

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
hatiyildiz 2026-05-19 02:49:11 +02:00
parent ab6f3e6510
commit 16f7284116
2 changed files with 77 additions and 0 deletions

@@ -88,6 +88,61 @@ metadata:
catalyst.openova.io/component: cilium-gateway
spec:
  gatewayClassName: cilium
# ── TBD-A31 (#1885): Hetzner LB annotations for the gateway Service ──
#
# The Gateway-API `spec.infrastructure.annotations` field is the canonical
# mechanism for declaring annotations that the controller SHOULD apply
# to any resources it creates in response to this Gateway. In Cilium's
# case that is the auto-generated `cilium-gateway-cilium-gateway`
# Service in kube-system. Cilium 1.16+ honours this block and forwards
# the annotations onto the Service `metadata.annotations`, where
# hcloud-cloud-controller-manager (bp-hcloud-ccm slot 55) picks them up
# at Service reconcile time and provisions a Hetzner LB.
#
# Why this matters operationally:
# - A98+A107 evidence on t28 (76fdffb42532e6cc): the gateway Service
# showed `type=ClusterIP` with no Hetzner LB attached → public TLS
# to console.t28.omani.works:443 reset at the handshake. Even with
# the tofu-provisioned `hcloud_load_balancer.main` (infra/hetzner/
# main.tf:955) carrying 443→30443 service-port, operators inspecting
# `kubectl get svc -n kube-system cilium-gateway-cilium-gateway`
# saw a non-LoadBalancer Service and concluded the LB chain was
# broken. Without these annotations, hcloud-CCM has no signal to
# materialise a parallel Service-level LB (the tofu LB at the
# infra layer is invisible to the cluster-side CCM).
# - For multi-region Sovereigns the per-region cilium-gateway in each
# secondary cluster ALSO needs a public LB so external clients can
# reach region-local listeners directly (the omani.homes / omani.rest
# SME-pool subdomains attach to the secondary region's gateway).
# `${SOVEREIGN_REGION_KEY:=primary}` segments the LB name per region
# (mirrors the clustermesh-apiserver LB naming in
# clusters/_template/bootstrap-kit/01-cilium.yaml:237).
#
# use-private-ip is "false" for two reasons: docs/SOVEREIGN-MULTI-REGION-DOD.md
# A2 (inter-region link = PUBLIC IPs ALWAYS) and the empirical lesson from
# PR #1538: the Hetzner per-region LB has no private-network attachment
# by default, so CCM rejects a private-IP LB with the error
# `use private ip: missing network id`. The firewall already opens
# 30000-32767/tcp (infra/hetzner/main.tf:233), so the public-IP LB
# health checks pass against node:30443.
#
# health-check pinned to TCP:30443: without these annotations, hcloud-CCM
# defaults the health check to the Service's nodePort (which Cilium
# allocates randomly when hostNetwork=true). Pinning to 30443 (the
# actual host-bound cilium-envoy HTTPS listener) ensures the CCM LB
# marks targets healthy as soon as envoy is listening. On prov #76
# (2026-05-14) the unpinned LB stayed `unhealthy` indefinitely.
  infrastructure:
    annotations:
      load-balancer.hetzner.cloud/name: "${SOVEREIGN_FQDN_SLUG:=catalyst}-${SOVEREIGN_REGION_KEY:=primary}-gateway"
      load-balancer.hetzner.cloud/location: "${HCLOUD_LB_LOCATION}"
      load-balancer.hetzner.cloud/type: "lb11"
      load-balancer.hetzner.cloud/use-private-ip: "false"
      load-balancer.hetzner.cloud/disable-private-ingress: "true"
      load-balancer.hetzner.cloud/health-check-protocol: "tcp"
      load-balancer.hetzner.cloud/health-check-port: "30443"
      load-balancer.hetzner.cloud/health-check-interval: "15s"
      load-balancer.hetzner.cloud/health-check-timeout: "10s"
      load-balancer.hetzner.cloud/health-check-retries: "3"
# NOTE: ports 30080/30443 (not 80/443) — even with hostNetwork=true,
# cilium-envoy refuses to bind privileged ports because cilium-agent
# gates that bind through its `envoy-keep-cap-netbindservice` flag and

@@ -1256,6 +1256,28 @@ write_files:
# (no 5/168h limit); default → PROD. Locals in main.tf
# render the final string so this template stays declarative.
WILDCARD_CERT_ISSUER: "${wildcard_cert_issuer}"
# TBD-A31 (#1885) — Hetzner LB annotations on cilium-gateway
# Gateway resource (spec.infrastructure.annotations). These
# substitute vars name and locate the LB hcloud-CCM provisions
# for the auto-generated `cilium-gateway-cilium-gateway`
# Service in kube-system. Mirrors the same 3 vars threaded
# into the bootstrap-kit Kustomization for the clustermesh-
# apiserver LB (see 01-cilium.yaml apiserver.service.annotations).
# - SOVEREIGN_FQDN_SLUG: short, DNS-safe Sovereign identifier
# used as the LB name prefix so operators can spot the
# gateway LB in the Hetzner Console.
# - SOVEREIGN_REGION_KEY: per-region suffix so each
# multi-region peer's cilium-gateway gets its own LB
# (Hetzner LBs are unique by name — duplicates collapse to
# the first-created instance, hiding the LB for every
# subsequent region).
# - HCLOUD_LB_LOCATION: Hetzner datacenter location for the
# LB. Per-region rendered (primary CP renders var.region,
# secondary CPs render each.value.cloudRegion) so the LB
# and its backend node are co-located.
SOVEREIGN_FQDN_SLUG: "${sovereign_fqdn_slug}"
SOVEREIGN_REGION_KEY: "${sovereign_region_key}"
HCLOUD_LB_LOCATION: "${region}"
---
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization