Commit Graph

8 Commits

Author SHA1 Message Date
e3mrah
83ec889f06
feat(platform): add global.imageRegistry to remaining bp-* charts + bp-catalyst-platform (PR 3/3, #560) (#580)
Charts bumped:
- bp-keycloak 1.2.0 -> 1.2.1 (subchart stub; per-component image.registry knobs documented)
- bp-crossplane 1.1.3 -> 1.1.4 (subchart stub)
- bp-crossplane-claims 1.1.0 -> 1.1.1 (global.kubectlImage added; kubectl Job image templated; Hetzner ubuntu-24.04 server images intentionally untouched)
- bp-velero 1.2.0 -> 1.2.1 (subchart stub)
- bp-kyverno 1.0.0 -> 1.0.1 (subchart stub; per-controller image.registry knobs documented)
- bp-trivy 1.0.0 -> 1.0.1 (subchart stub; both operator + scanner image.registry knobs documented)
- bp-grafana 1.0.0 -> 1.0.1 (subchart stub)
- bp-flux 1.1.3 -> 1.1.4 (subchart stub; per-controller image.repository knobs documented)
- bp-catalyst-platform 1.1.13 -> 1.1.14 (global.imageRegistry + images.{catalystApi,catalystUi,marketplaceApi,console,smeTag} added; all 14 Catalyst-authored image refs templated: catalyst-api, catalyst-ui, marketplace-api, console + 10 SME services)

Post-handover per-Sovereign overlays set global.imageRegistry to harbor.<sovereign-fqdn> so every container image pull routes through the Sovereign's own Harbor proxy_cache.

Closes (partial): issue #560 — all 23 bp-* charts now carry global.imageRegistry

Co-authored-by: alierenbaysal <alierenbaysal@openova.io>
2026-05-02 13:21:53 +04:00
e3mrah
a1bd550208
fix(charts): HTTPRoute templates skip-render on missing host (was failing default-values render) (#402)
Blueprint-release for #401 failed because HTTPRoute templates use
{{- fail }} when gateway.host is not set, which trips the chart default-values
render gate in CI. Switched 6 templates from 'fail loud' to 'skip render':

  if .Values.gateway.host  →  emit HTTPRoute
  else                     →  emit nothing

The Gateway API admission already rejects HTTPRoute with empty hostnames,
so the loud-fail wasn't buying anything an operator wouldn't see at apply
time. Default-values render now produces zero HTTPRoute resources, which
is the correct shape for the upstream chart consumers that don't set
the Sovereign-only gateway block.

Files: keycloak, gitea, openbao, grafana, harbor, catalyst-platform.

Verified:
  helm template t products/catalyst/chart/ → 0 HTTPRoutes (clean)
  helm template t products/catalyst/chart/ --set ingress.gateway.enabled=true --set ingress.hosts.console.host=console.test --set ingress.hosts.api.host=api.test → 2 HTTPRoutes

Closes the blueprint-release failure on commit abf01b6f.

Co-authored-by: hatiyildiz <hati@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 16:23:58 +04:00
e3mrah
abf01b6f21
feat(platform): Gateway API migration audit (#387) (#401)
Migrates every minimal-Sovereign-set blueprint chart from
networking.k8s.io/v1.Ingress to gateway.networking.k8s.io/v1.HTTPRoute,
replacing the legacy Traefik-on-Sovereigns assumption with the canonical
Cilium + Envoy + Gateway API path per ADR-0001 §9.4 and the WBS §2
correction note (#388).

The single per-Sovereign Gateway is added as additional documents in
the existing bootstrap-kit slot clusters/_template/bootstrap-kit/01-cilium.yaml
(NOT a new top-level slot), since Cilium owns the GatewayClass. It
includes:

  - Certificate `sovereign-wildcard-tls` requesting `*.${SOVEREIGN_FQDN}`
    from `letsencrypt-dns01-prod` (cert-manager + #373 webhook)
  - Gateway `cilium-gateway` in `kube-system` with HTTPS (443, TLS
    terminate) + HTTP (80) listeners, allowedRoutes.namespaces.from=All

Per-blueprint HTTPRoute templates (canonical seam: each wrapper chart's
existing `templates/` directory):

  | Blueprint           | Host pattern                    | Backend port |
  |---------------------|---------------------------------|--------------|
  | bp-keycloak         | auth.<sov>                      | 80           |
  | bp-gitea            | git.<sov>                       | 3000         |
  | bp-openbao          | bao.<sov>                       | 8200         |
  | bp-grafana          | grafana.<sov>                   | 80           |
  | bp-harbor           | registry.<sov>                  | 80           |
  | bp-powerdns         | pdns.<sov>/api  (dual-mode)     | 8081         |
  | bp-catalyst-platform| console.<sov>, api.<sov>         | 80, 8080     |

bp-powerdns supports both Ingress (contabo legacy) and HTTPRoute
(Sovereign) simultaneously — the per-Sovereign overlay sets
`api.gateway.enabled=true` while leaving `api.enabled=true`. The
Ingress object is harmless on Cilium clusters with no Traefik. This
preserves contabo's existing pdns.openova.io flow per ADR-0001 §9.4.

bp-harbor flips `expose.type` from `ingress` to `clusterIP` in
platform/harbor/chart/values.yaml so the upstream chart no longer
emits its own Ingress; the HTTPRoute is the sole HTTP exposure.
TLS terminates at the Gateway (wildcard cert) rather than per-host
Certificates inside the chart.

bp-catalyst-platform's `templates/httproute.yaml` is NOT excluded by
.helmignore (unlike templates/ingress.yaml + templates/ingress-console-tls.yaml,
which remain contabo-only legacy demo infra). The contabo path keeps
serving console.openova.io/sovereign via Traefik unchanged.

Bootstrap-kit slot updates (per-Sovereign hostname interpolation):

  - 08-openbao.yaml      → gateway.host: bao.${SOVEREIGN_FQDN}
  - 09-keycloak.yaml     → gateway.host: auth.${SOVEREIGN_FQDN}
  - 10-gitea.yaml        → gateway.host: gitea.${SOVEREIGN_FQDN}
  - 11-powerdns.yaml     → api.host: pdns.${SOVEREIGN_FQDN}, api.gateway.enabled: true
  - 19-harbor.yaml       → gateway.host: registry.${SOVEREIGN_FQDN}
  - 25-grafana.yaml      → gateway.host: grafana.${SOVEREIGN_FQDN}

Server-side dry-run validation against the live Cilium Gateway API
CRDs on contabo: every HTTPRoute and the per-Sovereign Gateway
+ Certificate apply cleanly via `kubectl apply --dry-run=server`.

Contabo unaffected: clusters/contabo-mkt/* not modified. The legacy
SME ingresses (console-nova, marketplace, admin, axon, talentmesh,
stalwart, ...) continue to serve via Traefik as before. powerdns
on contabo remains on the Ingress path (api.gateway.enabled defaults
to false at the chart level).

Closes #387.

Co-authored-by: hatiyildiz <269457768+hatiyildiz@users.noreply.github.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 16:19:30 +04:00
e3mrah
75128781b3
feat(platform): observability stack umbrellas (grafana/loki/mimir/tempo/alloy/otel/langfuse/velero) (#214)
* feat(bp-grafana): umbrella chart for observability stack

Catalyst Blueprint umbrella for Grafana — visualization layer of the
LGTM observability stack (Loki/Grafana/Tempo/Mimir).

Pinned to grafana/grafana 10.5.15 (appVersion 12.3.1) — current stable
on 2026-04-29. Solo-Sovereign defaults: 1 replica, 10Gi PVC,
ServiceMonitor disabled per BLUEPRINT-AUTHORING.md §11.2.

Part of issue #204 observability-stack umbrellas batch.

* feat(bp-loki): umbrella chart for observability stack

Catalyst Blueprint umbrella for Grafana Loki — log aggregation backend
of the LGTM stack. SingleBinary mode by default (solo-Sovereign min);
SimpleScalable/Distributed are values toggles.

Pinned to grafana/loki 7.0.0 (appVersion 3.6.7) on 2026-04-29.
Filesystem storage default; SeaweedFS S3 wiring is per-Sovereign overlay
when scaling out. All observability toggles default false per
BLUEPRINT-AUTHORING.md §11.2.

Part of issue #204 observability-stack umbrellas batch.

* feat(bp-mimir): umbrella chart for observability stack

Catalyst Blueprint umbrella for Grafana Mimir — metrics storage tier of
the LGTM stack.

Pinned to grafana/mimir-distributed 6.0.6 (appVersion 3.0.4) on
2026-04-29. Solo-Sovereign defaults: every component scaled to 1
replica, zoneAwareReplication disabled, Kafka ingest-storage disabled.
Bundled MinIO kept enabled as a stop-gap so the chart renders;
SeaweedFS S3 wiring is per-Sovereign overlay. All metaMonitoring
toggles default false per BLUEPRINT-AUTHORING.md §11.2.

Part of issue #204 observability-stack umbrellas batch.

* feat(bp-tempo): umbrella chart for observability stack

Catalyst Blueprint umbrella for Grafana Tempo — distributed tracing
backend of the LGTM stack. Single-binary mode by default
(solo-Sovereign min); microservice mode (tempo-distributed) is a chart
swap toggle.

Pinned to grafana/tempo 1.24.4 (appVersion 2.9.0) on 2026-04-29. Local
PVC storage default; SeaweedFS S3 wiring is per-Sovereign overlay.
Metrics generator disabled by default (depends on bp-mimir).
ServiceMonitor default false per BLUEPRINT-AUTHORING.md §11.2.

Part of issue #204 observability-stack umbrellas batch.

* feat(bp-alloy): umbrella chart for observability stack

Catalyst Blueprint umbrella for Grafana Alloy — unified telemetry
collector for the LGTM stack (logs, metrics, traces; OTLP-native).

Pinned to grafana/alloy 1.8.0 (appVersion v1.16.0) on 2026-04-29.
DaemonSet controller default (one Alloy per node) so node + container
telemetry work out of the box. Empty Alloy config by default;
per-Sovereign overlays populate forwarders to bp-loki/bp-mimir/bp-tempo
once those reconcile. ServiceMonitor + ingress + CRDs default false per
BLUEPRINT-AUTHORING.md §11.2.

Part of issue #204 observability-stack umbrellas batch.

* feat(bp-opentelemetry): umbrella chart for observability stack

Catalyst Blueprint umbrella for the OpenTelemetry Collector — vendor-
neutral telemetry collector. Sibling to bp-alloy; per-Sovereign overlays
choose one.

Pinned to open-telemetry/opentelemetry-collector 0.152.0 (appVersion
0.150.1) on 2026-04-29. Uses the contrib distribution
(otel/opentelemetry-collector-contrib:0.150.1) so Loki/Mimir/Tempo
exporters are bundled. Deployment mode default (1 replica); DaemonSet
+ StatefulSet are values toggles. All presets default false; ingress
+ ServiceMonitor + PodMonitor + PrometheusRule + NetworkPolicy default
false per BLUEPRINT-AUTHORING.md §11.2.

Part of issue #204 observability-stack umbrellas batch.

* feat(bp-langfuse): umbrella chart for observability stack

Catalyst Blueprint umbrella for Langfuse — LLM observability platform.
Complements bp-grafana (infrastructure metrics) with AI-specific
telemetry (traces, evaluations, prompts, cost attribution).

Pinned to langfuse/langfuse 1.5.28 (appVersion 3.171.0) on 2026-04-29.

Catalyst convention: ALL bundled Bitnami subcharts are disabled —
PostgreSQL via cnpg.io/Cluster (bp-cnpg), Redis via bp-valkey,
ClickHouse via bp-clickhouse, S3 via bp-seaweedfs. Per-Sovereign
overlays wire external endpoints + Secret references. Telemetry to
Langfuse Inc. defaulted false; signUpDisabled defaulted true.

Part of issue #204 observability-stack umbrellas batch.

* feat(bp-velero): umbrella chart for observability stack

Catalyst Blueprint umbrella for Velero — Kubernetes-native backup and
disaster recovery. Per platform/velero/README.md, ALL Velero output
goes to SeaweedFS (Catalyst's unified S3 encapsulation), which
transitions to a cloud archival backend on the cold tier.

Pinned to vmware-tanzu/velero 12.0.1 (appVersion 1.18.0) on 2026-04-29.
Bundled velero-plugin-for-aws:v1.14.0 init container so SeaweedFS S3 is
reachable. backupsEnabled/snapshotsEnabled defaulted false at this
layer (placeholders for backupStorageLocation); per-Sovereign overlays
flip on after wiring SeaweedFS endpoint + credentials. ServiceMonitor +
PodMonitor + PrometheusRule default false per BLUEPRINT-AUTHORING.md
§11.2.

Part of issue #204 observability-stack umbrellas batch.

---------

Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
2026-04-29 22:11:04 +02:00
hatiyildiz
7cafa3c894 docs(seaweedfs+guacamole): replace MinIO with SeaweedFS as unified S3 encapsulation; add Guacamole to bp-relay
Component-level architectural correction (two changes):

1. MinIO → SeaweedFS as unified S3 encapsulation layer

The old design used MinIO for in-cluster S3 plus separate cold-tier configuration scattered across consumers. The new design positions SeaweedFS as the single S3 encapsulation layer: every Catalyst component talks to one endpoint (seaweedfs.storage.svc:8333). SeaweedFS internally handles hot tier (in-cluster NVMe), warm tier (in-cluster bulk), and cold tier (transparent passthrough to cloud archival storage — Cloudflare R2 / AWS S3 / Hetzner Object Storage / etc., chosen at Sovereign provisioning). One audit/lifecycle/encryption boundary instead of N. No Catalyst component talks to cloud S3 directly anymore — Velero, CNPG WAL archive, OpenSearch snapshots, Loki/Mimir/Tempo, Iceberg, Harbor blob store, Application buckets all share one S3 surface.

2. Apache Guacamole added as Application Blueprint §4.5 Communication

Clientless browser-based RDP/VNC/SSH/kubectl-exec gateway. Keycloak SSO, full session recording to SeaweedFS for compliance evidence (PSD2/DORA/SOX). Composed into bp-relay. Replaces VPN+native-client distribution for auditable remote access.

Component changes:
- DELETED: platform/minio/
- CREATED: platform/seaweedfs/README.md (unified S3 + cold-tier encapsulation; bucket layout; multi-region replication via shared cold backend; migration-from-MinIO section)
- CREATED: platform/guacamole/README.md (clientless remote-desktop gateway; GuacamoleConnection CRD; compliance integration via session recordings)

Doc updates: PLATFORM-TECH-STACK §1+§3.5+§4.5+§5+§7.4; TECHNOLOGY-FORECAST L11+mandatory+a-la-carte counts (52 → 53); ARCHITECTURE §3 topology; SECURITY §4 DB engines; SOVEREIGN-PROVISIONING §1 inputs; SRE §2.5+§7; IMPLEMENTATION-STATUS §3; BLUEPRINT-AUTHORING stateful examples; BUSINESS-STRATEGY 13 component-count anchors + Relay product line; README.md backup row; CLAUDE.md folder count.

Component README updates (S3 endpoint + dependency renames): cnpg, clickhouse, flink, gitea, iceberg, harbor, grafana, livekit, kserve, milvus, opensearch, flux, stalwart, velero (substantive rewrite of velero — now writes exclusively to SeaweedFS with cold-tier auto-routing). Products: relay, fabric.

UI scaffold: products/catalyst/bootstrap/ui/src/shared/constants/components.ts — minio entry replaced with seaweedfs; velero+harbor deps updated; new guacamole entry added.

VALIDATION-LOG entry "Pass 104 — MinIO → SeaweedFS swap + Guacamole add" captures the encapsulation principle and adds Lesson #22: storage tier policy belongs at the encapsulation boundary, not inside every consumer.

Verification: zero remaining MinIO references in canonical docs (one intentional retention in TECHNOLOGY-FORECAST L37 explaining the swap); 53 platform/ folders matching all "53 components" anchors; bp-relay composition includes guacamole.
2026-04-28 10:23:46 +02:00
hatiyildiz
ea81c38e15 docs(pass-9): role-in-Catalyst banners on grafana / harbor / falco / kyverno / sigstore / syft-grype
Pass 9 — six more component READMEs got Catalyst-role banners
matching the rule of thumb in CLAUDE.md (every platform/<x>/README.md
should state its role in Catalyst).

- grafana: observability stack on every host cluster; Catalyst's
  own self-monitoring + Application telemetry flows here.
- harbor: per-host-cluster container registry for Catalyst images,
  mirrored Blueprint OCI artifacts, customer images.
- falco: runtime security on every host cluster; feeds SIEM/SOAR.
- kyverno: policy engine on every host cluster; enforces Catalyst
  policy contracts (cosign on Blueprints, default-deny NetworkPolicies
  on Organization namespaces, priority-class injection).
- sigstore: cosign-signed Blueprint OCI artifacts + admission
  verification chain on every host cluster.
- syft-grype: SBOM generation in CI per Blueprint + runtime CVE scans.

Plus Kyverno priority-class clarification: prose around `tenant-high`
/ `tenant-default` / `tenant-batch` priority class names now reads
"Organization workloads" instead of "tenant workloads", with an
explicit note that the priority class artifact names themselves stay
as-is until a separate migration ticket renames them in deployed
clusters (renaming PriorityClass objects requires recreate, not
in-place rename).

VALIDATION-LOG: Pass 9 entry added.

Refs #37
2026-04-27 21:40:51 +02:00
hatiyildiz
119a1e53a0 docs(components): terminology pass across platform and product READMEs
Bring per-component READMEs in line with the canonical glossary
(docs/GLOSSARY.md). Substantive architectural content unchanged —
this is a terminology + reference correctness pass.

Placeholder rename: <tenant> → <org> in YAML / IaC examples across
- platform/cnpg/README.md           (Cluster + Pooler + ScheduledBackup)
- platform/debezium/README.md       (PostgreSQL connector + topic patterns)
- platform/external-secrets/README.md (ExternalSecret / SecretStore)
- platform/grafana/README.md        (Instrumentation namespace)
- platform/k8gb/README.md           (Gslb + namespace + kubectl examples)
- platform/keda/README.md           (ScaledObject + Kafka triggers + Prometheus)
- platform/opentofu/README.md       (server resource example)
- platform/velero/README.md         (BackupStorageLocation buckets)
- platform/vpa/README.md            (VerticalPodAutoscaler examples)
- platform/flux/README.md           (kustomization name + tenants/ → organizations/)

"Catalyst IDP" → "Catalyst console":
- platform/crossplane/README.md     (integration section retitled and
                                      rewritten — Crossplane is platform
                                      plumbing, not user-facing)
- platform/gitea/README.md          (architecture diagram + integration table)
- platform/kyverno/README.md        (rollout tracking surface)
- products/fingate/README.md        (TPP onboarding portal)

"Bootstrap wizard" → "Catalyst bootstrap":
- platform/openbao/README.md        (bootstrap procedure rewritten —
                                      independent Raft per region clarified;
                                      cross-references docs/SECURITY.md §5)
- platform/opentofu/README.md       (Quick Start)

Kyverno labels & prose:
- openova.io/tenant → openova.io/organization (label rename for
  consistency; deployed clusters will add new label as a co-label
  during migration window)
- "tenant labels" / "tenant namespace" prose updated to
  "Organization labels" / "Organization-labeled namespace"
- Priority class names (tenant-high, tenant-default, tenant-batch)
  retained as deployed artifact names — rename pending in a
  separate migration ticket

No banned-term hits remain in component READMEs (verified by grep
in docs/GLOSSARY.md banned-terms table).

Refs #37
2026-04-27 20:06:51 +02:00
talent-mesh
c9d04a53b4 refactor: flatten platform/ structure (41 components)
Remove hierarchical grouping (networking/, security/, etc.) and use flat
structure for all 41 platform components.

Changes:
- All components now directly under platform/ (no subfolders)
- AI Hub components moved from meta-platforms/ai-hub/components/ to platform/
- Open Banking components (lago, openmeter) moved to platform/
- meta-platforms/ now only contains README files that reference platform/
- Open Banking custom services remain in meta-platforms/open-banking/services/

Structure:
- platform/ (41 components, flat)
- meta-platforms/ai-hub/ (README only, references platform/)
- meta-platforms/open-banking/ (README + 6 custom services)

All documentation links updated.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-08 15:19:48 +00:00