Edge + serverless + model-serving batch (W2.5.C): three upstream-subchart
umbrella Blueprints completing the bootstrap-kit slots for WebRTC media
relay (bp-relay → bp-stunner) and the AI/ML serving stack (bp-cortex →
bp-kserve → bp-knative).

Each chart follows the canonical umbrella pattern from
docs/BLUEPRINT-AUTHORING.md §11.1: Chart.yaml declares the upstream chart
under `dependencies:` so `helm dependency build` bundles the upstream
payload into the OCI artifact, and Catalyst-curated overlay values +
templates sit alongside in chart/values.yaml + chart/templates/.

Per-chart highlights:

- bp-stunner/1.0.0: wraps stunner/stunner-gateway-operator 1.1.0. Ships a
  Cilium-native GatewayClass (Capabilities-gated on
  gateway.networking.k8s.io/v1) so bp-relay (LiveKit / SFU) can claim
  Gateway CRs without an operator-ordering dance. Default UDP TURN port
  range 30000-32767 matches the range opened at the Sovereign edge
  firewall (Crossplane bp-firewall composition).
- bp-knative/1.0.0: wraps knative-operator v1.21.1. Ships a KnativeServing
  CR pre-configured for **istio-less mode** (ingress.istio.enabled=false,
  ingress.contour.enabled=false, ingress.kourier.enabled=false;
  config.network.ingress-class=cilium). Sovereign FQDN sourced from
  values, no hardcoded fallback per inviolable principle #4: render fails
  loudly if cluster overlay doesn't set
  knativeOverlay.knativeServing.sovereignFqdn.
- bp-kserve/1.0.0: wraps kserve/kserve v0.16.0 (latest version published
  on the official OCI registry as of 2026-04-30). Default
  deploymentMode=RawDeployment (no Knative hop on the hot path) but
  bp-knative is still installed (declared as a hard dep) so per-IS
  annotation `serving.kserve.io/deploymentMode: Serverless` opts in to
  scale-to-zero per tenant. Cilium native Gateway-API ingress
  (enableGatewayApi=true, className=cilium,
  disableIstioVirtualHost=true).

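The per-InferenceService opt-in described for bp-kserve can be pictured as a manifest like the one below. This is a minimal sketch, not taken from the charts: the resource name, namespace, model format, and storage URI are hypothetical placeholders, and only the `serving.kserve.io/deploymentMode` annotation comes from the text above.

```yaml
# Hypothetical InferenceService that opts in to Serverless (scale-to-zero)
# while the cluster-wide default stays RawDeployment.
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: example-model            # hypothetical name
  namespace: tenant-a            # hypothetical tenant namespace
  annotations:
    # Annotation named in the commit message; overrides the default mode
    # for this InferenceService only.
    serving.kserve.io/deploymentMode: Serverless
spec:
  predictor:
    model:
      modelFormat:
        name: sklearn            # illustrative choice of framework
      storageUri: s3://example-bucket/models/example   # hypothetical location
```

Omitting the annotation would leave the InferenceService on the chart's default mode.
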
Observability discipline (issue #182): every observability toggle
(ServiceMonitor, HPA, GatewayClass) defaults false and is operator-tunable
via per-cluster overlay once bp-kube-prometheus-stack reconciles. Each
chart ships tests/observability-toggle.sh covering default-off, opt-in
(with `--api-versions monitoring.coreos.com/v1` to simulate Prometheus
Operator CRDs), and explicit-off cases.

Per-chart kind summary (helm template default render):

  bp-stunner: ClusterRole, ClusterRoleBinding, ConfigMap, Dataplane,
    Deployment, Role, RoleBinding, Service, ServiceAccount.
    (+ GatewayClass when --api-versions gateway.networking.k8s.io/v1
    is passed.)
  bp-knative: ClusterRole, ClusterRoleBinding, ConfigMap,
    CustomResourceDefinition, Deployment, KnativeServing, Role,
    RoleBinding, Secret, Service, ServiceAccount.
  bp-kserve: Certificate, ClusterRole, ClusterRoleBinding,
    ClusterServingRuntime, ClusterStorageContainer, ConfigMap,
    Deployment, Gateway, Issuer, MutatingWebhookConfiguration, Role,
    RoleBinding, Service, ServiceAccount,
    ValidatingWebhookConfiguration.

`helm lint` clean for all three (single INFO on missing icon; icons land
with marketplace card work).
`bash tests/observability-toggle.sh` green for all three (3 cases each:
default-off, opt-in, explicit-off).

Closes #263 #264 #265

Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
78 lines
2.7 KiB
YAML
apiVersion: catalyst.openova.io/v1alpha1
kind: Blueprint
metadata:
  name: bp-kserve
  labels:
    catalyst.openova.io/section: pts-4-6-ai-ml
spec:
  version: 1.0.0
  card:
    title: KServe
    summary: |
      Kubernetes-native model serving: InferenceService CRD, multi-
      framework predictors (vLLM, TorchServe, Triton, SKLearn), inference
      graphs. Wraps the upstream `kserve/kserve` chart and ships a
      Catalyst-curated `RawDeployment` mode (no per-InferenceService
      Knative Service hop by default; KServe writes Deployments
      directly), with **Knative still installed** (via bp-knative) for
      Application-tier Blueprints that opt in to scale-to-zero on a per-
      InferenceService basis. Used by bp-cortex (composite AI Hub
      Blueprint).
    icon: kserve.svg
    category: ai-ml
  visibility: unlisted  # bootstrap-kit infrastructure component
  configSchema:
    type: object
    properties:
      deploymentMode:
        type: string
        enum: [RawDeployment, Serverless]
        default: RawDeployment
        description: |
          KServe deployment mode. RawDeployment writes plain
          Deployment+Service+HPA per InferenceService (no Knative hop).
          Serverless writes a Knative Service per InferenceService
          (scale-to-zero). Catalyst defaults to RawDeployment but
          bp-knative is still installed so per-InferenceService overrides
          via annotation can opt in to Serverless without infra changes.
      ingressClass:
        type: string
        default: cilium
        description: |
          KServe ingress class. Catalyst defaults to Cilium native
          Gateway-API (istio-less).
      controllerReplicas:
        type: integer
        default: 1
        minimum: 1
        maximum: 5
        description: |
          KServe controller-manager replicas. Solo Sovereign = 1;
          multi-AZ overlays bump for HA.
      crds:
        type: object
        properties:
          create:
            type: boolean
            default: true
            description: |
              Install serving.kserve.io CRDs as part of this chart.
              CRDs ship with the upstream chart; consumers gated via
              Flux dependsOn.
  placementSchema:
    modes: [single-region, active-active]
    default: single-region
    minRegions: 1
    maxRegions: 5
  manifests:
    chart: ./chart
  depends:
    - blueprint: bp-cilium        # Cilium native Gateway-API ingress
      version: ^1
    - blueprint: bp-cert-manager  # KServe webhook TLS
      version: ^1
    - blueprint: bp-knative       # Knative still installed for per-IS Serverless opt-in
      version: ^1
  upgrades:
    from: ["0.x"]
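
A per-cluster overlay that stays within the `configSchema` above might look like the following. This is a sketch under assumptions: the manifest does not say how these values are delivered to the Blueprint, so the fragment shows only the documented keys as a flat values map, and the chosen values (three controller replicas for a multi-AZ cluster) are illustrative.

```yaml
# Illustrative values for a multi-AZ Sovereign. Key names come from
# configSchema; the delivery mechanism is not shown because the source
# does not specify it.
deploymentMode: RawDeployment   # schema default; Serverless is the other allowed value
ingressClass: cilium            # schema default
controllerReplicas: 3           # assumed HA setting; schema allows 1 to 5
crds:
  create: true                  # schema default
```

A value outside the schema, such as `controllerReplicas: 6` or an unlisted `deploymentMode`, would violate the declared `maximum` and `enum` constraints.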