chore: update workload runtime integration

2026-06-02 09:48:58 +00:00
parent 80a063a167
commit cb6cf58229
11 changed files with 45 additions and 67 deletions

View File

@@ -26,14 +26,15 @@ This file is the execution guide for `ocdp-workload-manifests`.
 - Runtime source generation may create Secret, ConfigMap, and patch files, then
   run Kustomize.
 - Bases should stay template-free YAML.
-- App bases should keep Services as `ClusterIP`; expose apps from
-  WorkloadTemplate values such as `exposure=nodeport` or `exposure=loadbalancer`.
+- App bases should keep Services internally reachable; expose apps from
+  WorkloadClaim top-level intent such as `exposure=internal` or
+  `exposure=external`, rendered by the agent/template path.
   Components such as `service-nodeport`, `service-loadbalancer`, or `ingress`
   are implementation building blocks, not the user-facing contract.
 - Environment overlays are platform/admin choices for cluster- or site-level
   differences such as StorageClass, IngressClass, GPU runtime, registry prefix,
   pull secrets, node selectors, tolerations, and site labels.
-- NodePort numbers belong in per-claim values handled by `ocdp-server`, not in
+- NodePort numbers should be assigned per claim by Kubernetes, not hard-coded in
   reusable app bases or components.
 ## Validation
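
Reviewer note: the environment-overlay bullet above could be illustrated with a small Kustomize overlay. The following is only a sketch under stated assumptions: the relative base path, the `example.com/site` label, and the toleration are hypothetical and do not come from this repository.

```yaml
# Hypothetical environment overlay (platform/admin owned, not a user choice).
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - ../../apps/code-server/base   # assumed relative path to the app base
patches:
  - target:
      kind: Deployment
      name: code-server
    # Inline JSON 6902 patch adding site-level scheduling constraints.
    patch: |-
      - op: add
        path: /spec/template/spec/nodeSelector
        value:
          example.com/site: site-a   # hypothetical site label
      - op: add
        path: /spec/template/spec/tolerations
        value:
          - key: nvidia.com/gpu
            operator: Exists
            effect: NoSchedule
```

Keeping these values in an overlay leaves the base template-free and keeps site details out of the user-facing contract.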

View File

@@ -19,7 +19,7 @@ ocdp-server PostgreSQL
 WorkloadTemplate metadata
 WorkloadTemplate source.repositoryUrl/ref/path
 user-facing values schema / parameters
-environment overlay/profile metadata
+environment overlay and policy metadata
 access bindings
 no WorkloadClaim instance storage
@@ -56,6 +56,8 @@ user-created claim lives in Kubernetes, not in PostgreSQL.
 ```text
 apps/
+  earth2studio-demo/
+    base/
   code-server/
     base/
     components/
@@ -87,10 +89,10 @@ Secrets, ConfigMaps, components, and patches.
 ## Exposure
-Base services are `ClusterIP`. User-facing exposure choices such as
-`clusterip`, `nodeport`, and `loadbalancer` belong in the WorkloadTemplate
-values schema. `ocdp-server` can translate that value into runtime Service
-configuration or a generated patch.
+Base services should stay internally reachable. User-facing exposure choices
+belong on the WorkloadClaim top-level intent such as `exposure=internal` or
+`exposure=external`; the agent/template renderer turns that intent into runtime
+Service resources.
 Reusable components are still useful implementation building blocks:
@@ -99,19 +101,20 @@ Reusable components are still useful implementation building blocks:
 - `components/service-loadbalancer`: change the app Service to `LoadBalancer`.
 - `components/service-nodeport`: change the app Service to `NodePort`.
-When `nodeport` is selected by the user, `ocdp-server` may generate an instance
-patch or runtime Service field for an explicit port:
+When `exposure=external` is selected by the user, the agent/template renderer
+may generate a NodePort Service and leave the concrete nodePort for Kubernetes
+to allocate:
 ```yaml
 - op: add
-  path: /spec/ports/0/nodePort
-  value: 30080
+  path: /spec/type
+  value: NodePort
 ```
 Do not hard-code shared NodePort values in app bases or reusable components.
 Environment overlays are different from user choices. They are selected by the
-platform from cluster, workspace, or customer profile information and can carry
+platform from cluster, workspace, or customer policy information and can carry
 things like StorageClass, IngressClass, GPU runtime class, registry prefix,
 pull-secret wiring, node selectors, tolerations, and site-specific labels.
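
Reviewer note: for `exposure=external`, the rendered result described in the new text might look roughly like the Service below. This is a sketch, not the renderer's actual output: the Service name, selector label, and port name are assumptions; port 8080 is taken from the `--bind-addr` argument in the code-server Deployment.

```yaml
# Hypothetical rendered Service for exposure=external.
apiVersion: v1
kind: Service
metadata:
  name: code-server                         # assumed name
spec:
  type: NodePort
  selector:
    app.kubernetes.io/name: code-server     # assumed selector label
  ports:
    - name: http                            # assumed port name
      port: 8080
      targetPort: 8080
      # nodePort is intentionally omitted so that Kubernetes allocates one
      # from the cluster's NodePort range (30000-32767 by default).
```

Omitting `nodePort` avoids collisions between claims that would occur if a shared value were hard-coded in a base or component.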

View File

@@ -1,7 +1,17 @@
 # code-server
-The base deploys code-server with a PVC, Service, and password Secret reference.
+The base deploys code-server from `harbor.bwgdi.com/library/earth2studio-demo:v6`
+with a Service and password Secret reference.
-The Secret is generated by the instance overlay. Do not commit real passwords or
-tokens to this catalog.
+The Secret is generated by the instance overlay. Workspace storage is mounted
+from top-level WorkloadClaim `storage`, and exposure is rendered from top-level
+`exposure`. Do not commit real passwords or tokens to this catalog.
+The WorkloadTemplate exposes storage intent (`temporary`, `retained`,
+`existing`) and exposure intent (`internal`, `external`) as claim-time choices.
+Workspace storage defaults to retained. The template also mounts a hidden
+retained `weight` StorageClass PVC at `/models` for model weights; this
+StorageClass detail is platform-owned and is not exposed as a user parameter.
+The Deployment keeps HAMi resource keys in `resources.limits` at all times:
+`nvidia.com/gpu` for GPU count and `nvidia.com/gpumem` for GPU memory in MiB.
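
Reviewer note: the hidden model-weights mount described in the new README text could be expressed as a Deployment patch fragment along these lines. It is a sketch only: the volume name `models` and the claim name `code-server-weights` are hypothetical, and the actual template may wire the PVC differently.

```yaml
# Hypothetical strategic-merge patch fragment for the /models mount.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: code-server
spec:
  template:
    spec:
      containers:
        - name: code-server
          volumeMounts:
            - name: models              # hypothetical volume name
              mountPath: /models        # mount path stated in the README
      volumes:
        - name: models
          persistentVolumeClaim:
            claimName: code-server-weights   # hypothetical PVC on the `weight` StorageClass
```

Because the StorageClass is platform-owned, a fragment like this would live in the template or an environment overlay rather than in the user-facing parameters.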

View File

@@ -19,18 +19,23 @@ spec:
         app.kubernetes.io/component: ide
         app.kubernetes.io/part-of: ocdp-workload
     spec:
+      imagePullSecrets:
+        - name: regcred
+      schedulerName: hami-scheduler
       securityContext:
         fsGroup: 1000
       containers:
         - name: code-server
-          image: codercom/code-server:latest
+          image: harbor.bwgdi.com/library/earth2studio-demo:v6
           imagePullPolicy: IfNotPresent
+          command:
+            - code-server
           args:
             - --bind-addr
             - 0.0.0.0:8080
             - --auth
             - password
-            - /home/coder/project
+            - /workspace
           env:
             - name: PASSWORD
               valueFrom:
@@ -55,11 +60,5 @@ spec:
             limits:
               cpu: "2"
               memory: 4Gi
-          volumeMounts:
-            - name: workspace
-              mountPath: /home/coder/project
-      volumes:
-        - name: workspace
-          persistentVolumeClaim:
-            claimName: code-server-data
+              nvidia.com/gpu: "1"
+              nvidia.com/gpumem: "8192"

View File

@@ -1,7 +1,5 @@
 apiVersion: kustomize.config.k8s.io/v1beta1
 kind: Kustomization
 resources:
-  - pvc.yaml
   - deployment.yaml
   - service.yaml

View File

@@ -1,15 +0,0 @@
-apiVersion: v1
-kind: PersistentVolumeClaim
-metadata:
-  name: code-server-data
-  labels:
-    app.kubernetes.io/name: code-server
-    app.kubernetes.io/component: workspace-storage
-    app.kubernetes.io/part-of: ocdp-workload
-spec:
-  accessModes:
-    - ReadWriteOnce
-  resources:
-    requests:
-      storage: 20Gi

View File

@@ -1,8 +0,0 @@
-apiVersion: kustomize.config.k8s.io/v1alpha1
-kind: Component
-patches:
-  - path: pvc-rwx-patch.yaml
-    target:
-      kind: PersistentVolumeClaim
-      name: code-server-data

View File

@@ -1,4 +0,0 @@
-- op: replace
-  path: /spec/accessModes/0
-  value: ReadWriteMany

View File

@@ -11,10 +11,13 @@
   path: /spec/template/spec/containers/0/resources/limits/memory
   value: 4Gi
 - op: replace
-  path: /spec/template/spec/containers/0/args/4
-  value: /home/coder/workspace
+  path: /spec/template/spec/containers/0/resources/limits/nvidia.com~1gpu
+  value: "1"
 - op: replace
-  path: /spec/template/spec/containers/0/volumeMounts/0/mountPath
+  path: /spec/template/spec/containers/0/resources/limits/nvidia.com~1gpumem
+  value: "8192"
+- op: replace
+  path: /spec/template/spec/containers/0/args/4
   value: /home/coder/workspace
 - op: add
   path: /spec/template/spec/containers/0/env/-

View File

@@ -6,7 +6,6 @@ resources:
   - ../../../apps/code-server/base
   - generated-secret.yaml
 components:
-  - ../../../apps/code-server/components/nfs
   - ../../../apps/code-server/components/service-nodeport
 patches:
   - path: deployment-patch.yaml
@@ -15,10 +14,6 @@ patches:
       version: v1
       kind: Deployment
       name: code-server
-  - path: pvc-patch.yaml
-    target:
-      kind: PersistentVolumeClaim
-      name: code-server-data
   - path: service-nodeport-patch.yaml
     target:
       kind: Service

View File

@@ -1,4 +0,0 @@
-- op: replace
-  path: /spec/resources/requests/storage
-  value: 10Gi