Add code-server package manifests

2026-06-03 01:08:06 +00:00
parent cb6cf58229
commit d85dcb5c11
9 changed files with 301 additions and 59 deletions

@@ -11,6 +11,7 @@ This file is the execution guide for `ocdp-workload-manifests`.
   CRs created through `ocdp-server`.
 - Keep reusable Kubernetes resources under `apps/*/base`.
 - Keep optional Kustomize components under `apps/*/components`.
+- Keep Helm post-render presets under `packages/*/presets/*`.
 - Keep per-deployment runtime values out of this repository. They belong in
   temporary source files or runtime specs generated by `ocdp-server`.
 - Use `tests/kustomize/*` only for generic validation overlays, never for real
@@ -21,10 +22,10 @@ This file is the execution guide for `ocdp-workload-manifests`.
 ## Runtime Rules

 - Do not add a global catalog index unless the server explicitly needs one later.
-- `ocdp-server` WorkloadTemplate records should reference apps with
-  `repositoryUrl`, `ref`, and `path`.
-- Runtime source generation may create Secret, ConfigMap, and patch files, then
-  run Kustomize.
+- `ocdp-server` WorkloadTemplate records may reference Helm charts plus
+  `source.preset.repositoryUrl/ref/path`.
+- Runtime source generation may render Helm, create Secret, ConfigMap, and patch
+  files, then run Kustomize.
 - Bases should stay template-free YAML.
 - App bases should keep Services internally reachable; expose apps from
   WorkloadClaim top-level intent such as `exposure=internal` or

107
README.md
@@ -1,10 +1,11 @@
 # ocdp-workload-manifests

-Standalone Kubernetes manifests for OCDP workloads.
+Standalone Kubernetes manifests and post-render presets for OCDP workloads.

 This repository is intentionally just a Git repository of app manifests. There is
-no global catalog index and no dependency on the Gitea API. `ocdp-server` can
-read a workload by building a Kustomize target directly from Git.
+no global catalog index and no dependency on the Gitea API. `ocdp-server` stores
+the template contract; the operator renders Helm sources and then applies
+Kustomize presets from this repository.

 ## Architecture
@@ -14,12 +15,15 @@ OCDP keeps the responsibilities split:
 Git repo
   apps/<app>/base
   apps/<app>/components
+  packages/<app>/presets/<preset>
+    kustomization.yaml
+    userInputs.yaml

 ocdp-server PostgreSQL
   WorkloadTemplate metadata
-  WorkloadTemplate source.repositoryUrl/ref/path
+  WorkloadTemplate Helm source
+  WorkloadTemplate source.preset.repositoryUrl/ref/path
   user-facing values schema / parameters
-  environment overlay and policy metadata
   access bindings
   no WorkloadClaim instance storage
@@ -29,35 +33,50 @@ target cluster
   runtime Kubernetes resources
 ```

-This repository stores only the Git base and reusable components. It does not
-store `WorkloadTemplate` records and does not store user `WorkloadClaim`
-instances.
+This repository stores Git bases, reusable components, and Kustomize presets.
+For Helm-backed workloads, the operator runs `helm template` first, writes that
+output as `rendered.yaml`, then renders the selected preset with Workload
+`spec.values`. A preset may include `userInputs.yaml` so admins can see which
+values should become the WorkloadTemplate user-facing parameter schema. This
+metadata file is not a Kubernetes resource and is not referenced by Kustomize.

 An admin creates or updates a `WorkloadTemplate` in `ocdp-server`. That template
-can point at one of these Git paths:
+can point at a Helm chart plus one of these Git presets:

 ```yaml
 templateType: kustomize
 source:
-  type: gitKustomize
-  repositoryUrl: https://gitea.example.com/ocdp/ocdp-workload-manifests.git
-  ref: code-server-v0.1.0
-  path: apps/code-server/base
+  type: helm
+  repositoryUrl: https://kuoss.github.io/helm-charts
+  chart: code-server
+  version: 3.16.1
+  releaseName: "{{ name }}"
+  values: |
+    fullnameOverride: "{{ name }}"
+    serviceAccount:
+      create: false
+    persistence:
+      enabled: true
+      size: 20Gi
+  preset:
+    type: kustomize
+    repositoryUrl: https://gitea.example.com/ocdp/ocdp-workload-manifests.git
+    ref: main
+    path: packages/code-server/presets/k3s-hami
 ```

 After the template is stored in PostgreSQL and assigned to users or groups, users
 call `ocdp-server` to create a claim by submitting `templateId`, `workspaceId`,
-and values. `ocdp-server` resolves the template from PostgreSQL, reads the Git
-base, generates any temporary source files or runtime specs outside this repo,
-then writes a `WorkloadClaim` CR into the target Kubernetes cluster. The final
-user-created claim lives in Kubernetes, not in PostgreSQL.
+and values. `ocdp-server` resolves the template from PostgreSQL and writes a
+`WorkloadClaim` CR into the target Kubernetes cluster. The operator renders Helm
+with the resolved values, applies the Kustomize preset, and then applies the
+final Kubernetes resources. The final user-created claim lives in Kubernetes,
+not in PostgreSQL.

 ## Layout

 ```text
 apps/
-  earth2studio-demo/
-    base/
   code-server/
     base/
     components/
@@ -67,32 +86,46 @@ apps/
   litellm/
     base/
     components/
+packages/
+  code-server/
+    presets/
+      k3s-hami/
+        kustomization.yaml
+        userInputs.yaml
 tests/
   kustomize/
 ```

 ## Server Usage

-`ocdp-server` should store a normal `gitKustomize` reference:
+`ocdp-server` should store a Helm source plus a Kustomize preset reference:

 ```yaml
 source:
-  type: gitKustomize
-  repositoryUrl: https://gitea.example.com/ocdp/ocdp-workload-manifests.git
-  ref: code-server-v0.1.0
-  path: apps/code-server/base
+  type: helm
+  repositoryUrl: https://kuoss.github.io/helm-charts
+  chart: code-server
+  version: 3.16.1
+  releaseName: "{{ name }}"
+  preset:
+    type: kustomize
+    repositoryUrl: https://gitea.example.com/ocdp/ocdp-workload-manifests.git
+    ref: main
+    path: packages/code-server/presets/k3s-hami
 ```

-For one deployment, `ocdp-server` should generate temporary source files outside
-this repository. Those files can point at the Git base and add generated
-Secrets, ConfigMaps, components, and patches.
+For one deployment, `ocdp-server` stores the Helm chart reference, static Helm
+values, preset reference, and the template value contract. The preset renders
+with Workload `spec.values`, so environment-specific implementation details
+should live in the selected preset instead of in the WorkloadTemplate create
+form.

 ## Exposure

-Base services should stay internally reachable. User-facing exposure choices
-belong on the WorkloadClaim top-level intent such as `exposure=internal` or
-`exposure=external`; the agent/template renderer turns that intent into runtime
-Service resources.
+Base services should stay internally reachable unless an environment preset
+intentionally changes the Service shape. For standard self-service workloads,
+exposure is an admin preset decision and a user read view, not a user claim
+input.

 Reusable components are still useful implementation building blocks:
@@ -101,8 +134,7 @@ Reusable components are still useful implementation building blocks:
 - `components/service-loadbalancer`: change the app Service to `LoadBalancer`.
 - `components/service-nodeport`: change the app Service to `NodePort`.

-When `exposure=external` is selected by the user, the agent/template renderer
-may generate a NodePort Service and leave the concrete nodePort for Kubernetes
+When a preset needs NodePort, it may leave the concrete nodePort for Kubernetes
 to allocate:

 ```yaml
@@ -113,10 +145,13 @@ to allocate:

 Do not hard-code shared NodePort values in app bases or reusable components.

-Environment overlays are different from user choices. They are selected by the
+Environment overlays are different from user values. They are selected by the
 platform from cluster, workspace, or customer policy information and can carry
-things like StorageClass, IngressClass, GPU runtime class, registry prefix,
-pull-secret wiring, node selectors, tolerations, and site-specific labels.
+things like Service type, StorageClass, IngressClass, GPU runtime class,
+registry prefix, pull-secret wiring, node selectors, tolerations, and
+site-specific labels. The user-facing exposure view is derived after reconcile:
+ClusterIP is hidden from ordinary users, NodePort uses the agent access host and
+observed nodePort, and LoadBalancer uses observed external IP/hostname and port.

 ## Validate
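
The exposure rule added in the README hunk above (ClusterIP hidden, NodePort shown as the agent access host plus the observed nodePort, LoadBalancer shown as the observed external IP or hostname plus port) could be sketched roughly as follows. This is a minimal illustration, not the operator's code: the function name `derive_exposure_view`, the `agent_access_host` parameter, and the use of plain dicts shaped like Kubernetes Service objects are all assumptions.

```python
from typing import Optional


def derive_exposure_view(service: dict, agent_access_host: str) -> Optional[str]:
    """Return a user-facing address for a reconciled Service, or None if hidden.

    Hypothetical helper: `service` is a dict shaped like a Kubernetes Service.
    """
    spec = service.get("spec", {})
    ports = spec.get("ports") or [{}]
    first_port = ports[0]
    service_type = spec.get("type", "ClusterIP")

    if service_type == "NodePort":
        # The nodePort is allocated by Kubernetes, so it may not be observed yet.
        node_port = first_port.get("nodePort")
        return f"{agent_access_host}:{node_port}" if node_port else None

    if service_type == "LoadBalancer":
        ingress = service.get("status", {}).get("loadBalancer", {}).get("ingress") or []
        if not ingress:
            return None
        host = ingress[0].get("ip") or ingress[0].get("hostname")
        port = first_port.get("port")
        return f"{host}:{port}" if host and port else None

    # ClusterIP (and any other type) stays hidden from ordinary users.
    return None
```

Returning `None` while the nodePort or external address is still unobserved keeps the view "derived after reconcile" rather than guessed from the preset.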

@@ -1,17 +1,22 @@
 # code-server

-The base deploys code-server from `harbor.bwgdi.com/library/earth2studio-demo:v6`
-with a Service and password Secret reference.
+`base/` and `components/` stay plain Kustomize YAML for reusable validation and
+composition.

-The Secret is generated by the instance overlay. Workspace storage is mounted
-from top-level WorkloadClaim `storage`, and exposure is rendered from top-level
-`exposure`. Do not commit real passwords or tokens to this catalog.
+The OCDP self-service code-server flow is Helm-backed now:

-The WorkloadTemplate exposes storage intent (`temporary`, `retained`,
-`existing`) and exposure intent (`internal`, `external`) as claim-time choices.
-Workspace storage defaults to retained. The template also mounts a hidden
-retained `weight` StorageClass PVC at `/models` for model weights; this
-StorageClass detail is platform-owned and is not exposed as a user parameter.
+```text
+Helm chart
+  -> operator helm template
+  -> packages/code-server/presets/k3s-hami
+  -> final Kubernetes resources
+```

-The Deployment keeps HAMi resource keys in `resources.limits` at all times:
-`nvidia.com/gpu` for GPU count and `nvidia.com/gpumem` for GPU memory in MiB.
+The environment-specific preset lives in
+`packages/code-server/presets/k3s-hami`. That preset owns the platform image
+`harbor.bwgdi.com/library/earth2studio-demo:v6`, registry pull secret, HAMi
+scheduler, Service shape, code-server auth mode, and GPU resource keys.
+
+Users only fill CPU, memory, GPU count, and GPU memory. See
+`packages/code-server/presets/k3s-hami/userInputs.yaml` for the values schema
+that the console should render into the WorkloadClaim form.

@@ -56,9 +56,9 @@ spec:
           resources:
             requests:
               cpu: "500m"
-              memory: 1Gi
+              memory: 1024Mi
             limits:
-              cpu: "2"
-              memory: 4Gi
+              cpu: "2000m"
+              memory: 4096Mi
               nvidia.com/gpu: "1"
               nvidia.com/gpumem: "8192"
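
The unit changes in the hunk above (`1Gi` to `1024Mi`, `"2"` to `"2000m"`, `4Gi` to `4096Mi`) do not change the requested amounts; they only restate them in millicores and MiB, the units used by the `userInputs.yaml` fields elsewhere in this commit. A small sketch that checks the equivalence is shown below. The helper names are hypothetical, and the parser deliberately covers only the plain, `m`, and binary-suffix forms seen here, not the full Kubernetes quantity grammar.

```python
from fractions import Fraction

# Binary suffixes only; decimal suffixes and exponent forms are out of scope.
BINARY_SUFFIXES = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40}


def cpu_to_millicores(quantity: str) -> int:
    """Convert a CPU quantity such as "2", "0.5", or "500m" to millicores."""
    text = quantity.strip()
    if text.endswith("m"):
        return int(text[:-1])
    return int(Fraction(text) * 1000)


def memory_to_mib(quantity: str) -> int:
    """Convert a memory quantity such as "1Gi" or "1024Mi" to whole MiB."""
    text = quantity.strip()
    for suffix, factor in BINARY_SUFFIXES.items():
        if text.endswith(suffix):
            return int(Fraction(text[: -len(suffix)]) * factor / 2**20)
    return int(Fraction(text) / 2**20)  # no suffix means plain bytes


# Old and new values from the hunk describe the same resources.
assert cpu_to_millicores("2") == cpu_to_millicores("2000m")
assert memory_to_mib("1Gi") == memory_to_mib("1024Mi")
assert memory_to_mib("4Gi") == memory_to_mib("4096Mi")
```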

@@ -0,0 +1,29 @@
+# code-server Helm postRender
+
+This package is consumed by OCDP as a Kustomize postRender for a Helm rendered
+code-server chart.
+
+The platform chain is:
+
+```text
+Helm chart + resolved values
+  -> helm template
+  -> packages/code-server/post-renders/k3s-hami
+  -> final Kubernetes resources
+```
+
+`post-renders/k3s-hami/kustomization.yaml` patches the Helm output with
+environment-managed choices:
+
+- image: `harbor.bwgdi.com/library/earth2studio-demo:v6`
+- pull secret: `regcred`
+- scheduler: `hami-scheduler`
+- HAMi resource limit keys: `nvidia.com/gpu` and `nvidia.com/gpumem`
+- NodePort Service on port `80`
+- `weight` StorageClass PVC mounted at `/models`
+
+`post-renders/k3s-hami/userInputs.yaml` is the user-facing value contract. Users
+only choose CPU, memory, GPU count, and GPU memory. The console renders these
+fields as the WorkloadClaim form; the operator receives the resolved values on
+the Workload CR and applies the postRender patches. Storage, exposure, image,
+scheduler, pull secret, and code-server auth mode stay in the admin postRender.

@@ -0,0 +1,119 @@
+apiVersion: kustomize.config.k8s.io/v1beta1
+kind: Kustomization
+namespace: "{{ namespace }}"
+resources:
+  - rendered.yaml
+  - weights-pvc.yaml
+patches:
+  - target:
+      group: apps
+      version: v1
+      kind: Deployment
+      name: "{{ name }}"
+    patch: |
+      apiVersion: apps/v1
+      kind: Deployment
+      metadata:
+        name: "{{ name }}"
+        labels:
+          app.kubernetes.io/component: ide
+          app.kubernetes.io/part-of: ocdp-workload
+      spec:
+        template:
+          metadata:
+            labels:
+              app.kubernetes.io/component: ide
+              app.kubernetes.io/part-of: ocdp-workload
+          spec:
+            imagePullSecrets:
+              - name: regcred
+            schedulerName: hami-scheduler
+            securityContext:
+              fsGroup: 1000
+            volumes:
+              - name: data
+                persistentVolumeClaim:
+                  claimName: "{{ name }}"
+              - name: model-weights
+                persistentVolumeClaim:
+                  claimName: "{{ name }}-weights"
+            containers:
+              - name: code-server
+                image: harbor.bwgdi.com/library/earth2studio-demo:v6
+                imagePullPolicy: IfNotPresent
+                securityContext:
+                  runAsUser: 1000
+                command:
+                  - code-server
+                args:
+                  - --bind-addr
+                  - 0.0.0.0:8080
+                  - --auth
+                  - none
+                  - /workspace
+                ports:
+                  - name: http
+                    containerPort: 8080
+                    protocol: TCP
+                env:
+                  - name: HOME
+                    value: /workspace
+                  - name: XDG_CONFIG_HOME
+                    value: /workspace/.config
+                readinessProbe:
+                  httpGet:
+                    path: /
+                    port: http
+                livenessProbe:
+                  httpGet:
+                    path: /
+                    port: http
+                resources:
+                  requests:
+                    cpu: "{{ cpuRequestMillicores }}m"
+                    memory: "{{ memoryRequestMiB }}Mi"
+                  limits:
+                    cpu: "{{ cpuLimitMillicores }}m"
+                    memory: "{{ memoryLimitMiB }}Mi"
+                    nvidia.com/gpu: "{{ gpuCount }}"
+                    nvidia.com/gpumem: "{{ gpuMemoryMiB }}"
+                volumeMounts:
+                  - name: data
+                    mountPath: /workspace
+                  - name: model-weights
+                    mountPath: /models
+  - target:
+      version: v1
+      kind: Secret
+      name: "{{ name }}"
+    patch: |
+      apiVersion: v1
+      kind: Secret
+      metadata:
+        name: "{{ name }}"
+        labels:
+          app.kubernetes.io/component: auth
+          app.kubernetes.io/part-of: ocdp-workload
+        annotations: {}
+      type: Opaque
+      data:
+        password: dW51c2Vk
+  - target:
+      version: v1
+      kind: Service
+      name: "{{ name }}"
+    patch: |
+      apiVersion: v1
+      kind: Service
+      metadata:
+        name: "{{ name }}"
+        labels:
+          app.kubernetes.io/component: ide
+          app.kubernetes.io/part-of: ocdp-workload
+      spec:
+        type: NodePort
+        ports:
+          - name: http
+            port: 80
+            targetPort: http
+            protocol: TCP
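
The preset above is not valid Kustomize input until its `{{ ... }}` placeholders are filled, since values such as `"{{ cpuRequestMillicores }}m"` only become quantities after substitution. The commit does not show how the operator performs that step, so the sketch below assumes the simplest possible mechanism: plain string replacement of `{{ key }}` tokens. The function name and the strict missing-key behavior are illustrative choices, not the operator's implementation.

```python
import re

# Matches "{{ key }}" with optional inner whitespace; keys are simple identifiers.
PLACEHOLDER = re.compile(r"\{\{\s*([A-Za-z_][A-Za-z0-9_]*)\s*\}\}")


def render_placeholders(template: str, values: dict) -> str:
    """Replace every {{ key }} token with str(values[key]).

    Raises KeyError for a placeholder with no supplied value, so an
    unrendered token never reaches Kustomize.
    """

    def substitute(match: re.Match) -> str:
        key = match.group(1)
        if key not in values:
            raise KeyError(f"no value supplied for placeholder {key!r}")
        return str(values[key])

    return PLACEHOLDER.sub(substitute, template)


snippet = 'cpu: "{{ cpuRequestMillicores }}m"\nclaimName: "{{ name }}-weights"'
print(render_placeholders(snippet, {"cpuRequestMillicores": 500, "name": "demo"}))
```

Failing loudly on a missing key is one design option; an operator could equally fall back to schema defaults before rendering.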

@@ -0,0 +1,35 @@
+cpuRequestMillicores:
+  label: CPU request
+  type: number
+  default: 500
+  minimum: 0
+
+cpuLimitMillicores:
+  label: CPU limit
+  type: number
+  default: 2000
+  minimum: 0
+
+memoryRequestMiB:
+  label: Memory request
+  type: number
+  default: 1024
+  minimum: 0
+
+memoryLimitMiB:
+  label: Memory limit
+  type: number
+  default: 4096
+  minimum: 0
+
+gpuCount:
+  label: GPU count
+  type: number
+  default: 1
+  minimum: 0
+
+gpuMemoryMiB:
+  label: GPU memory
+  type: number
+  default: 8192
+  minimum: 0
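
The schema above gives each user input a `default` and a `minimum`. One way a console or server might turn a partial user submission into the full value set is sketched below. The commit does not show that logic, so `resolve_values` and its error handling are hypothetical, and the schema is inlined as a dict (copied from the file above) rather than parsed from YAML to keep the example dependency-free.

```python
# Defaults and minimums copied from userInputs.yaml; labels and types omitted.
USER_INPUTS = {
    "cpuRequestMillicores": {"default": 500, "minimum": 0},
    "cpuLimitMillicores": {"default": 2000, "minimum": 0},
    "memoryRequestMiB": {"default": 1024, "minimum": 0},
    "memoryLimitMiB": {"default": 4096, "minimum": 0},
    "gpuCount": {"default": 1, "minimum": 0},
    "gpuMemoryMiB": {"default": 8192, "minimum": 0},
}


def resolve_values(schema: dict, submitted: dict) -> dict:
    """Fill in defaults and enforce `minimum` for every field in the schema."""
    unknown = set(submitted) - set(schema)
    if unknown:
        raise ValueError(f"unknown inputs: {sorted(unknown)}")

    resolved = {}
    for key, field in schema.items():
        value = submitted.get(key, field["default"])
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            raise ValueError(f"{key} must be a number")
        if value < field["minimum"]:
            raise ValueError(f"{key} must be >= {field['minimum']}")
        resolved[key] = value
    return resolved


values = resolve_values(USER_INPUTS, {"gpuCount": 2})
print(values["gpuCount"], values["memoryLimitMiB"])
```

The schema as committed does not relate requests to limits, so a check such as `cpuRequestMillicores <= cpuLimitMillicores` would be an extra rule layered on top of it.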

@@ -0,0 +1,18 @@
+apiVersion: v1
+kind: PersistentVolumeClaim
+metadata:
+  name: "{{ name }}-weights"
+  namespace: "{{ namespace }}"
+  labels:
+    app.kubernetes.io/name: "{{ name }}"
+    app.kubernetes.io/component: model-weights
+    app.kubernetes.io/part-of: ocdp-workload
+  annotations:
+    platform.ocdp.io/storage-role: model-weights
+spec:
+  accessModes:
+    - ReadWriteMany
+  storageClassName: weight
+  resources:
+    requests:
+      storage: 100Gi

@@ -1,15 +1,15 @@
 - op: replace
   path: /spec/template/spec/containers/0/resources/requests/cpu
-  value: "1"
+  value: "1000m"
 - op: replace
   path: /spec/template/spec/containers/0/resources/requests/memory
-  value: 2Gi
+  value: 2048Mi
 - op: replace
   path: /spec/template/spec/containers/0/resources/limits/cpu
-  value: "2"
+  value: "2000m"
 - op: replace
   path: /spec/template/spec/containers/0/resources/limits/memory
-  value: 4Gi
+  value: 4096Mi
 - op: replace
   path: /spec/template/spec/containers/0/resources/limits/nvidia.com~1gpu
   value: "1"
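
The last path in the patch above ends in `nvidia.com~1gpu` rather than `nvidia.com/gpu` because JSON Patch paths are JSON Pointers (RFC 6901), where `/` separates tokens and must be written as `~1` inside a key, and a literal `~` is written as `~0`. A short sketch of building such a path follows; the helper names are illustrative.

```python
def escape_pointer_token(token: str) -> str:
    """Escape one JSON Pointer reference token per RFC 6901."""
    # Order matters: escape "~" first so the "~1" produced for "/" is not re-escaped.
    return token.replace("~", "~0").replace("/", "~1")


def build_pointer(*tokens: str) -> str:
    """Join reference tokens into a JSON Pointer string."""
    return "".join("/" + escape_pointer_token(token) for token in tokens)


path = build_pointer(
    "spec", "template", "spec", "containers", "0",
    "resources", "limits", "nvidia.com/gpu",
)
print(path)
```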