ocdp-go/docs/test2-quota.md

# Resource Quota Enforcement Test Report

**Date:** 2026-05-11
**Tester:** test-user-b
**Namespace:** ocdp-u-test-b
**User Quota:** cpu=2, memory=4Gi, gpu=0, gpumem=0

---

## Test Summary

| Test | Description | Expected | Actual | Result |
|------|-------------|----------|--------|--------|
| A | Deploy nginx (default, within quota) | Success | Deployed (status: `deployed`) | ✅ PASS |
| B | Deploy nginx (limits cpu=4/mem=8Gi, requests cpu=2/mem=4Gi, replicas=5; exceeds quota) | Blocked by quota | Helm release created, Service created, all pods blocked by ResourceQuota (status: `pending-install`) | ⚠️ PARTIAL |
| C | Deploy vllm-serve with gpu=1 (gpu quota = 0) | Blocked by quota | Helm release created, all pods blocked by ResourceQuota (status: `pending-install`) | ⚠️ PARTIAL |

---

## Detailed Results

### Test A: Deploy nginx within quota limits

- **Instance:** `quota-test-nginx` (ed846c33-3631-4d54-adce-c7f00210176f)
- **Chart:** charts/nginx:22.1.1
- **Values:** defaults
- **API Response:** HTTP 200, status: `pending-install`
- **Final Status after 21s:** `deployed` ("Instance deployed successfully")
- **K8s Resource Usage:** requests.cpu=100m/2, requests.memory=128Mi/4Gi

### Test B: Deploy nginx exceeding quota

- **Instance:** `quota-test-nginx-2` (36c0350f-089c-41c2-a66e-e93539c00d52)
- **Chart:** charts/nginx:22.1.1
- **Values:** replicaCount=5, resources.limits.cpu=4/memory=8Gi, resources.requests.cpu=2/memory=4Gi
- **API Response:** HTTP 200, status: `pending-install`
- **Final Status (observed for 90s+):** `pending-install` (never transitioned to `deployed` or `failed`)
- **K8s Behavior:**
  - Helm release created: `sh.helm.release.v1.quota-test-nginx-2.v1`
  - TLS secret created
  - Service created, IP assigned
  - Deployment created, ReplicaSet scaled up
  - **All pod creations FAILED** with: `Error creating: pods "..." is forbidden: exceeded quota: tenant-quota, requested: requests.cpu=2,requests.memory=4Gi, used: requests.cpu=100m,requests.memory=128Mi, limited: requests.cpu=2,requests.memory=4Gi`

### Test C: Deploy GPU instance (gpu quota = 0)

- **Instance:** `quota-test-gpu` (a0d692c8-cdf8-4248-a6d4-1468ad4a7cc7)
- **Chart:** charts/vllm-serve:0.6.0
- **Values:** resources.gpuLimit=1, resources.gpuMem=5000
- **API Response:** HTTP 200, status: `pending-install`
- **Final Status (observed for 30s+):** `pending-install`
- **K8s Behavior:**
  - Effective pod requests (vllm-serve chart defaults plus the values above): requests.cpu=8, requests.memory=16Gi, requests.nvidia.com/gpu=1, requests.nvidia.com/gpumem=5k
  - All pods blocked: `exceeded quota: tenant-quota, requested: requests.cpu=8,requests.memory=16Gi,requests.nvidia.com/gpu=1,..., limited: requests.cpu=2,requests.memory=4Gi,requests.nvidia.com/gpu=0`

---

## Key Findings

### 1. No API-Level (Pre-flight) Quota Enforcement

The backend API accepts **all** deployment requests regardless of whether they exceed the user's quota. There is no validation at the API layer that checks:

- Whether the requested resources exceed the user's quota limits
- Whether the user's quota is already fully consumed by existing deployments

**Evidence:** All three deployments returned HTTP 200 with `status: pending-install`. The backend logs contain zero quota-related entries.
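A pre-flight check of this kind could look roughly like the sketch below. It is not OCDP code: `Resources`, `parseMilli`, and `preflight` are hypothetical names, the quantity parser handles only the forms that appear in this report (`100m`, `2`, `4Gi`), and it assumes the backend can derive per-pod requests and the replica count from the chart values.

```go
package main

import (
	"fmt"
	"sort"
	"strconv"
	"strings"
)

// Resources maps a ResourceQuota key such as "requests.cpu" to a quantity string.
type Resources map[string]string

// binarySuffixes lists the binary suffixes handled by this simplified parser.
var binarySuffixes = []struct {
	suffix string
	mult   int64
}{
	{"Ki", 1 << 10}, {"Mi", 1 << 20}, {"Gi", 1 << 30}, {"Ti", 1 << 40},
}

// parseMilli converts a quantity string to thousandths of its base unit, so
// "100m" -> 100, "2" -> 2000 and "1Ki" -> 1024000. It covers only the forms
// used in this report; a real backend would use a full quantity parser.
func parseMilli(q string) (int64, error) {
	if strings.HasSuffix(q, "m") {
		return strconv.ParseInt(strings.TrimSuffix(q, "m"), 10, 64)
	}
	for _, b := range binarySuffixes {
		if strings.HasSuffix(q, b.suffix) {
			n, err := strconv.ParseInt(strings.TrimSuffix(q, b.suffix), 10, 64)
			return n * b.mult * 1000, err
		}
	}
	n, err := strconv.ParseInt(q, 10, 64)
	return n * 1000, err
}

// preflight reports an error if used + replicas*perPod would exceed hard for
// any resource that the quota limits. Keys are checked in sorted order so the
// reported resource is deterministic.
func preflight(hard, used, perPod Resources, replicas int64) error {
	keys := make([]string, 0, len(perPod))
	for k := range perPod {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	for _, k := range keys {
		limitStr, limited := hard[k]
		if !limited {
			continue // resource is not constrained by the quota
		}
		usedStr, ok := used[k]
		if !ok {
			usedStr = "0"
		}
		limit, err := parseMilli(limitStr)
		if err != nil {
			return err
		}
		inUse, err := parseMilli(usedStr)
		if err != nil {
			return err
		}
		req, err := parseMilli(perPod[k])
		if err != nil {
			return err
		}
		if inUse+replicas*req > limit {
			return fmt.Errorf("quota exceeded for %s: requested %d x %s, used %s, limit %s",
				k, replicas, perPod[k], usedStr, limitStr)
		}
	}
	return nil
}

func main() {
	hard := Resources{"requests.cpu": "2", "requests.memory": "4Gi", "requests.nvidia.com/gpu": "0"}
	used := Resources{"requests.cpu": "100m", "requests.memory": "128Mi"}
	// Test B: five replicas, each requesting cpu=2 and memory=4Gi.
	testB := Resources{"requests.cpu": "2", "requests.memory": "4Gi"}
	fmt.Println(preflight(hard, used, testB, 5))
}
```

With the Test B inputs the function returns an error for `requests.cpu`, which the API handler could translate into a 4xx response instead of creating a Helm release. The same check rejects a GPU request against a `requests.nvidia.com/gpu` limit of 0.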

### 2. Kubernetes ResourceQuota Enforces at Pod Level

The Kubernetes `ResourceQuota` object `tenant-quota` in namespace `ocdp-u-test-b` does enforce limits, but only at the **pod creation** level:

```yaml
spec:
  hard:
    requests.cpu: "2"
    requests.memory: 4Gi
    requests.nvidia.com/gpu: "0"
    requests.nvidia.com/gpumem: "0"
```

When a pod would exceed the quota, Kubernetes refuses to create it and returns a clear error message.
However, because `tenant-quota` only limits pod resource requests, the Helm release, Service, Deployment, and ReplicaSet are **still created** even when every pod is blocked.
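The per-pod arithmetic explains why not a single replica started in Test B. The sketch below works through it for CPU only, in millicores. `maxPods` is a hypothetical helper, and the second call assumes the default nginx deployment in Test A ran one pod requesting 100m.

```go
package main

import "fmt"

// maxPods returns how many additional pods with the given per-pod request fit
// under a hard limit, given current usage. All three values must use the same
// unit (millicores here). It returns -1 when perPod is not positive, meaning
// this resource does not limit the pod count.
func maxPods(hard, used, perPod int64) int64 {
	if perPod <= 0 {
		return -1
	}
	free := hard - used
	if free < 0 {
		return 0
	}
	return free / perPod
}

func main() {
	// Test B: limit 2000m, 100m already used, each pod requests 2000m.
	fmt.Println(maxPods(2000, 100, 2000)) // 0
	// Empty namespace, pods requesting 100m each.
	fmt.Println(maxPods(2000, 0, 100)) // 20
}
```

Because 100m was already in use when Test B ran, even the first 2-CPU pod did not fit, so every one of the five pod creations was rejected.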

### 3. Stuck at "pending-install"

Instances that exceed quota remained in `pending-install` for the entire observation window (90s+ in Test B, 30s+ in Test C); they never transitioned to `deployed`, `failed`, or any error status. The OCDP platform does not detect the ResourceQuota rejection or update the instance status accordingly. The only way to learn about the failure is to check Kubernetes events directly:

```bash
kubectl get events -n ocdp-u-test-b
```
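If the platform read these events itself, it could turn the rejection message into structured data. The sketch below parses the message format observed in Test B. `QuotaRejection` and `parseQuotaRejection` are hypothetical names, and the parser assumes the layout seen in this report (sections separated by `, `, pairs separated by `,`).

```go
package main

import (
	"fmt"
	"strings"
)

// QuotaRejection holds the fields of an "exceeded quota" event message.
type QuotaRejection struct {
	Quota     string
	Requested map[string]string
	Used      map[string]string
	Limited   map[string]string
}

const marker = "exceeded quota: "

// parsePairs turns "a=1,b=2" into a map; items without "=" are skipped.
func parsePairs(list string) map[string]string {
	out := map[string]string{}
	for _, item := range strings.Split(list, ",") {
		if k, v, ok := strings.Cut(item, "="); ok {
			out[k] = v
		}
	}
	return out
}

// parseQuotaRejection extracts the quota name and the requested/used/limited
// sections. The boolean is false when msg is not a quota rejection.
func parseQuotaRejection(msg string) (QuotaRejection, bool) {
	i := strings.Index(msg, marker)
	if i < 0 {
		return QuotaRejection{}, false
	}
	parts := strings.Split(msg[i+len(marker):], ", ")
	r := QuotaRejection{Quota: parts[0]}
	for _, p := range parts[1:] {
		name, list, found := strings.Cut(p, ": ")
		if !found {
			continue
		}
		switch name {
		case "requested":
			r.Requested = parsePairs(list)
		case "used":
			r.Used = parsePairs(list)
		case "limited":
			r.Limited = parsePairs(list)
		}
	}
	return r, true
}

func main() {
	// Message copied from Test B; the pod name is elided in the report.
	msg := `Error creating: pods "..." is forbidden: exceeded quota: tenant-quota, ` +
		`requested: requests.cpu=2,requests.memory=4Gi, ` +
		`used: requests.cpu=100m,requests.memory=128Mi, ` +
		`limited: requests.cpu=2,requests.memory=4Gi`
	r, ok := parseQuotaRejection(msg)
	fmt.Println(ok, r.Quota, r.Requested["requests.cpu"], r.Used["requests.cpu"], r.Limited["requests.cpu"])
}
```

The parsed fields would be enough to populate a descriptive status reason without exposing the raw Kubernetes message.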

### 4. GPU Quota Not Enforced at the API Level

Users with `gpu=0` quota **can** submit deployments referencing GPU-enabled charts. The API does not reject them. Only the K8s ResourceQuota blocks pod creation at runtime. This could lead to:
- Unnecessary Helm releases and resource overhead in the cluster
- Confusion for users whose deployments appear to hang at `pending-install`

### 5. Quota Exposed in Login Response

The login response includes quota information:

```json
{
  "quotaCpu": "2",
  "quotaMemory": "4Gi",
  "quotaGpu": "0",
  "quotaGpuMemory": "0"
}
```

This could be used by the frontend to show usage limits, but no pre-flight check uses it server-side.
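As a rough illustration, the same four fields could be decoded and mapped onto the ResourceQuota keys used elsewhere in this report. `LoginQuota`, `decodeLoginQuota`, and `Hard` are hypothetical names. The field-to-key mapping is an assumption based on the matching values in the user quota and in `tenant-quota`.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// LoginQuota mirrors the four quota fields shown in the login response.
// Any other fields in the response are ignored by json.Unmarshal.
type LoginQuota struct {
	CPU       string `json:"quotaCpu"`
	Memory    string `json:"quotaMemory"`
	GPU       string `json:"quotaGpu"`
	GPUMemory string `json:"quotaGpuMemory"`
}

// decodeLoginQuota parses the quota fields out of a login response body.
func decodeLoginQuota(body []byte) (LoginQuota, error) {
	var q LoginQuota
	err := json.Unmarshal(body, &q)
	return q, err
}

// Hard maps the login fields to ResourceQuota keys (assumed mapping).
func (q LoginQuota) Hard() map[string]string {
	return map[string]string{
		"requests.cpu":               q.CPU,
		"requests.memory":            q.Memory,
		"requests.nvidia.com/gpu":    q.GPU,
		"requests.nvidia.com/gpumem": q.GPUMemory,
	}
}

func main() {
	body := []byte(`{"quotaCpu":"2","quotaMemory":"4Gi","quotaGpu":"0","quotaGpuMemory":"0"}`)
	q, err := decodeLoginQuota(body)
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	fmt.Println(q.Hard()["requests.cpu"], q.Hard()["requests.memory"])
}
```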

---

## Recommendations

1. **Add pre-flight quota validation** in the backend API: before accepting a deployment, check whether the requested resources (from chart values), added to the user's current usage, would exceed the user's quota. If so, return HTTP 4xx with a clear error message.

2. **Handle "pending-install" timeout**: implement a watcher that detects when a Helm release has been created but pods remain stuck (e.g., due to ResourceQuota) and:
   - Update instance status to `failed` with a descriptive `statusReason`
   - Clean up the Helm release, Service, etc.
   - Optionally surface the K8s error message via the API

3. **GPU quota pre-check**: if a chart requests GPU resources and the user's GPU quota is 0, reject the deployment at the API level before creating any Kubernetes resources.

4. **UI quota indicator**: show remaining quota (used vs. hard limit) on the deployment form so users know their limits before submitting.

---

## ResourceQuota YAML (for reference)

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: tenant-quota
  namespace: ocdp-u-test-b
  labels:
    ocdp.io/managed-by: ocdp
    ocdp.io/tenant: ocdp-u-test-b
spec:
  hard:
    requests.cpu: "2"
    requests.memory: 4Gi
    requests.nvidia.com/gpu: "0"
    requests.nvidia.com/gpumem: "0"
```

---

## Cleanup Verification

All test instances were removed after testing:
- `quota-test-nginx` ✅ deleted (pods terminated, Helm release removed, quota usage back to 0)
- `quota-test-nginx-2` ✅ cleaned up (no pods created, resources released)
- `quota-test-gpu` ✅ cleaned up (no pods created, resources released)
- ResourceQuota used: all resources at 0