Technical Specification: Multi-Tenant Kubeconfig & Auth Gateway
1. System Overview & Goals
- Objective: Develop a backend API service that automates Kubernetes multi-tenant onboarding (Namespace + Quota isolation) and securely distributes short-lived, dynamic `kubeconfig` files using the Kubernetes `TokenRequest` API.
- Architecture Independence: This backend service acts as a standalone control plane. It is not strictly bound to a BFF pattern and does not need to run inside the target Kubernetes cluster (it supports Out-of-Cluster execution).
- Out of Scope: This spec does NOT cover the frontend UI implementation or the downstream workload deployment. It focuses strictly on identity, tenant provisioning, and credential brokering.
- Security Principles: Adhere strictly to Zero-Knowledge architecture (no token storage in DB), Ephemeral Credentials (short-lived tokens only), and Least Privilege (the Gateway must NOT be a `cluster-admin`).
2. Architecture & Topology
- Tech Stack: Go `net/http` (or FastAPI), utilizing the official Kubernetes Client SDK (`client-go` or `kubernetes-client/python`).
- Control Plane Flow:
  - Client/Frontend -> Gateway: User requests environment access.
  - Gateway -> K8s API: Gateway authenticates to the target K8s cluster using its own restricted service credentials (e.g., an Out-of-Cluster `kubeconfig`).
  - Gateway -> K8s API: Executes Namespace/SA creation (if new) or calls the `TokenRequest` API (if existing).
  - Gateway -> Client/Frontend: Returns a generated `kubeconfig` YAML string with the short-lived JWT token.
3. Core Business Logic Workflows
Phase 1: Tenant Initialization (Onboarding)
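Triggered when a new user registers or requests a workspace for the first time. The Gateway must create four resources as one all-or-nothing sequence (Kubernetes has no multi-object transactions, so a partial failure must be retried or cleaned up):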
- Namespace: `tenant-{user_uuid}`
- ServiceAccount: `sa-tenant-admin` (created inside the tenant's namespace).
- RoleBinding: Bind `sa-tenant-admin` to the `admin` (or custom) ClusterRole, strictly isolated within `tenant-{user_uuid}`.
- ResourceQuota: Enforce limits (e.g., `requests.cpu: "4"`, `limits.memory: "16Gi"`) to prevent noisy neighbors.
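
The four onboarding resources can be sketched as plain manifests before they are handed to a Kubernetes client. This is a minimal illustration, not the Gateway's actual code: the function name `tenant_manifests` and the object names `tenant-admin-binding` and `tenant-quota` are assumptions made for the example, and the quota values are the illustrative ones from the list above.

```python
RBAC_GROUP = "rbac.authorization.k8s.io"
SA_NAME = "sa-tenant-admin"


def tenant_manifests(user_uuid, cpu="4", memory="16Gi", cluster_role="admin"):
    """Return the onboarding manifests in creation order (Namespace first)."""
    namespace = f"tenant-{user_uuid}"
    return [
        {"apiVersion": "v1", "kind": "Namespace",
         "metadata": {"name": namespace}},
        {"apiVersion": "v1", "kind": "ServiceAccount",
         "metadata": {"name": SA_NAME, "namespace": namespace}},
        {
            # A RoleBinding (not a ClusterRoleBinding) confines the
            # ClusterRole's permissions to this one namespace.
            "apiVersion": f"{RBAC_GROUP}/v1",
            "kind": "RoleBinding",
            # Hypothetical binding name, chosen for this sketch.
            "metadata": {"name": "tenant-admin-binding", "namespace": namespace},
            "roleRef": {"apiGroup": RBAC_GROUP, "kind": "ClusterRole",
                        "name": cluster_role},
            "subjects": [{"kind": "ServiceAccount", "name": SA_NAME,
                          "namespace": namespace}],
        },
        {
            "apiVersion": "v1",
            "kind": "ResourceQuota",
            # Hypothetical quota name, chosen for this sketch.
            "metadata": {"name": "tenant-quota", "namespace": namespace},
            "spec": {"hard": {"requests.cpu": cpu, "limits.memory": memory}},
        },
    ]
```

Returning the manifests in creation order keeps the sequence easy to retry: the Namespace has to exist before the three namespaced objects can be created inside it.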
Phase 2: Credential Distribution (Dynamic Token)
Triggered when the user requests CLI access or downloads a kubeconfig.
- Locate the user's associated Namespace and ServiceAccount, verifying the user's ownership of the workspace.
- Audit Logging: Record the credential issuance event (User, IP, Workspace) into the database.
- Call the `authentication.k8s.io/v1` `TokenRequest` API targeting `sa-tenant-admin` in the specific tenant's namespace.
- Set `expirationSeconds: 7200` (2 hours). Hard limit; cannot be extended.
- Retrieve the generated JWT token and inject it into a pre-defined `kubeconfig` text template.
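
At the REST level, the TokenRequest call is a `POST` to the ServiceAccount's `token` subresource. The sketch below only builds the request and parses the reply, so it runs without a cluster. The helper names `token_request` and `extract_token` are assumptions for illustration, and a real Gateway would normally go through the client SDK rather than raw paths.

```python
import json

TOKEN_TTL_SECONDS = 7200  # 2 hours, the hard limit stated in this spec


def token_request(namespace, service_account="sa-tenant-admin"):
    """Return (path, body) for a TokenRequest against the given ServiceAccount."""
    path = (f"/api/v1/namespaces/{namespace}"
            f"/serviceaccounts/{service_account}/token")
    body = {
        "apiVersion": "authentication.k8s.io/v1",
        "kind": "TokenRequest",
        "spec": {"expirationSeconds": TOKEN_TTL_SECONDS},
    }
    return path, body


def extract_token(response_json):
    """Pull the signed JWT out of the API server's TokenRequest response."""
    return json.loads(response_json)["status"]["token"]
```

The token is read from the response and passed straight on to the kubeconfig template; nothing in this path writes it to the database.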
Phase 3: Automated Renewal & Emergency Suspension
- Session Management: If accessed via a Web UI, the Gateway intercepts requests, attaches the dynamic token, and forwards them. If the token is within 10 minutes of expiration, the Gateway automatically issues a new TokenRequest.
- Emergency Suspension: If a workspace is marked compromised, the Gateway deletes its K8s `RoleBinding`. Because RBAC is evaluated on every request, this immediately strips access from all currently active tokens of that tenant.
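
The 10-minute renewal rule can be sketched as a check on the token's `exp` claim. This assumes the Gateway holds the current token in session memory and reads the claim without verifying the signature (the API server remains the verifier); `jwt_expiry` and `needs_renewal` are hypothetical helper names.

```python
import base64
import json
import time

RENEWAL_WINDOW_SECONDS = 10 * 60  # renew when 10 minutes or less remain


def jwt_expiry(token):
    """Read the `exp` claim (Unix seconds) from a JWT without verifying it."""
    payload_b64 = token.split(".")[1]
    # JWT segments are base64url without padding; restore it before decoding.
    payload_b64 += "=" * (-len(payload_b64) % 4)
    payload = json.loads(base64.urlsafe_b64decode(payload_b64))
    return int(payload["exp"])


def needs_renewal(token, now=None):
    """True when the token expires within the renewal window (or already has)."""
    now = time.time() if now is None else now
    return jwt_expiry(token) - now <= RENEWAL_WINDOW_SECONDS
```

When `needs_renewal` returns true, the Gateway would issue a fresh TokenRequest and swap the session's token. The old token is not extended; it simply runs out.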
4. API Contracts
4.1. Initialize Tenant Workspace
- Route: `POST /api/v1/workspaces/init`
- Auth: Gateway Session / Bearer Token
- Rate Limit: Strictly rate-limited per user to prevent Namespace exhaustion.
- Request Payload (`tier` determines the ResourceQuota template):

      { "tier": "basic" }

- Response Payload (201 Created):

      {
        "namespace": "tenant-a1b2c3d4",
        "status": "provisioned",
        "quota": { "cpu": "4", "memory": "8Gi" }
      }
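
The spec does not say how the per-user rate limit is enforced. One possible shape is an in-memory sliding window, sketched below. The class name, the default of 3 calls per hour, and the single-process storage are all assumptions; a multi-instance deployment would need a shared store instead.

```python
import time
from collections import defaultdict, deque


class PerUserRateLimiter:
    """Sliding-window limiter: at most `max_calls` per user per window."""

    def __init__(self, max_calls=3, window_seconds=3600.0, clock=time.monotonic):
        self._max_calls = max_calls
        self._window = window_seconds
        self._clock = clock  # injectable so the limiter can be tested
        self._calls = defaultdict(deque)  # user_id -> timestamps of allowed calls

    def allow(self, user_id):
        now = self._clock()
        calls = self._calls[user_id]
        # Drop timestamps that have aged out of the window.
        while calls and now - calls[0] >= self._window:
            calls.popleft()
        if len(calls) >= self._max_calls:
            return False
        calls.append(now)
        return True
```

A handler for `POST /api/v1/workspaces/init` would call `allow(user_id)` before touching the cluster and answer with HTTP 429 when it returns false.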
4.2. Generate Dynamic Kubeconfig
- Route: `GET /api/v1/workspaces/credentials/kubeconfig`
- Auth: Gateway Session / Bearer Token
- Response Payload (200 OK): Returns raw `application/x-yaml` content.

      apiVersion: v1
      clusters:
      - cluster:
          server: https://<k8s-api-server>
          certificate-authority-data: <ca-base64>
        name: internal-cluster
      contexts:
      - context:
          cluster: internal-cluster
          namespace: tenant-a1b2c3d4  # Default context locked to their namespace
          user: sa-tenant-admin
        name: tenant-context
      current-context: tenant-context
      kind: Config
      users:
      - name: sa-tenant-admin
        user:
          token: "eyJhbGciOiJSUzI1NiIs..."  # Short-lived token injected here
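
Injecting the token into a pre-defined text template can be as small as a `string.Template` substitution. The sketch below mirrors the kubeconfig layout shown above; the function name `render_kubeconfig` and its parameter names are assumptions for illustration.

```python
from string import Template

KUBECONFIG_TEMPLATE = Template("""\
apiVersion: v1
kind: Config
clusters:
- cluster:
    server: $server
    certificate-authority-data: $ca_data
  name: internal-cluster
contexts:
- context:
    cluster: internal-cluster
    namespace: $namespace
    user: sa-tenant-admin
  name: tenant-context
current-context: tenant-context
users:
- name: sa-tenant-admin
  user:
    token: "$token"
""")


def render_kubeconfig(server, ca_data, namespace, token):
    """Fill the template; substitute() raises KeyError if a field is missing."""
    return KUBECONFIG_TEMPLATE.substitute(
        server=server, ca_data=ca_data, namespace=namespace, token=token
    )
```

Using `substitute` rather than `safe_substitute` means a missing value fails loudly instead of shipping a kubeconfig with a literal placeholder in it.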
4.3. Suspend Workspace (Emergency Kill Switch)
- Route: POST /api/v1/workspaces/{id}/suspend
- Auth: Admin Only
- Behavior: Updates DB status to suspended and deletes the associated K8s RoleBinding.
5. Data Architecture & Persistence
- Database: PostgreSQL (Relational mapping between Users and K8s Namespaces).
- Table: `users`
  - `id` (UUID, PK), `email`, `password_hash`, `status`
- Table: `workspaces`
  - `id` (UUID, PK)
  - `user_id` (UUID, FK to Users table)
  - `k8s_namespace` (String, unique)
  - `k8s_sa_name` (String)
  - `tier` (String)
  - `created_at` (Timestamp)
- Table: `audit_logs` (Security Compliance)
  - `id` (UUID, PK), `user_id` (UUID), `workspace_id` (UUID), `action` (e.g., IssueKubeconfig), `ip_address`, `created_at`
- Constraint: We do NOT store the K8s Token in the database. Tokens are ephemeral and generated on-the-fly.
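
The three tables and the audit write can be exercised with an in-memory SQLite database standing in for PostgreSQL. This is a sketch under that assumption only: column types are simplified to `TEXT`, the timestamp defaults are added for convenience, and `open_db` and `record_issuance` are hypothetical helper names.

```python
import sqlite3
import uuid

# SQLite stand-in for the PostgreSQL schema; UUIDs are stored as text here.
SCHEMA = """
CREATE TABLE users (
    id TEXT PRIMARY KEY,
    email TEXT,
    password_hash TEXT,
    status TEXT
);
CREATE TABLE workspaces (
    id TEXT PRIMARY KEY,
    user_id TEXT REFERENCES users(id),
    k8s_namespace TEXT UNIQUE,
    k8s_sa_name TEXT,
    tier TEXT,
    created_at TEXT DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE audit_logs (
    id TEXT PRIMARY KEY,
    user_id TEXT,
    workspace_id TEXT,
    action TEXT,
    ip_address TEXT,
    created_at TEXT DEFAULT CURRENT_TIMESTAMP
);
"""


def open_db():
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def record_issuance(conn, user_id, workspace_id, ip_address):
    """Write the audit row for a kubeconfig issuance; the token is never stored."""
    log_id = str(uuid.uuid4())
    conn.execute(
        "INSERT INTO audit_logs (id, user_id, workspace_id, action, ip_address) "
        "VALUES (?, ?, ?, ?, ?)",
        (log_id, user_id, workspace_id, "IssueKubeconfig", ip_address),
    )
    conn.commit()
    return log_id
```

No table has a token column, which makes the no-storage constraint a property of the schema rather than a convention that callers have to remember.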
6. Security, Threat Mitigation & Infrastructure Constraints
6.1 Threat Model
| Threat | Mitigation Strategy |
|---|---|
| Gateway Compromise | The Gateway uses a strictly restricted K8s role. It cannot read existing Secrets or interfere with other tenants' running Pods. |
| Token Theft (XSS) | Application-level Auth must use HttpOnly, Secure Cookies. Generated Kubeconfigs expire in 2 hours. |
| Resource Abuse (Mining) | Hardcoded ResourceQuota per tenant upon creation. Global LimitRange enforced at the cluster level. |
6.2 Restricted Gateway Credentials (Crucial)
The Gateway requires a K8s credential (Out-of-Cluster kubeconfig or Cloud IAM Role) to operate. This credential MUST NOT have cluster-admin privileges. It must be bound to a custom ClusterRole with ONLY the following permissions:
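- `create`, `get`, `list` on `namespaces`, `resourcequotas`.
- `create`, `get`, `list` on `serviceaccounts`, `rolebindings`.
- `delete` on `rolebindings` (required by the Emergency Suspension workflow, which deletes the tenant's RoleBinding).
- `create` on `serviceaccounts/token` (CRITICAL for TokenRequest API).
- `bind` on the ClusterRole referenced by the tenant RoleBinding (Kubernetes RBAC rejects a binding to a role whose permissions the creator neither holds nor is allowed to `bind`).
- Strictly prohibited: `get` or `list` on `secrets`, `pods`, or `deployments`.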
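
A restricted ClusterRole along these lines can be written as a manifest and linted before it is applied. The sketch follows the permission list in this section, with two rules added as assumptions: `delete` on `rolebindings`, because the suspension workflow deletes RoleBindings, and `bind` on the `admin` ClusterRole, because Kubernetes RBAC only lets a subject create a binding to a role if it already holds that role's permissions or has the `bind` verb on it. The role name `tenant-gateway` and the helper names are hypothetical.

```python
RBAC_GROUP = "rbac.authorization.k8s.io"
PROHIBITED_RESOURCES = {"secrets", "pods", "deployments"}

GATEWAY_RULES = [
    {"apiGroups": [""],
     "resources": ["namespaces", "resourcequotas", "serviceaccounts"],
     "verbs": ["create", "get", "list"]},
    # `delete` is an assumption: needed to remove a RoleBinding on suspension.
    {"apiGroups": [RBAC_GROUP], "resources": ["rolebindings"],
     "verbs": ["create", "get", "list", "delete"]},
    # Subresource used by the TokenRequest API.
    {"apiGroups": [""], "resources": ["serviceaccounts/token"],
     "verbs": ["create"]},
    # `bind` is an assumption: lets the Gateway reference `admin` in a
    # RoleBinding without holding the admin permissions itself.
    {"apiGroups": [RBAC_GROUP], "resources": ["clusterroles"],
     "resourceNames": ["admin"], "verbs": ["bind"]},
]


def gateway_cluster_role(name="tenant-gateway"):
    return {"apiVersion": f"{RBAC_GROUP}/v1", "kind": "ClusterRole",
            "metadata": {"name": name}, "rules": GATEWAY_RULES}


def violations(rules):
    """List granted resources that are prohibited (any verb) or wildcarded.

    Stricter than the spec, which only forbids `get` and `list`.
    """
    bad = []
    for rule in rules:
        for resource in rule["resources"]:
            if resource == "*" or resource.split("/")[0] in PROHIBITED_RESOURCES:
                bad.append(resource)
    return bad
```

Running `violations` in CI against the role that is actually deployed gives early warning if someone widens the Gateway's permissions later.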
6.3 Deployment & Networking
- Deployment Agnostic: The application will be packaged as a Docker image and can be deployed via Docker Compose, standalone VMs, or within a Kubernetes cluster.
- CORS/CSP: Since this might not be a single-origin BFF, explicit CORS policies (`Access-Control-Allow-Origin`) must be tightly defined if the frontend is hosted on a separate domain. Wildcards (`*`) are prohibited.
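
A wildcard-free CORS policy can be sketched as an exact-match allowlist that echoes back only known origins. The origin `https://console.example.com` and the function name `cors_headers` are placeholders for illustration. The credentials header is included on the assumption that the frontend authenticates with cookies, as the threat model suggests.

```python
# Placeholder allowlist; real deployments would load this from configuration.
ALLOWED_ORIGINS = frozenset({"https://console.example.com"})


def cors_headers(origin):
    """Return CORS response headers for an allowed origin, else no headers."""
    if origin in ALLOWED_ORIGINS:
        return {
            # Echo the exact origin; "*" is never emitted.
            "Access-Control-Allow-Origin": origin,
            "Access-Control-Allow-Credentials": "true",
            # Tell caches that the response depends on the Origin header.
            "Vary": "Origin",
        }
    return {}
```

Matching on the full origin string avoids the usual pitfalls of suffix or regex checks, where a lookalike such as `https://console.example.com.evil.test` slips through.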