# vllm-server

OpenAI-compatible model serving with vLLM.

The base is CPU-safe YAML. Add `components/gpu-nvidia` in environments that
provide NVIDIA GPUs, and let the instance overlay patch the model name,
resources, and cache size.
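
As a rough illustration of that layering, an instance overlay might look like the sketch below. It assumes the layout is consumed with Kustomize and that the base defines a Deployment. The relative paths, the Deployment name `vllm-server`, the container name `vllm`, the volume name `model-cache`, the model ID, and all sizes are hypothetical placeholders, so adjust them to match the actual base.

```yaml
# kustomization.yaml for a hypothetical instance overlay.
# Paths, object names, and values here are assumptions, not taken from the base.
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization

resources:
  - ../../base                     # assumed location of the CPU-safe base

components:
  - ../../components/gpu-nvidia    # include only where NVIDIA GPUs are available

patches:
  - patch: |-
      apiVersion: apps/v1
      kind: Deployment
      metadata:
        name: vllm-server          # assumed workload name
      spec:
        template:
          spec:
            containers:
              - name: vllm         # assumed container name
                # Model name: assumes the model is passed via container args
                args: ["--model", "example-org/example-model"]
                # Resources: merged with whatever the base and component set
                resources:
                  requests:
                    cpu: "4"
                    memory: 16Gi
            volumes:
              - name: model-cache  # assumed cache volume backed by emptyDir
                emptyDir:
                  sizeLimit: 50Gi  # cache size
```

A strategic merge patch is used here rather than a JSON 6902 `replace` so that the resource settings merge with any GPU limits the component adds instead of overwriting them. In a CPU-only environment, the same overlay would simply omit the `components` entry.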