VoxCPM/README_BW.md
2026-02-09 18:22:06 +08:00

VoxCPM-TTS

https://github.com/BoardWare-Genius/VoxCPM

📦 VoxCPM-TTS Version History

| Version | Date | Summary |
|---------|------|---------|
| 0.0.3 | 2026-02-09 | Optimized configuration & model support |
| 0.0.2 | 2026-01-21 | Supports streaming |
| 0.0.1 | 2026-01-20 | Initial version |

🔄 Version Details

🆕 0.0.3 2026-02-09

  • Configuration & Deployment
    • Supports configuring model path via VOXCPM_MODEL_ID
    • Supports configuring CPU workers via VOXCPM_CPU_WORKERS
    • Supports configuring Uvicorn workers via VOXCPM_UVICORN_WORKERS

🆕 0.0.2 2026-01-21

  • Core Features
    • Updated model weights and model parameters to VoxCPM1.5
    • Supports streaming

🆕 0.0.1 2026-01-20

  • Core Features
    • Initial release of VoxCPM-TTS

Start

docker pull harbor.bwgdi.com/library/voxcpmtts:0.0.3

# Run with custom configuration
# -e VOXCPM_MODEL_ID: Path to the model directory inside the container
# -e VOXCPM_CPU_WORKERS: Number of threads for CPU-bound tasks
# -e VOXCPM_UVICORN_WORKERS: Number of uvicorn workers
# -e MAX_GPU_CONCURRENT: Max concurrent GPU tasks
docker run -d --restart always -p 5001:5000 --gpus '"device=0"' \
  -e VOXCPM_MODEL_ID="/models/VoxCPM1.5/" \
  -e VOXCPM_CPU_WORKERS="2" \
  -e VOXCPM_UVICORN_WORKERS="1" \
  -e MAX_GPU_CONCURRENT="1" \
  --mount type=bind,source=/Workspace/NAS11/model/Voice/VoxCPM,target=/models \
  harbor.bwgdi.com/library/voxcpmtts:0.0.3
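
The same settings can also be kept in an env file instead of repeated `-e` flags. The following is a minimal sketch, not the project's documented procedure: the file name `voxcpm.env` and the function name `start_voxcpm` are illustrative assumptions, while the variables, image, port mapping, and mount are copied from the command above.

```shell
#!/usr/bin/env bash
# Sketch: store the container settings in an env file.
# voxcpm.env and start_voxcpm are illustrative names, not part of VoxCPM-TTS.

# docker --env-file expects plain VAR=value lines.
# Quotes are not stripped, so the values are written without them.
cat > voxcpm.env <<'EOF'
VOXCPM_MODEL_ID=/models/VoxCPM1.5/
VOXCPM_CPU_WORKERS=2
VOXCPM_UVICORN_WORKERS=1
MAX_GPU_CONCURRENT=1
EOF

# Launch the container with the settings from voxcpm.env.
start_voxcpm() {
  docker run -d --restart always -p 5001:5000 --gpus '"device=0"' \
    --env-file voxcpm.env \
    --mount type=bind,source=/Workspace/NAS11/model/Voice/VoxCPM,target=/models \
    harbor.bwgdi.com/library/voxcpmtts:0.0.3
}
```

Calling `start_voxcpm` from the directory that contains `voxcpm.env` starts the container. Keeping the values in one file makes it easier to review or version the configuration separately from the launch command.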

Usage

Non-streaming

# text is "Hello, this is a test text"; prompt_text is "This is the prompt text"
curl --location 'http://localhost:5001/generate_tts' \
--form 'text="你好,这是一段测试文本"' \
--form 'prompt_text="这是提示文本"' \
--form 'cfg_value="2.0"' \
--form 'inference_timesteps="10"' \
--form 'do_normalize="true"' \
--form 'denoise="true"' \
--form 'retry_badcase="true"' \
--form 'retry_badcase_max_times="3"' \
--form 'retry_badcase_ratio_threshold="6.0"' \
--form 'prompt_wav=@"/assets/2food16k_2.wav"'
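
The request above prints the response body to the terminal. A small wrapper that saves it to a file might look like the sketch below. It assumes the endpoint returns raw audio bytes, which this README does not state, and `tts_to_file` is a hypothetical helper name. The form fields and their values are the ones used in the request above.

```shell
#!/usr/bin/env bash
# Sketch: wrap the non-streaming request and save the response body to a file.
# Assumption: the endpoint answers with raw audio bytes (format not documented here).
# tts_to_file is a hypothetical helper, not part of VoxCPM-TTS.

tts_to_file() {
  if [ "$#" -ne 4 ]; then
    echo "usage: tts_to_file TEXT PROMPT_TEXT PROMPT_WAV OUTPUT_FILE" >&2
    return 2
  fi
  local text="$1" prompt_text="$2" prompt_wav="$3" out="$4"

  # --fail makes curl exit non-zero on HTTP errors (status 400 or above)
  # instead of saving the error page as if it were audio.
  # --form-string sends the value literally, even if it starts with @ or <.
  curl --fail --silent --show-error --location \
    'http://localhost:5001/generate_tts' \
    --form-string "text=${text}" \
    --form-string "prompt_text=${prompt_text}" \
    --form 'cfg_value=2.0' \
    --form 'inference_timesteps=10' \
    --form 'do_normalize=true' \
    --form 'denoise=true' \
    --form 'retry_badcase=true' \
    --form 'retry_badcase_max_times=3' \
    --form 'retry_badcase_ratio_threshold=6.0' \
    --form "prompt_wav=@${prompt_wav}" \
    --output "${out}"
}
```

For example, `tts_to_file "你好,这是一段测试文本" "这是提示文本" /assets/2food16k_2.wav out.wav` would write the response to `out.wav`. The `.wav` extension is a guess, so check the response `Content-Type` before relying on it.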

Streaming

# text is "Hello, this is a test text"; prompt_text is "This is the prompt text"
curl --location 'http://localhost:5001/generate_tts_streaming' \
--form 'text="你好,这是一段测试文本"' \
--form 'prompt_text="这是提示文本"' \
--form 'cfg_value="2.0"' \
--form 'inference_timesteps="10"' \
--form 'do_normalize="true"' \
--form 'denoise="true"' \
--form 'retry_badcase="true"' \
--form 'retry_badcase_max_times="3"' \
--form 'retry_badcase_ratio_threshold="6.0"' \
--form 'prompt_wav=@"/Workspace/NAS11/model/Voice/assets/2food16k_2.wav"'
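
To see the benefit of streaming, it can help to write chunks as they arrive and report how long the first byte took. The sketch below does that with standard curl options. It assumes the streaming endpoint sends the audio as a chunked response body, which this README does not confirm, and `tts_stream_to_file` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch: call the streaming endpoint, write data as it arrives, and print timings.
# Assumption: the endpoint streams audio bytes in the response body.
# tts_stream_to_file is a hypothetical helper, not part of VoxCPM-TTS.

tts_stream_to_file() {
  if [ "$#" -ne 4 ]; then
    echo "usage: tts_stream_to_file TEXT PROMPT_TEXT PROMPT_WAV OUTPUT_FILE" >&2
    return 2
  fi
  local text="$1" prompt_text="$2" prompt_wav="$3" out="$4"

  # --no-buffer disables curl's output buffering so chunks are written promptly.
  # --write-out prints the time to first byte and the total time after the transfer.
  curl --fail --silent --show-error --location --no-buffer \
    'http://localhost:5001/generate_tts_streaming' \
    --form-string "text=${text}" \
    --form-string "prompt_text=${prompt_text}" \
    --form 'cfg_value=2.0' \
    --form 'inference_timesteps=10' \
    --form 'do_normalize=true' \
    --form 'denoise=true' \
    --form 'retry_badcase=true' \
    --form 'retry_badcase_max_times=3' \
    --form 'retry_badcase_ratio_threshold=6.0' \
    --form "prompt_wav=@${prompt_wav}" \
    --output "${out}" \
    --write-out 'first byte: %{time_starttransfer}s, total: %{time_total}s\n'
}
```

If streaming works as expected, the first-byte time should be noticeably shorter than the total time for longer input text. Running the same measurement against `/generate_tts` gives a baseline for comparison.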