# VoxCPM-TTS
https://github.com/BoardWare-Genius/VoxCPM
## 📦 VoxCPM-TTS Version History
| Version | Date | Summary |
|---------|------------|---------------------------------|
| 0.0.3 | 2026-02-09 | Optimized configuration and model support |
| 0.0.2 | 2026-01-21 | Supports streaming |
| 0.0.1 | 2026-01-20 | Initial version |
### 🔄 Version Details
#### 🆕 0.0.3 *2026-02-09*
- **Configuration & Deployment**
  - Supports configuring the model path via `VOXCPM_MODEL_ID`
  - Supports configuring CPU workers via `VOXCPM_CPU_WORKERS`
  - Supports configuring Uvicorn workers via `VOXCPM_UVICORN_WORKERS`
#### 🆕 0.0.2 *2026-01-21*
- **Core Features**
  - Updated the model weights and model parameters to VoxCPM1.5
  - Added streaming support
#### 🆕 0.0.1 *2026-01-20*
- **Core Features**
  - Initial VoxCPM-TTS release
---
# Start
```bash
docker pull harbor.bwgdi.com/library/voxcpmtts:0.0.3
# Run with custom configuration
# -e VOXCPM_MODEL_ID: Path to the model directory inside container
# -e VOXCPM_CPU_WORKERS: Number of threads for CPU-bound tasks
# -e VOXCPM_UVICORN_WORKERS: Number of uvicorn workers
# -e MAX_GPU_CONCURRENT: Max concurrent GPU tasks
docker run -d --restart always -p 5001:5000 --gpus '"device=0"' \
-e VOXCPM_MODEL_ID="/models/VoxCPM1.5/" \
-e VOXCPM_CPU_WORKERS="2" \
-e VOXCPM_UVICORN_WORKERS="1" \
-e MAX_GPU_CONCURRENT="1" \
--mount type=bind,source=/Workspace/NAS11/model/Voice/VoxCPM,target=/models \
harbor.bwgdi.com/library/voxcpmtts:0.0.3
```
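The container is configured entirely through the environment variables passed with `-e`. As a rough illustration of how a Python service could read and validate them at startup, here is a minimal sketch. The `VoxCPMSettings` class, the `load_settings` helper, and the default values (copied from the `docker run` example) are assumptions for illustration, not the actual VoxCPM-TTS implementation.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class VoxCPMSettings:
    """Hypothetical settings container; fields mirror the env vars above."""
    model_id: str
    cpu_workers: int
    uvicorn_workers: int
    max_gpu_concurrent: int


def _positive_int(env: Mapping[str, str], name: str, default: int) -> int:
    """Read `name` from env as an integer >= 1, falling back to `default`."""
    raw = env.get(name, str(default))
    try:
        value = int(raw)
    except ValueError:
        raise ValueError(f"{name} must be an integer, got {raw!r}") from None
    if value < 1:
        raise ValueError(f"{name} must be >= 1, got {value}")
    return value


def load_settings(env: Optional[Mapping[str, str]] = None) -> VoxCPMSettings:
    """Build settings from a mapping (defaults to os.environ).

    The fallback values are assumptions taken from the docker run example.
    """
    env = os.environ if env is None else env
    return VoxCPMSettings(
        model_id=env.get("VOXCPM_MODEL_ID", "/models/VoxCPM1.5/"),
        cpu_workers=_positive_int(env, "VOXCPM_CPU_WORKERS", 2),
        uvicorn_workers=_positive_int(env, "VOXCPM_UVICORN_WORKERS", 1),
        max_gpu_concurrent=_positive_int(env, "MAX_GPU_CONCURRENT", 1),
    )
```

Failing fast on a non-numeric or zero worker count surfaces a misconfigured `-e` flag when the container starts rather than on the first request.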
# Usage
## Non-streaming
```bash
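# The sample `text` reads "Hello, this is a test passage"; `prompt_text` reads "This is the prompt text".
curl --location 'http://localhost:5001/generate_tts' \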
--form 'text="你好,这是一段测试文本"' \
--form 'prompt_text="这是提示文本"' \
--form 'cfg_value="2.0"' \
--form 'inference_timesteps="10"' \
--form 'do_normalize="true"' \
--form 'denoise="true"' \
--form 'retry_badcase="true"' \
--form 'retry_badcase_max_times="3"' \
--form 'retry_badcase_ratio_threshold="6.0"' \
--form 'prompt_wav=@"/assets/2food16k_2.wav"'
```
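The same request can be sent from Python. The sketch below mirrors the form fields of the curl call and uses the third-party `requests` library. The helper names `build_form_fields` and `generate_tts` are hypothetical, and the code assumes the response body is the audio itself, which this README does not confirm. Adjust the handling if the endpoint returns JSON or another wrapper.

```python
from pathlib import Path
from typing import Dict

BASE_URL = "http://localhost:5001"  # host port from the docker run example


def build_form_fields(text: str, prompt_text: str) -> Dict[str, str]:
    """Form fields mirroring the curl example; every value is sent as a string."""
    return {
        "text": text,
        "prompt_text": prompt_text,
        "cfg_value": "2.0",
        "inference_timesteps": "10",
        "do_normalize": "true",
        "denoise": "true",
        "retry_badcase": "true",
        "retry_badcase_max_times": "3",
        "retry_badcase_ratio_threshold": "6.0",
    }


def generate_tts(text: str, prompt_text: str, prompt_wav: Path, out_path: Path) -> Path:
    """POST a multipart form to /generate_tts and save the raw response body."""
    import requests  # third-party: pip install requests

    with prompt_wav.open("rb") as wav:
        response = requests.post(
            f"{BASE_URL}/generate_tts",
            data=build_form_fields(text, prompt_text),
            files={"prompt_wav": (prompt_wav.name, wav, "audio/wav")},
            timeout=300,
        )
    response.raise_for_status()
    # Assumption: the body is the generated audio; the README does not say.
    out_path.write_bytes(response.content)
    return out_path
```

Passing both `data` and `files` makes `requests` encode the body as `multipart/form-data`, which is what curl's `--form` flags produce.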
## Streaming
```bash
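# The sample `text` reads "Hello, this is a test passage"; `prompt_text` reads "This is the prompt text".
curl --location 'http://localhost:5001/generate_tts_streaming' \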
--form 'text="你好,这是一段测试文本"' \
--form 'prompt_text="这是提示文本"' \
--form 'cfg_value="2.0"' \
--form 'inference_timesteps="10"' \
--form 'do_normalize="true"' \
--form 'denoise="true"' \
--form 'retry_badcase="true"' \
--form 'retry_badcase_max_times="3"' \
--form 'retry_badcase_ratio_threshold="6.0"' \
--form 'prompt_wav=@"/Workspace/NAS11/model/Voice/assets/2food16k_2.wav"'
```
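For the streaming endpoint, a client usually wants to consume audio chunks as they arrive instead of waiting for the full response. The sketch below shows one way to do that with the third-party `requests` library. The helper names `write_chunks` and `stream_tts` are hypothetical, and the byte format of the stream (raw PCM, WAV, or something else) is not stated in this README, so the code only writes the bytes to a file unchanged.

```python
from pathlib import Path
from typing import Iterable, Mapping


def write_chunks(chunks: Iterable[bytes], out_path: Path) -> int:
    """Write chunks to out_path as they arrive; return total bytes written."""
    total = 0
    with out_path.open("wb") as out:
        for chunk in chunks:
            if chunk:  # skip empty chunks
                out.write(chunk)
                total += len(chunk)
    return total


def stream_tts(
    fields: Mapping[str, str],
    prompt_wav: Path,
    out_path: Path,
    base_url: str = "http://localhost:5001",
) -> int:
    """POST the form to /generate_tts_streaming and save the streamed body."""
    import requests  # third-party: pip install requests

    with prompt_wav.open("rb") as wav, requests.post(
        f"{base_url}/generate_tts_streaming",
        data=dict(fields),
        files={"prompt_wav": (prompt_wav.name, wav, "audio/wav")},
        stream=True,  # do not buffer the whole body in memory
        timeout=300,
    ) as response:
        response.raise_for_status()
        # Assumption: the body is a plain byte stream of audio data.
        return write_chunks(response.iter_content(chunk_size=4096), out_path)
```

`fields` takes the same form values as the curl call above, for example `{"text": "...", "prompt_text": "...", "cfg_value": "2.0"}`. To play audio while it is generated, replace `write_chunks` with a function that feeds each chunk to an audio sink.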