# FunASR Dual-Mode API

This is an automatic speech recognition (ASR) service built on FastAPI. It integrates two FunASR inference modes to provide flexible speech transcription.

## Features

The service provides two main inference endpoints:

1. **AutoModel Mode (`/inference/funasr`)**:
   * Uses the high-level `funasr.AutoModel` interface.
   * Integrates VAD (Voice Activity Detection).
   * Supports hotword enhancement.
   * Supports ITN (Inverse Text Normalization).
   * Supports multi-language configuration.

2. **Direct Model Mode (`/inference/direct`)**:
   * Calls the underlying `FunASRNano` model directly.
   * Supports standard full-audio inference.
   * Supports simulated streaming/chunk inference (Chunk Mode) for testing the model's incremental decoding capabilities.

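This README does not show how Chunk Mode splits the audio, so the following is only a rough illustration of the general idea behind simulated streaming: cutting an already-loaded sample sequence into fixed-size pieces that could be fed to a decoder one at a time. The function name `iter_chunks` and the chunk size are hypothetical and are not taken from the service's code.

```python
from typing import Iterator, Sequence


def iter_chunks(samples: Sequence[float], chunk_size: int) -> Iterator[Sequence[float]]:
    """Yield consecutive fixed-size slices; the last one may be shorter."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for start in range(0, len(samples), chunk_size):
        yield samples[start:start + chunk_size]


# Hypothetical input: 10 samples consumed 4 at a time.
audio = [0.0] * 10
chunk_lengths = [len(chunk) for chunk in iter_chunks(audio, 4)]
print(chunk_lengths)  # [4, 4, 2]
```
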
## Environment Setup

### Dependency Installation

This project uses `uv` for dependency management. Make sure `uv` is installed, then run the following command in the project root directory:

```bash
uv sync
```

### Model Configuration

The default model path is `/models/Fun-ASR-Nano-2512`. If your model is located elsewhere, set the `MODEL_DIR` environment variable:

```bash
export MODEL_DIR="/your/absolute/path/to/model"
```

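For readers who want to mirror this lookup in their own scripts, here is a minimal sketch of resolving the model directory with the documented default as a fallback. How `api.py` actually reads `MODEL_DIR` is not shown in this README; the helper name `resolve_model_dir` and the choice to treat an empty value as unset are assumptions.

```python
import os

# Default path as documented above.
DEFAULT_MODEL_DIR = "/models/Fun-ASR-Nano-2512"


def resolve_model_dir(env=None):
    """Return MODEL_DIR if it is set and non-empty, else the documented default."""
    env = os.environ if env is None else env
    # Assumption: an empty MODEL_DIR is treated the same as an unset one.
    return env.get("MODEL_DIR") or DEFAULT_MODEL_DIR


print(resolve_model_dir({}))
print(resolve_model_dir({"MODEL_DIR": "/data/my-model"}))
```
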
## Start Service

Start the service with `uv run` (default port 5000):

```bash
uv run api.py
```

On startup, the service automatically selects the compute device in the order CUDA > MPS > CPU.

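The priority order above can be sketched as follows. This is an illustration under the assumption that the service relies on PyTorch for device detection; the function names `pick_device` and `detect_device` are hypothetical and the actual logic in `api.py` may differ.

```python
def pick_device(cuda_available: bool, mps_available: bool) -> str:
    """Apply the documented priority: CUDA first, then MPS, then CPU."""
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"


def detect_device() -> str:
    """Probe PyTorch if it is installed; fall back to CPU otherwise."""
    try:
        import torch  # assumption: PyTorch is the backend in use
    except ImportError:
        return "cpu"
    # Older PyTorch builds may not ship the MPS backend module.
    mps = getattr(torch.backends, "mps", None)
    return pick_device(torch.cuda.is_available(), bool(mps and mps.is_available()))


print(pick_device(cuda_available=False, mps_available=True))  # mps
```

Keeping the priority rule in a pure function separates it from the hardware probe, so the ordering can be checked on any machine.
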
### Docker Startup

To deploy with Docker, use a command like the following. Pass `-e MODEL_DIR` to specify a custom model path:

```bash
docker run -d --restart always -p 5000:5000 --gpus "device=1" \
  -e MODEL_DIR="/models/Fun-ASR-Nano-2512" \
  --mount type=bind,source=/your/path/model/Fun-ASR-Nano-2512,target=/models/Fun-ASR-Nano-2512 \
  harbor.bwgdi.com/library/fun-asr:0.0.1
```

## API Documentation

### 1. FunASR Standard Inference Interface

* **URL**: `/inference/funasr`
* **Method**: `POST`
* **Content-Type**: `multipart/form-data`

| Parameter Name | Type | Required | Default | Description |
| :--- | :--- | :--- | :--- | :--- |
| `file` | File | Yes | - | Audio file |
| `language` | String | No | "中文" | Target language |
| `itn` | String | No | "true" | Whether to enable Inverse Text Normalization (true/false) |
| `hotwords` | String | No | "" | Hotwords that improve recognition of specific vocabulary |

**Example**:

```bash
curl -X POST "http://127.0.0.1:5000/inference/funasr" \
  -F "file=@/path/to/audio.wav" \
  -F "hotwords=开放时间"
```

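The same request can be made from Python. The sketch below assumes the third-party `requests` package is installed and that the endpoint returns JSON; the helper names `build_funasr_form` and `transcribe` are hypothetical. The README does not specify how multiple hotwords are delimited, so the example passes a single hotword as in the `curl` call.

```python
def build_funasr_form(language="中文", itn=True, hotwords=""):
    """Build the non-file form fields for POST /inference/funasr."""
    return {
        "language": language,
        "itn": "true" if itn else "false",  # the API takes ITN as a string
        "hotwords": hotwords,
    }


def transcribe(path, base_url="http://127.0.0.1:5000", **fields):
    """Upload an audio file and return the decoded JSON response."""
    import requests  # third-party; imported lazily so the helper above has no dependencies

    with open(path, "rb") as audio:
        response = requests.post(
            f"{base_url}/inference/funasr",
            files={"file": audio},  # sent as multipart/form-data
            data=build_funasr_form(**fields),
            timeout=300,  # assumption: long audio may take a while to process
        )
    response.raise_for_status()
    return response.json()


print(build_funasr_form(hotwords="开放时间"))
```

With the service running, `transcribe("/path/to/audio.wav", hotwords="开放时间")` would send the equivalent of the `curl` command above.
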
### 2. Direct Underlying Inference Interface

* **URL**: `/inference/direct`
* **Method**: `POST`
* **Content-Type**: `multipart/form-data`

| Parameter Name | Type | Required | Default | Description |
| :--- | :--- | :--- | :--- | :--- |
| `file` | File | Yes | - | Audio file |
| `chunk_mode` | Boolean | No | False | Whether to enable chunk simulation mode (true/false) |

**Example**:

```bash
# Enable chunk simulation mode
curl -X POST "http://127.0.0.1:5000/inference/direct" \
  -F "file=@/path/to/audio.wav" \
  -F "chunk_mode=true"
```

**Response**:

```json
{
  "status": "success",
  "mode": "direct",
  "text": {
    "key": "rand_key_WgNZq6ITZM5jt",
    "text": "你好。",
    "text_tn": "你好",
    "label": "null",
    "ctc_text": "你好",
    "ctc_timestamps": [
      {
        "token": "你",
        "start_time": 1.8,
        "end_time": 1.86,
        "score": 0.908
      },
      {
        "token": "好",
        "start_time": 2.16,
        "end_time": 2.22,
        "score": 0.988
      }
    ],
    "timestamps": [
      {
        "token": "你",
        "start_time": 1.8,
        "end_time": 1.86,
        "score": 0.908
      },
      {
        "token": "好",
        "start_time": 2.16,
        "end_time": 2.22,
        "score": 0.988
      },
      {
        "token": "。",
        "start_time": 2.88,
        "end_time": 2.94,
        "score": 0.0
      }
    ]
  }
}
```

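As a minimal sketch of consuming this response, the snippet below pulls out the transcript and the per-token timings. It assumes the field layout matches the sample above (the result object nested under the top-level `text` key); the function name `summarize` is hypothetical, and the sample is trimmed to the fields the function reads.

```python
# Trimmed copy of the sample response shown above.
sample = {
    "status": "success",
    "mode": "direct",
    "text": {
        "text": "你好。",
        "timestamps": [
            {"token": "你", "start_time": 1.8, "end_time": 1.86, "score": 0.908},
            {"token": "好", "start_time": 2.16, "end_time": 2.22, "score": 0.988},
            {"token": "。", "start_time": 2.88, "end_time": 2.94, "score": 0.0},
        ],
    },
}


def summarize(response):
    """Return the transcript and a list of (token, start, end) tuples."""
    if response.get("status") != "success":
        raise ValueError(f"inference failed: {response.get('status')!r}")
    result = response["text"]  # the result object sits under the top-level "text" key
    tokens = [(t["token"], t["start_time"], t["end_time"]) for t in result["timestamps"]]
    return result["text"], tokens


transcript, tokens = summarize(sample)
print(transcript)
for token, start, end in tokens:
    print(f"{token}\t{start:.2f}s -> {end:.2f}s")
```
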