Add the API documentation for streaming TTS (Text-to-Speech) (#6382)

chenxu9741
2024-07-17 19:44:16 +08:00
committed by GitHub
parent f3f052ba36
commit a6dbd26f75
8 changed files with 137 additions and 35 deletions

@@ -120,6 +120,16 @@ import { Row, Col, Properties, Property, Heading, SubProperty } from '../md.tsx'
- `metadata` (object) Metadata
- `usage` (Usage) Model usage information
- `retriever_resources` (array[RetrieverResource]) List of citation and attribution segments
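- `event: tts_message` TTS audio stream event, i.e. speech synthesis output. The content is an MP3 audio chunk encoded as a base64 string; decode it before playback. (Sent only when auto-play is enabled.)
- `task_id` (string) Task ID, used for request tracking and for the stop-response API below
- `message_id` (string) Unique message ID
- `audio` (string) The synthesized audio chunk as base64-encoded text; base64-decode it and feed it to the player
- `created_at` (int) Creation timestamp, e.g. 1705395332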
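- `event: tts_message_end` TTS audio stream end event; receiving it means the audio stream has finished.
- `task_id` (string) Task ID, used for request tracking and for the stop-response API below
- `message_id` (string) Unique message ID
- `audio` (string) The end event carries no audio, so this is an empty string
- `created_at` (int) Creation timestamp, e.g. 1705395332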
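- `event: message_replace` Message content replacement event.
When content moderation is enabled and output is moderated, a hit on a moderation condition causes this event to replace the message content with the preset reply.
- `task_id` (string) Task ID, used for request tracking and for the stop-response API below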
@@ -287,6 +297,8 @@ import { Row, Col, Properties, Property, Heading, SubProperty } from '../md.tsx'
data: {"event": "message", "message_id": "5ad4cb98-f0c7-4085-b384-88c403be6290", "conversation_id": "45701982-8118-4bc5-8e9b-64562b4555f2", "answer": " meet", "created_at": 1679586595}
data: {"event": "message", "message_id": "5ad4cb98-f0c7-4085-b384-88c403be6290", "conversation_id": "45701982-8118-4bc5-8e9b-64562b4555f2", "answer": " you", "created_at": 1679586595}
data: {"event": "message_end", "id": "5e52ce04-874b-4d27-9045-b3bc80def685", "conversation_id": "45701982-8118-4bc5-8e9b-64562b4555f2", "metadata": {"usage": {"prompt_tokens": 1033, "prompt_unit_price": "0.001", "prompt_price_unit": "0.001", "prompt_price": "0.0010330", "completion_tokens": 135, "completion_unit_price": "0.002", "completion_price_unit": "0.001", "completion_price": "0.0002700", "total_tokens": 1168, "total_price": "0.0013030", "currency": "USD", "latency": 1.381760165997548, "retriever_resources": [{"position": 1, "dataset_id": "101b4c97-fc2e-463c-90b1-5261a4cdcafb", "dataset_name": "iPhone", "document_id": "8dd1ad74-0b5f-4175-b735-7d98bbbb4e00", "document_name": "iPhone List", "segment_id": "ed599c7f-2766-4294-9d1d-e5235a61270a", "score": 0.98457545, "content": "\"Model\",\"Release Date\",\"Display Size\",\"Resolution\",\"Processor\",\"RAM\",\"Storage\",\"Camera\",\"Battery\",\"Operating System\"\n\"iPhone 13 Pro Max\",\"September 24, 2021\",\"6.7 inch\",\"1284 x 2778\",\"Hexa-core (2x3.23 GHz Avalanche + 4x1.82 GHz Blizzard)\",\"6 GB\",\"128, 256, 512 GB, 1TB\",\"12 MP\",\"4352 mAh\",\"iOS 15\""}]}}}
data: {"event": "tts_message", "conversation_id": "23dd85f3-1a41-4ea0-b7a9-062734ccfaf9", "message_id": "a8bdc41c-13b2-4c18-bfd9-054b9803038c", "created_at": 1721205487, "task_id": "3bf8a0bb-e73b-4690-9e66-4e429bad8ee7", "audio": "qqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqq"}
data: {"event": "tts_message_end", "conversation_id": "23dd85f3-1a41-4ea0-b7a9-062734ccfaf9", "message_id": "a8bdc41c-13b2-4c18-bfd9-054b9803038c", "created_at": 1721205487, "task_id": "3bf8a0bb-e73b-4690-9e66-4e429bad8ee7", "audio": ""}
```
</CodeGroup>
</Col>
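As a rough illustration of how a client might consume the `tts_message` and `tts_message_end` events documented in this change, the Python sketch below collects the base64-encoded audio chunks from a sequence of SSE `data:` lines. The function name `collect_tts_audio` and the sample payloads are hypothetical; only the `event` and `audio` field names come from the documentation, and reading the lines from a real HTTP response is left out.

```python
import base64
import json
from typing import Iterable


def collect_tts_audio(sse_lines: Iterable[str]) -> bytes:
    """Concatenate the decoded audio chunks carried by `tts_message` events.

    Hypothetical helper: it assumes each SSE line looks like `data: {...json...}`.
    """
    audio = bytearray()
    for raw in sse_lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank lines and non-data SSE fields
        try:
            event = json.loads(line[len("data:"):])
        except json.JSONDecodeError:
            continue  # ignore malformed payloads instead of aborting
        if not isinstance(event, dict):
            continue
        if event.get("event") == "tts_message":
            # `audio` is a base64 string; decoding yields raw MP3 bytes
            audio.extend(base64.b64decode(event.get("audio", "")))
        elif event.get("event") == "tts_message_end":
            break  # the end event has an empty `audio` field
    return bytes(audio)


# Made-up sample stream; "YWJj" and "ZGVm" are base64 for b"abc" and b"def".
sample_stream = [
    'data: {"event": "message", "answer": " you"}',
    'data: {"event": "tts_message", "audio": "YWJj"}',
    'data: {"event": "tts_message", "audio": "ZGVm"}',
    'data: {"event": "tts_message_end", "audio": ""}',
]
decoded = collect_tts_audio(sample_stream)
print(decoded)  # b'abcdef'
```

In a real client the returned bytes would be MP3 data that can be written to a file or handed to a player; a streaming player would more likely feed each decoded chunk to the player as it arrives rather than buffering the whole stream.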
@@ -898,29 +910,29 @@ import { Row, Col, Properties, Property, Heading, SubProperty } from '../md.tsx'
### Request Body
<Properties>
<Property name='message_id' type='str' key='text'>
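For text messages generated by Dify, pass the generated message ID directly; the backend looks up the corresponding content by message_id and synthesizes the speech from it. If both message_id and text are provided, message_id takes precedence.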
</Property>
<Property name='text' type='str' key='text'>
Content to synthesize into speech.
Content to synthesize into speech. This field is used when no message_id is provided.
</Property>
<Property name='user' type='string' key='user'>
User identifier, defined by the developer's own rules; it must be unique within the application.
</Property>
<Property name='streaming' type='bool' key='streaming'>
Whether to enable streaming output: true or false.
</Property>
</Properties>
</Col>
<Col sticky>
<CodeGroup title="Request" tag="POST" label="/text-to-audio" targetCode={`curl -o text-to-audio.mp3 -X POST '${props.appDetail.api_base_url}/text-to-audio' \\\n--header 'Authorization: Bearer {api_key}' \\\n--header 'Content-Type: application/json' \\\n--data-raw '{\n "text": "你好Dify",\n "user": "abc-123",\n "streaming": false\n}'`}>
<CodeGroup title="Request" tag="POST" label="/text-to-audio" targetCode={`curl -o text-to-audio.mp3 -X POST '${props.appDetail.api_base_url}/text-to-audio' \\\n--header 'Authorization: Bearer {api_key}' \\\n--header 'Content-Type: application/json' \\\n--data-raw '{\n "message_id": "5ad4cb98-f0c7-4085-b384-88c403be6290",\n "text": "你好Dify",\n "user": "abc-123"\n}'`}>
```bash {{ title: 'cURL' }}
curl -o text-to-audio.mp3 -X POST '${props.appDetail.api_base_url}/text-to-audio' \
--header 'Authorization: Bearer {api_key}' \
--header 'Content-Type: application/json' \
--data-raw '{
"message_id": "5ad4cb98-f0c7-4085-b384-88c403be6290",
"text": "你好Dify",
"user": "abc-123",
"streaming": false
"user": "abc-123"
}'
```
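For readers who prefer Python to cURL, the sketch below builds an equivalent `/text-to-audio` request with the standard library. It is a minimal sketch under assumptions: the helper name `build_text_to_audio_request`, the base URL, and the API key are placeholders, and the request is only constructed here, not sent.

```python
import json
import urllib.request
from typing import Optional


def build_text_to_audio_request(
    api_base_url: str,
    api_key: str,
    user: str,
    message_id: Optional[str] = None,
    text: Optional[str] = None,
) -> urllib.request.Request:
    """Build (but do not send) a POST request for the /text-to-audio endpoint.

    Hypothetical helper. Per the documentation, message_id takes precedence
    on the server side when both message_id and text are supplied.
    """
    if message_id is None and text is None:
        raise ValueError("provide message_id, text, or both")
    payload = {"user": user}
    if message_id is not None:
        payload["message_id"] = message_id
    if text is not None:
        payload["text"] = text
    return urllib.request.Request(
        url=f"{api_base_url}/text-to-audio",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder base URL and key, for illustration only.
request = build_text_to_audio_request(
    api_base_url="https://api.example.invalid/v1",
    api_key="YOUR_API_KEY",
    user="abc-123",
    text="你好Dify",
)
print(request.get_method(), request.full_url)
```

Passing the request to `urllib.request.urlopen` and writing the response body to a file would mirror the `curl -o text-to-audio.mp3` invocation, assuming the endpoint returns the audio in the response body as the cURL example implies.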