fix(service): enhance message filtering to drop empty chat messages while retaining tool requests

2026-06-16 16:18:24 +08:00
parent 0910affc78
commit b243018aff
14 changed files with 248 additions and 38 deletions


@@ -239,10 +239,10 @@ file (`episode-<YYYY-MM-DD>.md` etc.).
**`content`** — The message body.
- A bare **string** is shorthand for a single text content item.
-- An **array of `ContentItem`** is for mixed-modality input (text +
-image / pdf / audio / ...); non-text items are parsed by the
-multimodal LLM configured via `EVEROS_MULTIMODAL__*` env vars. See
-[ContentItem](#contentitem).
+- An **array of `ContentItem`** is for mixed input (`text` / `md` +
+image / pdf / audio / ...). `md` items are read as UTF-8 text;
+media/document items are parsed by the multimodal LLM configured via
+`EVEROS_MULTIMODAL__*` env vars. See [ContentItem](#contentitem).
**`tool_calls`** — When `role: "assistant"`, the tool calls the
assistant emitted in this turn (OpenAI Chat Completions shape).
@@ -252,34 +252,38 @@ message is the response to.
### ContentItem
-Mixed-modality message-body element. Carry the payload in exactly one
-of `text` / `uri` / `base64`; the others must be `null`. For
-`type: "text"` use `text`; for every **non-text** type use `uri`
-(`http(s)://`) or `base64` (with `ext`). Non-text items are routed
-through the multimodal parser, which needs a fetchable or decodable
-payload — a non-text item carrying only `text` returns `415`.
+Mixed message-body element. Carry the payload in exactly one of `text` /
+`uri` / `base64`; the others must be `null`. For `type: "text"` use
+`text`. For `type: "md"` use `text`, a server-local `file://` URI, or
+`base64` UTF-8 bytes. For every **non-text, non-md** type use `uri`
+(`http(s)://`) or `base64` (with `ext`). Non-text, non-md items are
+routed through the multimodal parser, which needs a fetchable or
+decodable payload — passing only `text` returns `415`.
| Field | Type | Required | Default | Notes |
|---|---|---|---|---|
-| `type` | `"text" \| "image" \| "audio" \| "doc" \| "pdf" \| "html" \| "email"` | yes | — | — |
-| `text` | `string \| null` | no | `null` | Required when `type: "text"` |
+| `type` | `"text" \| "md" \| "image" \| "audio" \| "doc" \| "pdf" \| "html" \| "email"` | yes | — | — |
+| `text` | `string \| null` | no | `null` | Required when `type: "text"`; optional inline Markdown when `type: "md"` |
| `uri` | `string \| null` | no | `null` | `http(s)://` (fetched server-side) or `file://` (read from the server's local fs, guardrailed) pointer |
| `base64` | `string \| null` | no | `null` | Inline payload, plain base64 (no `data:` prefix) |
| `ext` | `string \| null` | no | `null` | File-extension hint when `uri` lacks one |
| `name` | `string \| null` | no | `null` | Display filename, used in logs |
| `extras` | `object \| null` | no | `null` | Provider-specific metadata, opaque to EverOS |
-**`type`** — The content kind. Each non-text type is dispatched to the
-multimodal LLM. If the multimodal endpoint cannot handle the supplied
-payload, `/add` returns `415 Unsupported Media Type`.
+**`type`** — The content kind. `text` and `md` are treated as text.
+Each other type is dispatched to the multimodal LLM. If the multimodal
+endpoint cannot handle the supplied payload, `/add` returns
+`415 Unsupported Media Type`.
-**`text`** — The literal text payload; valid **only** for
-`type: "text"`. A non-text type (including `"html"`) is always routed
-to the parser and must carry `uri` or `base64`; passing only `text` on
-a non-text item returns `415`. To inline HTML as plain text, send it
-as `type: "text"`.
+**`text`** — The literal text payload; valid for `type: "text"` and
+inline `type: "md"`. A non-text, non-md type (including `"html"`) is
+always routed to the parser and must carry `uri` or `base64`; passing
+only `text` on those items returns `415`. To inline HTML as plain text,
+send it as `type: "text"`.
-**`uri`** — `http(s)://` or `file://` pointer to the asset. An
+**`uri`** — `http(s)://` or `file://` pointer to the asset. For
+`type: "md"`, only `file://` is supported and the file is decoded as
+UTF-8 text. For parser-backed content, an
`http(s)` uri is fetched by the server and dispatched by the response
Content-Type (use it for assets hosted elsewhere — S3 / OSS presigned
URL, http server). A `file://` uri is read from the **server's** local
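
The `ContentItem` rules introduced by this hunk (exactly one payload field, `md` accepted as text, `text`-only parser items rejected) can be illustrated with a small client-side pre-check. This is only a sketch under stated assumptions: `check_content_item` is a hypothetical helper, not part of EverOS, and treating `ext` as mandatory for `base64` on parser-backed types is one reading of "`base64` (with `ext`)". The server remains the authority on what it accepts.

```python
from typing import Optional

# Type names taken from the updated `type` enum in the diff.
TEXT_TYPES = {"text", "md"}
PARSER_TYPES = {"image", "audio", "doc", "pdf", "html", "email"}


def check_content_item(item: dict) -> Optional[str]:
    """Hypothetical pre-check: return None if the item looks valid, else a reason."""
    kind = item.get("type")
    if kind not in TEXT_TYPES | PARSER_TYPES:
        return f"unknown type: {kind!r}"

    # Exactly one of text / uri / base64 carries the payload; the others are null.
    carriers = [k for k in ("text", "uri", "base64") if item.get(k) is not None]
    if len(carriers) != 1:
        return "exactly one of text / uri / base64 must be set"
    carrier = carriers[0]
    uri = item.get("uri") or ""

    if kind == "text":
        return None if carrier == "text" else "type 'text' must use the text field"

    if kind == "md":
        # md accepts inline text, base64 UTF-8 bytes, or a server-local file:// URI.
        if carrier == "uri" and not uri.startswith("file://"):
            return "type 'md' supports only file:// URIs"
        return None

    # Parser-backed types: a text-only item is what the docs say returns 415.
    if carrier == "text":
        return "parser-backed types need uri or base64"
    if carrier == "uri" and not uri.startswith(("http://", "https://", "file://")):
        return "uri must be http(s):// or file://"
    if carrier == "base64" and not item.get("ext"):
        # Assumption: ext is treated as required alongside base64.
        return "base64 payloads should carry an ext hint"
    return None
```

For instance, `check_content_item({"type": "md", "uri": "https://example.com/a.md"})` returns a reason string, while the same item with a `file://` URI passes.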


@@ -251,6 +251,7 @@
"type": "string",
"enum": [
"text",
+"md",
"image",
"audio",
"doc",