Streaming

Set "stream": true to receive the completion as a series of server-sent events (SSE). Each event carries a chunk that you can render as soon as it arrives, and the chunks use the same format as OpenAI's streaming responses.

Enable streaming

Add "stream": true to the request body. The response is a stream of chat-completion chunks; the content you want is in choices[0].delta.content on each chunk. The stream ends with a data: [DONE] sentinel.

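curl
curl https://api.merius.ai/v1/chat/completions \
  -H "Authorization: Bearer $MERIUS_API_KEY" \
  -H "Content-Type: application/json" \
  -N \
  -d '{
    "model": "qwen/qwen3-30b-a3b",
    "stream": true,
    "messages": [{"role": "user", "content": "Count to five"}]
  }'

Python
import os
from openai import OpenAI

# Assumes the OpenAI Python SDK; base URL taken from the curl example.
client = OpenAI(base_url="https://api.merius.ai/v1", api_key=os.environ["MERIUS_API_KEY"])

stream = client.chat.completions.create(
    model="qwen/qwen3-30b-a3b",
    messages=[{"role": "user", "content": "Count to five"}],
    stream=True,
)
for chunk in stream:
    # delta.content can be None (for example on the final chunk), so fall back to "".
    delta = chunk.choices[0].delta.content or ""
    print(delta, end="", flush=True)

JavaScript
import OpenAI from "openai";

// Assumes the OpenAI Node SDK; base URL taken from the curl example.
const client = new OpenAI({
  baseURL: "https://api.merius.ai/v1",
  apiKey: process.env.MERIUS_API_KEY,
});

const stream = await client.chat.completions.create({
  model: "qwen/qwen3-30b-a3b",
  messages: [{ role: "user", content: "Count to five" }],
  stream: true,
});
for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content ?? "");
}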

The event format

Over the wire, each event is a data: line containing one JSON chunk. Concatenating every delta.content in order reconstructs the full message:

Server-sent events
data: {"choices":[{"delta":{"role":"assistant","content":""}}]}

data: {"choices":[{"delta":{"content":"One"}}]}

data: {"choices":[{"delta":{"content":", two"}}]}

data: {"choices":[{"delta":{},"finish_reason":"stop"}]}

data: [DONE]

If you use an OpenAI SDK, it parses these events for you: iterate the stream as shown above and read delta.content.
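
If you read the stream without an SDK, the parsing described above can be sketched in a few lines of Python. This is a simplified reading of the event format, assuming already-decoded text lines and one JSON chunk per data: line, as in the example events. The helper names iter_sse_chunks and join_content are hypothetical.

```python
import json


def iter_sse_chunks(lines):
    """Yield one decoded JSON chunk per `data:` line, stopping at [DONE]."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and anything that is not data
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            return  # sentinel: the stream is finished
        yield json.loads(payload)


def join_content(chunks):
    """Concatenate choices[0].delta.content across chunks, in order."""
    pieces = []
    for chunk in chunks:
        for choice in chunk.get("choices", [])[:1]:
            pieces.append(choice.get("delta", {}).get("content") or "")
    return "".join(pieces)


# The example events from this page, one list item per line on the wire.
sample = [
    'data: {"choices":[{"delta":{"role":"assistant","content":""}}]}',
    "",
    'data: {"choices":[{"delta":{"content":"One"}}]}',
    "",
    'data: {"choices":[{"delta":{"content":", two"}}]}',
    "",
    'data: {"choices":[{"delta":{},"finish_reason":"stop"}]}',
    "",
    "data: [DONE]",
]
message = join_content(iter_sse_chunks(sample))
print(message)  # One, two
```

In practice, lines would come from your HTTP client's line iterator. Many clients yield bytes, so decode each line to text before passing it in.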