REST API

All public Messages API v3 endpoints are under:

https://msg.hidoba.com

Authentication

Send a quota API key with each generation request. Bearer auth is preferred.

curl https://msg.hidoba.com/v3/chat/completions \
  -H "Authorization: Bearer YOUR_QUOTA_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"google/gemini-2.5-flash","messages":[{"role":"user","content":"Say hello"}]}'

X-API-Key is also accepted:

X-API-Key: YOUR_QUOTA_API_KEY
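
As a minimal sketch, the two accepted header styles could be built in Python as follows. The helper name `auth_headers` is hypothetical, and reading the key from a `HIDOBA_API_KEY` environment variable is an assumption that mirrors the curl examples in this document.

```python
import os


def auth_headers(api_key: str, use_bearer: bool = True) -> dict[str, str]:
    """Return request headers that carry the quota API key.

    use_bearer=True  -> Authorization: Bearer <key>  (the preferred form)
    use_bearer=False -> X-API-Key: <key>             (also accepted)
    """
    if not api_key:
        raise ValueError("a quota API key is required")
    headers = {"Content-Type": "application/json"}
    if use_bearer:
        headers["Authorization"] = f"Bearer {api_key}"
    else:
        headers["X-API-Key"] = api_key
    return headers


if __name__ == "__main__":
    # HIDOBA_API_KEY is an assumed environment variable name.
    key = os.environ.get("HIDOBA_API_KEY", "YOUR_QUOTA_API_KEY")
    print(auth_headers(key))
    print(auth_headers(key, use_bearer=False))
```

The sketch sends only one of the two headers per request, which keeps it unambiguous which credential the server should read.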

Health

GET /health
GET /health/deep

Health endpoints are intended for uptime checks and deployment verification.
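
An uptime check against these endpoints might look like the sketch below. It assumes that the health paths sit at the root of the base URL and that an HTTP 200 status means healthy; the response body is not documented here, so the sketch does not inspect it. The names `health_url` and `is_healthy` are hypothetical.

```python
import urllib.request

BASE_URL = "https://msg.hidoba.com"


def health_url(deep: bool = False) -> str:
    """Build the URL for GET /health or GET /health/deep."""
    return f"{BASE_URL}/health/deep" if deep else f"{BASE_URL}/health"


def is_healthy(deep: bool = False, timeout: float = 5.0) -> bool:
    """Return True when the endpoint answers with HTTP 200 (assumed meaning)."""
    try:
        with urllib.request.urlopen(health_url(deep), timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # Covers URLError, HTTPError, and socket timeouts.
        return False
```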

Chat Completions

POST /v3/chat/completions

The request body follows the OpenAI Chat Completions shape. Messages API v3 also accepts a small number of Hidoba-specific additions.

Field	Type	Required	Description
`model`	String	Yes	Requested model name, for example `google/gemini-2.5-flash`.
`messages`	Array	Yes	OpenAI-compatible chat messages. Use `assistant` for prior chatbot messages and `user` for user messages.
`stream`	Boolean	No	Set to `true` for streaming responses.
`max_completion_tokens`	Number	No	Output-token limit. Preferred over `max_tokens` for models that support it.
`max_tokens`	Number	No	Legacy output-token limit, accepted for compatibility with clients that do not send `max_completion_tokens`.
`reasoning`	Object	No	Optional thinking/reasoning control for supported models. See Reasoning.
`fallback_model`	String	No	Optional fallback model to try when supported. This is a Hidoba routing option, not part of the model conversation.
`metadata.hidoba`	Object	No	Hidoba character metadata. See Hidoba Metadata.

Example

curl https://msg.hidoba.com/v3/chat/completions \
  -H "Authorization: Bearer $HIDOBA_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "google/gemini-2.5-flash",
    "messages": [
      { "role": "user", "content": "Hi, can you help me write concise replies?" },
      { "role": "assistant", "content": "Yes. I will keep replies brief and clear." },
      { "role": "user", "content": "Reply with exactly: hello" }
    ],
    "max_completion_tokens": 32
  }'

The response is an OpenAI-compatible chat completion response.
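
The same call can be sketched in Python with only the standard library. The function names are hypothetical, and reading the reply from `choices[0].message.content` assumes the standard OpenAI Chat Completions response layout that this endpoint is described as compatible with.

```python
import json
import urllib.request

BASE_URL = "https://msg.hidoba.com"


def build_chat_request(api_key, model, messages, max_completion_tokens=None):
    """Build (but do not send) a POST /v3/chat/completions request."""
    body = {"model": model, "messages": messages}
    if max_completion_tokens is not None:
        body["max_completion_tokens"] = max_completion_tokens
    return urllib.request.Request(
        f"{BASE_URL}/v3/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def first_choice_text(response: dict) -> str:
    """Extract the assistant text from an OpenAI-shaped chat completion."""
    return response["choices"][0]["message"]["content"]


def chat(api_key, model, messages, **kwargs) -> str:
    """Send the request and return the first choice's text."""
    request = build_chat_request(api_key, model, messages, **kwargs)
    with urllib.request.urlopen(request, timeout=60) as resp:
        return first_choice_text(json.load(resp))
```

Building the request separately from sending it makes the payload easy to inspect or log before any quota is spent.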

Responses API

POST /v3/responses

The request body follows the OpenAI Responses API shape.

Field	Type	Required	Description
`model`	String	Yes	Requested model name.
`input`	String or Array	Yes	Responses API input.
`instructions`	String	No	Additional system instructions. Character prompts and knowledge context may be added when configured.
`stream`	Boolean	No	Set to `true` for streaming responses when supported.
`max_output_tokens`	Number	No	Output-token limit for Responses API requests.
`reasoning`	Object	No	Optional thinking/reasoning control for supported models. See Reasoning.
`fallback_model`	String	No	Optional fallback model to try when supported.
`metadata.hidoba`	Object	No	Hidoba character metadata. See Hidoba Metadata.

Example

curl https://msg.hidoba.com/v3/responses \
  -H "Authorization: Bearer $HIDOBA_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "google/gemini-2.5-flash",
    "input": "Write one concise sentence about Mars.",
    "max_output_tokens": 80
  }'

The response is an OpenAI-compatible Responses API response.
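
To read the answer out of a Responses API result, a client typically walks the `output` array. The sketch below assumes the standard OpenAI layout, in which the answer lives in items of type `message` whose `content` parts have type `output_text`; the helper names are hypothetical.

```python
def responses_body(model, input_value, instructions=None, max_output_tokens=None):
    """Assemble a request body for POST /v3/responses from the documented fields."""
    body = {"model": model, "input": input_value}
    if instructions is not None:
        body["instructions"] = instructions
    if max_output_tokens is not None:
        body["max_output_tokens"] = max_output_tokens
    return body


def output_text(response: dict) -> str:
    """Concatenate the answer text, skipping reasoning and tool-call items."""
    parts = []
    for item in response.get("output", []):
        if item.get("type") != "message":
            continue
        for part in item.get("content", []):
            if part.get("type") == "output_text":
                parts.append(part.get("text", ""))
    return "".join(parts)
```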

Responses API Visibility

When a /v3/responses request uses a character with knowledge context enabled, Hidoba adds OpenAI-shaped visibility items to the Responses output. These items are intended for user interfaces that want to show what is happening before the final answer arrives.

This visibility is added only for /v3/responses. Chat Completions responses keep the normal Chat Completions shape.

Output Items

When knowledge context runs, the Responses output array includes Hidoba-added items before the model answer:

A reasoning item saying that Hidoba is looking through the knowledge base.
A file_search_call item with the searches and retrieved source previews.
A reasoning item saying how many relevant documents were found.
The normal model output items.

Example:

{
  "output": [
    {
      "id": "rs_hidoba_rag_loading_123",
      "type": "reasoning",
      "summary": [
        {
          "type": "summary_text",
          "text": "Looking through the knowledge base..."
        }
      ]
    },
    {
      "id": "fs_hidoba_123",
      "type": "file_search_call",
      "status": "completed",
      "queries": ["have you ever had a horse?"],
      "results": [
        {
          "file_id": "file_hidoba_07d6ebeddf6a421c",
          "filename": "Stephen Wolfram's Personal History with Animals and the Concept of Video Games for Pets",
          "score": 0.111,
          "text": "Q&A about Business, Innovation, and Managing Life with Stephen Wolfram...",
          "attributes": {
            "chunk_index": 0,
            "retrievers": "dense,splade"
          }
        }
      ]
    },
    {
      "id": "rs_hidoba_rag_found_123",
      "type": "reasoning",
      "summary": [
        {
          "type": "summary_text",
          "text": "Found 1 relevant documents, thinking..."
        }
      ]
    }
  ]
}

If knowledge context runs and finds no sources, Hidoba still returns a completed file_search_call with results: [] and a reasoning item whose status text reads, for example, Found 0 relevant documents, thinking....
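
A user interface will usually want to separate the Hidoba-added items from the model's own output. The following sketch does that by checking the `rs_hidoba_` and `fs_hidoba_` ID prefixes used in the example above; the function name `split_output` and the returned tuple layout are illustrative choices, not part of the API.

```python
HIDOBA_PREFIXES = ("rs_hidoba_", "fs_hidoba_")


def split_output(output):
    """Split a Responses output array into (status_texts, sources, model_items)."""
    status_texts, sources, model_items = [], [], []
    for item in output:
        if not item.get("id", "").startswith(HIDOBA_PREFIXES):
            # Anything without a Hidoba prefix is normal model output.
            model_items.append(item)
        elif item.get("type") == "reasoning":
            status_texts.extend(
                part["text"]
                for part in item.get("summary", [])
                if part.get("type") == "summary_text"
            )
        elif item.get("type") == "file_search_call":
            sources.extend(item.get("results") or [])
    return status_texts, sources, model_items
```

Because an empty `results` list is a valid outcome, the sketch treats a missing or empty list the same way and simply returns no sources.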

Source Result Fields

Each file_search_call.results[] item may include:

Field	Type	Description
`file_id`	String	Stable public source identifier for the returned result.
`filename`	String	Human-readable source or document title.
`score`	Number	Relevance score when available.
`text`	String	Short source preview. Full knowledge chunks are not returned in public Responses output.
`attributes`	Object	Safe source metadata, such as `chunk_index` and `retrievers`, when available.

The text field is a compact preview only. Hidoba trims surrounding whitespace, collapses repeated whitespace, and returns a short snippet of the retrieved chunk, currently up to 200 characters.
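
The preview rule described above can be illustrated with a few lines of Python. This is not Hidoba's implementation: it is a client-side sketch of the stated steps, and the hard cut at 200 characters (with no ellipsis or word-boundary handling) is an assumption.

```python
import re

PREVIEW_LIMIT = 200  # "currently up to 200 characters" per the text above


def preview(chunk: str, limit: int = PREVIEW_LIMIT) -> str:
    """Trim the chunk, collapse whitespace runs to one space, then cut to `limit`."""
    collapsed = re.sub(r"\s+", " ", chunk.strip())
    return collapsed[:limit]
```

A helper like this can be handy in tests or fixtures that need preview-sized text resembling what the API returns.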

Streaming Events

For streaming /v3/responses calls, the same information is emitted as Responses stream events before the model answer.

Typical event order:

response.created
response.in_progress
response.output_item.added        # reasoning: looking through knowledge base
response.output_item.done
response.file_search_call.in_progress
response.file_search_call.searching
response.file_search_call.completed
response.output_item.added        # reasoning: found N documents
response.output_item.done
...model output events...
response.completed

Render response.file_search_call.completed.item.results as the source list. Render Hidoba reasoning.summary[].text as status text. Hidoba-added item IDs use identifiable prefixes such as rs_hidoba_... and fs_hidoba_....
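
One way to consume these events is sketched below. It assumes the stream uses standard server-sent-event framing (`event:` and `data:` lines separated by blank lines) with JSON payloads, and that each payload carries the affected item under `item`, as the rendering guidance above implies. Whether the summary text is already present on `response.output_item.added` is not stated, so the sketch reads both the added and done events and keeps the latest non-empty text per item. All function names are hypothetical.

```python
import json


def iter_sse(lines):
    """Yield (event_name, payload) pairs from an iterable of SSE text lines."""
    event, data = None, []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            if data:
                yield event, json.loads("\n".join(data))
            event, data = None, []
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data.append(line[len("data:"):].strip())
    if data:  # flush a final event that was not followed by a blank line
        yield event, json.loads("\n".join(data))


def collect_visibility(events):
    """Return (status_texts, sources) gathered from Hidoba-added stream events."""
    status, sources = {}, []
    for name, payload in events:
        kind = name or payload.get("type")
        item = payload.get("item") or {}
        item_id = item.get("id", "")
        if kind in ("response.output_item.added", "response.output_item.done"):
            if item.get("type") == "reasoning" and item_id.startswith("rs_hidoba_"):
                text = " ".join(p.get("text", "") for p in item.get("summary", []))
                if text:
                    status[item_id] = text  # dicts keep insertion order
        elif kind == "response.file_search_call.completed":
            sources.extend(item.get("results") or [])
    return list(status.values()), sources
```

A live client would call the same logic incrementally and update the status line and source list as each event arrives, rather than collecting everything first.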

Model Thinking Events

Some models also emit their own reasoning or thinking output when you use the reasoning request field. This is separate from Hidoba's knowledge-status messages.

In Responses streams, model thinking may appear as:

response.reasoning_text.delta
response.reasoning_text.done
response.content_part.done with part.type: "reasoning_text"
response.output_item.done for an item with type: "reasoning" and content[].type: "reasoning_text"

User interfaces should treat these as reasoning/thinking text, not as final answer text. Preserve whitespace when displaying reasoning text so headings and paragraphs remain readable.
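
Keeping thinking text apart from answer text can be sketched as a small accumulator over already-parsed event payloads. The `response.reasoning_text.delta` event comes from the list above; `response.output_text.delta` is the standard OpenAI Responses event for answer text and is assumed to apply here. The function name is hypothetical.

```python
def split_stream_text(payloads):
    """Return (thinking_text, answer_text) from parsed Responses stream payloads."""
    thinking, answer = [], []
    for payload in payloads:
        kind = payload.get("type")
        if kind == "response.reasoning_text.delta":
            thinking.append(payload.get("delta", ""))
        elif kind == "response.output_text.delta":
            answer.append(payload.get("delta", ""))
    # Deltas are joined verbatim so newlines and indentation survive.
    return "".join(thinking), "".join(answer)
```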

Reasoning

Use reasoning when the requested model supports explicit thinking controls. Messages API v3 accepts the object and applies it to the model request; it is not used for Hidoba quota logic, character rendering, or knowledge retrieval.

Reasoning is separate from the answer output limit:

Chat Completions output limit: max_tokens or max_completion_tokens
Responses API output limit: max_output_tokens
Reasoning budget: reasoning.max_tokens

If reasoning is omitted, no explicit reasoning instruction is sent and the model default applies.

Turn reasoning off:

{
  "reasoning": { "effort": "none" }
}

Use this shape to turn reasoning off.

Use an effort level:

{
  "reasoning": { "effort": "low" }
}

The commonly supported effort values are none, low, medium, and high. To let the model choose its default reasoning behavior, omit the reasoning field.

Use a custom thinking-token budget:

{
  "reasoning": { "max_tokens": 1024 }
}

Use this shape when you want to set an explicit reasoning-token budget.
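
The three shapes above can be produced by one small helper, sketched here. Two choices in it are assumptions rather than documented rules: it rejects effort values outside the commonly supported list, and it refuses to combine `effort` with `max_tokens` because this document only shows them separately. The names `reasoning_option` and `with_reasoning` are hypothetical.

```python
COMMON_EFFORTS = ("none", "low", "medium", "high")


def reasoning_option(effort=None, max_tokens=None):
    """Return a reasoning object, or None to leave the model default in place."""
    if effort is not None and max_tokens is not None:
        # Conservative assumption: the document never shows both together.
        raise ValueError("pass either effort or max_tokens, not both")
    if effort is not None:
        if effort not in COMMON_EFFORTS:
            raise ValueError(f"effort should be one of {COMMON_EFFORTS}")
        return {"effort": effort}
    if max_tokens is not None:
        if isinstance(max_tokens, bool) or not isinstance(max_tokens, int) or max_tokens <= 0:
            raise ValueError("max_tokens must be a positive integer")
        return {"max_tokens": max_tokens}
    return None


def with_reasoning(body: dict, effort=None, max_tokens=None) -> dict:
    """Copy a request body and add the reasoning field only when requested."""
    merged = dict(body)
    option = reasoning_option(effort, max_tokens)
    if option is not None:
        merged["reasoning"] = option
    return merged
```

Omitting the field entirely when neither argument is given matches the documented way to let the model use its default reasoning behavior.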

Chat Completions example with low reasoning effort:

curl https://msg.hidoba.com/v3/chat/completions \
  -H "Authorization: Bearer $HIDOBA_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "google/gemini-2.5-flash",
    "messages": [
      { "role": "user", "content": "I am comparing retrieval methods." },
      { "role": "assistant", "content": "Got it. I can compare them clearly and briefly." },
      { "role": "user", "content": "Explain semantic search in one concise paragraph." }
    ],
    "max_tokens": 120,
    "reasoning": { "effort": "low" }
  }'

Responses API example with a custom reasoning budget:

curl https://msg.hidoba.com/v3/responses \
  -H "Authorization: Bearer $HIDOBA_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "google/gemini-2.5-flash",
    "input": "Compare semantic search and keyword search in three bullets.",
    "max_output_tokens": 180,
    "reasoning": { "max_tokens": 1024 }
  }'

Streaming uses the same request fields:

{
  "model": "google/gemini-2.5-flash",
  "messages": [
    { "role": "user", "content": "I want a short answer." },
    { "role": "assistant", "content": "Understood. I will be concise." },
    { "role": "user", "content": "What does a sunrise usually symbolize?" }
  ],
  "stream": true,
  "max_tokens": 120,
  "reasoning": { "effort": "medium" }
}

Reasoning support depends on the selected model. Unsupported combinations may ignore the field or return a model error.
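
For a streaming Chat Completions request such as the one above, the reply arrives in chunks. The sketch below assumes the stream follows the usual OpenAI conventions: server-sent `data:` lines holding JSON chunks, answer text under `choices[].delta.content`, and a final `data: [DONE]` sentinel. The function name is hypothetical.

```python
import json


def iter_chat_deltas(lines):
    """Yield answer-text fragments from OpenAI-style chat completion SSE lines."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and comment lines
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # assumed end-of-stream sentinel
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:
                yield text
```

Joining the yielded fragments in order reconstructs the full reply; a user interface would instead append each fragment as it arrives.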

Errors

Generation requests may fail before reaching the model if the API key, quota, character, or metadata is invalid.

Status	Meaning
`400`	Invalid request body, invalid `metadata.hidoba`, or character validation failure.
`401`	Missing or invalid quota API key.
`403`	Quota type or access is not allowed for this endpoint.
`429`	Quota is exhausted.
`5xx`	Model or service failure.

Example error:

{
  "detail": {
    "code": "invalid_api_key",
    "message": "Invalid API key",
    "context": {
      "request_id": "7ed3d6f1-1a9f-4f3b-90d3-7c681a9d9fb8",
      "endpoint": "/v3/chat/completions"
    }
  }
}

Model errors are returned in an OpenAI-compatible response shape when possible.
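
Client code can normalize both error shapes into one structure, as in the sketch below. The `detail` layout follows the example above; the fallback branch assumes that OpenAI-compatible model errors use a top-level `error` object with `message` and `code`. Treating only 5xx responses as retryable is a design assumption: an exhausted quota (429) is unlikely to recover on an immediate retry. The names `ApiError` and `parse_error` are hypothetical.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class ApiError:
    status: int
    code: Optional[str]
    message: str
    request_id: Optional[str]
    retryable: bool


def parse_error(status: int, body_text: str) -> ApiError:
    """Turn an error response into an ApiError, tolerating unexpected bodies."""
    try:
        body = json.loads(body_text)
    except json.JSONDecodeError:
        body = {}
    if not isinstance(body, dict):
        body = {}
    detail = body.get("detail")
    if isinstance(detail, dict):
        # Hidoba-shaped error, as in the example above.
        code = detail.get("code")
        message = detail.get("message", "")
        request_id = (detail.get("context") or {}).get("request_id")
    else:
        # Assumed OpenAI-shaped model error: {"error": {"message": ..., "code": ...}}
        error = body.get("error") if isinstance(body.get("error"), dict) else {}
        code = error.get("code")
        message = error.get("message") or body_text
        request_id = None
    return ApiError(status, code, message, request_id, retryable=status >= 500)
```

With `urllib`, a caller would catch `urllib.error.HTTPError` as `err` and pass `err.code` and `err.read().decode()` to `parse_error`. Logging `request_id` alongside the code makes a failed request easier to trace.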
