POST /v1/chat/completions

The primary chat endpoint. Fully OpenAI compatible.

Base URL

POST https://api.openference.com/v1/chat/completions

Minimal example

{
  "model": "your-model-name",
  "messages": [
    {"role": "user", "content": "Explain edge computing in one sentence."}
  ]
}

Authentication

Authorization: Bearer sk-token-...
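
As a rough illustration of how the header and the minimal body fit together, the sketch below builds (but does not send) the request using only the Python standard library. The helper name build_chat_request and the Content-Type header are assumptions made for this example and are not taken from the Openference docs; the key shown is a placeholder.

```python
import json
import urllib.request

API_URL = "https://api.openference.com/v1/chat/completions"


def build_chat_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying the bearer token."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            # Assumed: JSON bodies are declared the usual way.
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_chat_request(
    "sk-token-...",  # placeholder; substitute your real key
    {
        "model": "your-model-name",
        "messages": [
            {"role": "user", "content": "Explain edge computing in one sentence."}
        ],
    },
)
# To actually send it, pass the object to urllib.request.urlopen(request).
```

Keeping the construction separate from the network call makes it easy to inspect the headers and body before any traffic is sent.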

Streaming

{
  "model": "your-model-name",
  "messages": [{"role": "user", "content": "Count to 5 slowly."}],
  "stream": true
}

The response arrives as standard OpenAI server-sent events (SSE): a sequence of data: chunks, terminated by data: [DONE].
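
A minimal sketch of consuming such a stream on the client side, assuming the chunks follow the usual OpenAI shape (choices[].delta.content). The function name iter_content_deltas and the sample lines are hypothetical; in real use the lines would come from the HTTP response body.

```python
import json
from typing import Iterable, Iterator


def iter_content_deltas(lines: Iterable[str]) -> Iterator[str]:
    """Yield text fragments from OpenAI-style SSE lines until data: [DONE]."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and non-data fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            return  # end-of-stream sentinel
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            text = (choice.get("delta") or {}).get("content")
            if text:
                yield text


# Hypothetical sample of what a stream might look like.
sample_stream = [
    'data: {"choices": [{"index": 0, "delta": {"role": "assistant"}}]}',
    "",
    'data: {"choices": [{"index": 0, "delta": {"content": "1, "}}]}',
    "",
    'data: {"choices": [{"index": 0, "delta": {"content": "2"}}]}',
    "",
    "data: [DONE]",
]
streamed_text = "".join(iter_content_deltas(sample_stream))
```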

Tool calling (function calling)

Request:
{
  "model": "your-model-name",
  "messages": [{"role": "user", "content": "What's the weather in Paris?"}],
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "get_weather",
        "description": "Get current weather for a location",
        "parameters": {
          "type": "object",
          "properties": {
            "location": { "type": "string" }
          },
          "required": ["location"]
        }
      }
    }
  ],
  "tool_choice": "auto"
}
If the model decides to call a tool, the assistant message contains tool_calls. Run each call yourself, then send the result back as a message with role tool and the matching tool_call_id. Openference passes these messages through, or converts them to the upstream format, automatically.
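
The round trip can be sketched as follows, assuming the assistant message uses the standard OpenAI tool_calls shape (an id plus a function name and JSON-encoded arguments). The local get_weather implementation, the helper answer_tool_calls, and the sample assistant message are all hypothetical.

```python
import json


def get_weather(location: str) -> dict:
    """Hypothetical local tool; returns canned data for illustration."""
    return {"location": location, "forecast": "placeholder"}


def answer_tool_calls(messages: list, assistant_message: dict) -> list:
    """Return a new message list: history, assistant turn, one tool message per call."""
    follow_up = list(messages) + [assistant_message]
    for call in assistant_message.get("tool_calls") or []:
        arguments = json.loads(call["function"]["arguments"])
        if call["function"]["name"] == "get_weather":
            result = get_weather(**arguments)
        else:
            result = {"error": "unknown tool"}
        follow_up.append({
            "role": "tool",
            "tool_call_id": call["id"],  # must match the id the model issued
            "content": json.dumps(result),
        })
    return follow_up


# Hypothetical assistant message in the standard OpenAI shape.
assistant_message = {
    "role": "assistant",
    "content": None,
    "tool_calls": [{
        "id": "call_123",
        "type": "function",
        "function": {"name": "get_weather", "arguments": '{"location": "Paris"}'},
    }],
}
messages = [{"role": "user", "content": "What's the weather in Paris?"}]
next_messages = answer_tool_calls(messages, assistant_message)
```

The resulting next_messages list would be sent as the messages field of a second request, alongside the same tools definition, so the model can produce its final answer.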

Temperature, max_tokens, top_p, etc.

All standard sampling parameters are forwarded.
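
For instance, a request body with sampling parameters might be assembled like this. The helper with_sampling is a hypothetical convenience that adds only the parameters you set; the values shown are arbitrary.

```python
from typing import Optional


def with_sampling(
    payload: dict,
    *,
    temperature: Optional[float] = None,
    max_tokens: Optional[int] = None,
    top_p: Optional[float] = None,
) -> dict:
    """Return a copy of payload plus only the sampling parameters that were set."""
    options = {"temperature": temperature, "max_tokens": max_tokens, "top_p": top_p}
    chosen = {name: value for name, value in options.items() if value is not None}
    return {**payload, **chosen}


body = with_sampling(
    {"model": "your-model-name", "messages": [{"role": "user", "content": "Hi"}]},
    temperature=0.2,
    max_tokens=128,
)
```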

Format conversion

If the upstream provider for your chosen model uses the Anthropic or Gemini format, Openference converts the request and response shapes transparently. You can send an OpenAI-format request to a Claude model and get back OpenAI-shaped output.

Model restrictions & 403

If the key is restricted to specific models, only those models accept traffic; requests for any other model are rejected with HTTP 403.
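
One way a client might cope with a restricted key is to try a list of candidate models and skip any that come back as 403. The sketch below assumes exactly that status code signals a disallowed model; first_allowed_model, fake_send, and the model names are hypothetical, and fake_send stands in for a real HTTP call.

```python
from typing import Callable, Iterable, Optional, Tuple


def first_allowed_model(
    candidates: Iterable[str],
    send: Callable[[str], Tuple[int, dict]],
) -> Optional[Tuple[str, dict]]:
    """Return (model, body) for the first response that is not a 403, else None."""
    for model in candidates:
        status, body = send(model)
        if status == 403:
            continue  # assumed meaning: the key may not use this model
        return model, body
    return None


def fake_send(model: str) -> Tuple[int, dict]:
    """Stand-in for a real request; pretends only 'model-b' is allowed."""
    if model == "model-b":
        return 200, {"choices": []}
    return 403, {}


chosen = first_allowed_model(["model-a", "model-b"], fake_send)
```

Other error statuses are deliberately returned to the caller unchanged, since only 403 is treated here as a model restriction.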
