Welcome to Openference
Openference is a curated AI model proxy. It provides a single, stable, OpenAI-compatible API endpoint that intelligently routes your requests to the best available upstream providers. One base URL. One API key. Many models.Why Openference?
- Drop-in compatible: Works with Cursor, Claude Code, Codex, Continue, Cline, OpenAI SDKs, LiteLLM, LangChain, and any OpenAI-compatible client.
- Curated catalog: Hand-picked high-quality models across providers. We handle deprecations so you don’t have to.
- Automatic failover & key rotation: If one upstream provider is down or rate-limited, we route to healthy alternatives.
- Transparent pricing: Per-million-token input/output prices shown for every model. No surprise bills.
- Usage separation: Plan quotas vs pay-as-you-go credits are tracked distinctly.
- Format translation: Send OpenAI format to Anthropic models (and vice-versa). We convert for you.
- Fast & global: Runs on Cloudflare’s edge network for low latency worldwide.
Supported API surfaces
| Endpoint | Format | Notes |
|---|---|---|
POST /v1/chat/completions | OpenAI | Primary chat interface, streaming, tools |
POST /v1/messages | Anthropic | Claude-style messages |
POST /v1/responses | OpenAI | Responses API (Codex etc.) |
POST /v1/embeddings | OpenAI | Text embeddings |
GET /v1/models | OpenAI | Live model list (filtered by your key’s restrictions) |
Get started in 60 seconds
- Register and verify your email.
- Log in and create an API key (starts with
sk-token-). - Send your first request:
Next steps
- Read Base URL & Authentication
- Browse Client Integrations (start with Cursor)
- Explore the live Models catalog
- Review Pricing & Plans

