FAQ
Getting started
What is Openference?
Openference is a curated AI model proxy service. It gives you a single OpenAI-compatible API endpoint that routes requests to models from OpenAI, Anthropic, Gemini, and other providers. You get one base URL, one API key, and transparent per-token billing — no need to juggle multiple provider accounts.What base URL do I use?
Set your OpenAI-compatible client base URL tohttps://api.openference.com/v1.
How do I get an API key?
Register for an account at openference.com, verify your email, log in to your dashboard, and create an API key under the API Keys section. Keys use thesk-token- prefix and can be created instantly.
Is Openference compatible with the OpenAI SDK?
Yes — it is a drop-in replacement. Point any OpenAI SDK client (Python, Node.js, or any OpenAI-compatible tool) tohttps://api.openference.com/v1 with your API key. Change one line of code and you are live.
Which API endpoints does Openference support?
| Endpoint | Description |
|---|---|
POST /v1/chat/completions | OpenAI-format chat completions |
POST /v1/messages | Anthropic-format messages |
POST /v1/responses | Responses API |
POST /v1/embeddings | OpenAI-format embeddings |
GET /v1/models | List available models |
Do I need separate accounts with each AI provider?
No. Openference manages upstream provider keys for you. You only need an Openference account. We handle key rotation, failover, and provider routing behind the scenes.IDE & CLI setup
Does Cursor work with Openference?
Yes. In Cursor Settings → Models, enable Override OpenAI Base URL, set it tohttps://api.openference.com/v1, and paste your API key. Cursor’s Verify button calls GET /v1/models, which we fully support.
See the dedicated Cursor guide.
Does Claude Code work?
Yes. SetANTHROPIC_BASE_URL to https://api.openference.com/v1 and ANTHROPIC_API_KEY to your API key. Claude Code calls POST /v1/messages. Models configured with both OpenAI and Anthropic formats work with Cursor and Claude Code on the same key.
See Claude Code.
Does Codex CLI work?
Yes. SetOPENAI_BASE_URL to https://api.openference.com/v1 and OPENAI_API_KEY to your API key. Codex may also use the Responses API at POST /v1/responses, which we forward to upstream providers.
See Codex CLI.
Does OpenCode CLI work?
Yes. Add an Openference custom provider in~/.config/opencode/opencode.json with npm @ai-sdk/openai-compatible and options.baseURL set to https://api.openference.com/v1.
Full example and auth flow in the OpenCode guide.
Does Continue (VS Code extension) work?
Yes. In your Continue config (~/.continue/config.json):
Does Cline work?
Yes. In Cline settings in VS Code, select API Provider → OpenAI Compatible, then enterhttps://api.openference.com/v1 as the base URL and your API key.
See Cline.
Can I use the same API key across multiple tools?
Yes. A single Openference API key works across Cursor, Claude Code, Codex CLI, OpenCode CLI, Continue, Cline, and any OpenAI SDK application simultaneously. You can also create separate keys for different tools if you prefer to track usage per tool.Models
Which models are available?
We offer a curated pool of models across OpenAI, Anthropic, and Gemini providers. See the Models page for current availability, per-token pricing, and provider information. Our catalog is regularly updated as new models are released.How are models selected and curated?
Our team hand-picks high-quality models and regularly evaluates them. We surface the best-performing models and deprecate underperformers, so you do not need to track model releases across multiple providers.Can I switch models without changing my code?
Yes. Since all models are accessed through the same/v1 endpoint, you only need to change the model name in your request. No provider-specific integration changes are required.
Do you support model-specific features like tool calling and streaming?
Yes. We pass through tool calling, streaming (SSE), and other model capabilities to the upstream provider. As long as the upstream model supports a feature, it works through Openference.Billing & pricing
How is usage counted?
We count successful requests and client errors toward your daily limits. Upstream failures (when our providers are down) do not count against your quota. Each request is logged with model, token counts, latency, and cost for full transparency.How does per-token pricing work?
Each model has input and output prices per million tokens. Costs are calculated automatically based on actual token usage and displayed in your dashboard. You can see per-request costs in your usage history.What subscription plans are available?
We offer Free, paid monthly, and annual plans. Annual billing saves approximately 17%. Each plan includes different RPM/RPD limits and model access. See the Pricing page for current plan details.What are credits and how do they work?
Credits are a pay-as-you-go option on top of your subscription. When your plan limits are reached, credits allow continued usage. You can purchase credit packages from the dashboard. Credit consumption is tracked separately from plan usage.Can I cancel my subscription anytime?
Yes. Use “Adjust plan” or the plan selector in billing to switch plans in-app (existing subscribers get an instant change with proration or at next renewal). You can also use the Stripe customer portal via “Manage in Stripe” for full subscription management.Where can I see my billing history and invoices?
Log in to your dashboard and navigate to the Billing section. You can view your current plan, usage history, credit balance, payment history, and download invoices.Rate limits & quotas
What are RPM and RPD limits?
RPM (Requests Per Minute) limits how many API calls you can make in a 60-second window. RPD (Requests Per Day) limits total calls in a 24-hour period. Your plan determines these limits. Free tier users have default limits shown in the dashboard.What happens when I hit my rate limit?
When you exceed your RPM or RPD limit, the API returns a429 Too Many Requests response. The limit resets at the end of the current window (minute or day). Upgrading your plan or purchasing credits increases your limits.
How are rate limits enforced?
Rate limits use sliding time windows for accuracy. Each API key has independent limits. Limits are checked per-request with minimal latency overhead.API keys & security
Can I have multiple API keys?
Yes. Create multiple keys in the API Keys section of your dashboard. Each key can have its own name, optional model restrictions (limit which models it can access), and independent usage tracking.What format do API keys use?
API keys use thesk-token- prefix (e.g., sk-token-abc123...).
How do I restrict which models an API key can access?
When creating or editing an API key in the dashboard, you can set model restrictions. The key will only be able to call the models you specify. TheGET /v1/models endpoint also filters the model list based on key restrictions.
What should I do if my API key is compromised?
Immediately delete the compromised key from your dashboard and create a new one. API keys can be revoked instantly. We recommend rotating keys periodically as a security best practice.How are passwords stored?
User passwords are stored using one-way cryptographic hashing. We never store plaintext passwords.Privacy & data
Do you store my prompts or completions?
We log request metadata (model, token counts, latency, cost) for billing and analytics. Prompt and completion content is forwarded to upstream providers and is not stored long-term by Openference.Do upstream providers see my data?
Yes. Your prompts and completions are forwarded to the upstream AI provider (OpenAI, Anthropic, Gemini, etc.) that serves your request. Refer to each provider’s privacy policy for how they handle your data.Is my data encrypted in transit?
Yes. All connections use HTTPS/TLS. Data is encrypted in transit between your client and Openference, and between Openference and upstream providers.Where can I find your privacy policy and terms?
Our Privacy Policy, Terms of Service, Cookie Policy, and Data Processing Agreement are available in the footer of every page on our website.Reliability & architecture
What happens if an upstream provider is down?
Openference uses automatic failover. If one upstream provider returns errors, we route to an alternative. Upstream failures do not count against your quota.How does key rotation work?
We maintain redundant upstream capacity organized by model. When a provider or route returns errors, traffic is automatically redirected. The system recovers routes once they are healthy again.Where is Openference hosted?
Openference runs on globally distributed cloud infrastructure, giving you low-latency access from anywhere in the world.What is the expected latency overhead?
Openference adds minimal overhead — typically single-digit milliseconds for routing, format conversion, and rate limit checks. The dominant factor in response time is the upstream AI provider’s processing time.How does format conversion work between providers?
Openference automatically translates between OpenAI, Anthropic Claude, and Google Gemini request/response formats. You can send an OpenAI-format request and have it routed to an Anthropic model — the conversion is handled transparently.Account & email
How do I verify my email?
After registering, you will receive a verification email. Click the link in the email to verify your account. If you do not receive the email, check your spam folder or use the Resend Verification option on the login page.I did not receive the verification email. What should I do?
Check your spam or junk folder first. If it is not there, go to the login page and click Resend Verification. Enter your email address to receive a new verification link. If the problem persists, contact support.How do I reset my password?
Use the forgot password flow on the login page. You will receive an email with reset instructions. If password reset is not yet available in the self-service UI, contact support for assistance.Troubleshooting
Why am I getting a 401 Unauthorized error?
A 401 error means your API key is missing, invalid, or has been revoked. Check that you are sending theAuthorization: Bearer header with a valid key from your dashboard. Verify the key has not been deleted.
Why am I getting a 429 Too Many Requests error?
You have exceeded your RPM (per-minute) or RPD (per-day) rate limit. Wait for the current window to reset, or upgrade your plan for higher limits. Your current usage is visible in the dashboard.Why am I getting a 503 No available providers error?
The requested model is temporarily unavailable — upstream providers may be experiencing errors or capacity limits. This is usually short-lived. Check the Models page to see current availability.Why does Cursor’s Verify button fail?
Cursor’s Verify button callsGET /v1/models. Make sure your base URL is set to https://api.openference.com/v1 (include the /v1 path) and your API key is correct. If the key has model restrictions, only restricted models will appear in the list.
See Verify issues.
Why am I seeing unexpected models or missing models?
If your API key has model restrictions, only those models are available. Check your key settings in the dashboard. Also, theGET /v1/models endpoint filters results based on your key’s permissions.
Streaming responses are not working. What should I check?
Ensure you are settingstream: true in your request and that your client supports Server-Sent Events (SSE). Some clients require specific configuration for streaming. Verify the upstream model supports streaming.
Additional resources
- Full Client Integrations
- Models page — current model catalog and pricing
- Pricing page — subscription plans and credit packages
- Live API surface:
https://api.openference.com/v1

