Latency & Conversion

Openference adds only a few milliseconds of overhead for routing, auth, rate-limit checks, and optional format conversion. The large majority of latency comes from the upstream provider. We transparently translate request/response shapes between OpenAI, Anthropic, and Gemini so you can use the client format you prefer with any model.