Openference adds only a few milliseconds of overhead for routing, auth, rate-limit checks, and optional format conversion.The large majority of latency comes from the upstream provider.We transparently translate request/response shapes between OpenAI, Anthropic, and Gemini so you can use the client format you prefer with any model.