Claude Code speaks exactly one protocol: Anthropic's Messages API. That is by design, and it is why pointing it at OmniaKey takes two environment variables and changes nothing else — OmniaKey serves the same protocol natively. What you gain is one key and one balance shared across every tool you run, per-token billing with no monthly plan, and a guarantee that the model id you ask for is the one that runs.

Setup

Create a key in the dashboard, set two variables, and launch:

```bash
# Bare host only: no /v1 suffix on the base URL
export ANTHROPIC_BASE_URL="https://api.omniakey.com"
export ANTHROPIC_AUTH_TOKEN="your-omniakey-api-key"
claude
```

Streaming, tool use, and long context all pass through unchanged — Claude Code still thinks it is talking to Anthropic directly.

Don't add /v1 to the base URL. Claude Code appends /v1/messages itself, so https://api.omniakey.com/v1 turns into /v1/v1/messages and returns 404. Use the bare host.
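The two setup mistakes described above (a missing variable, or a base URL that ends in /v1) can be caught before launch. The following is a minimal sketch, not something shipped with Claude Code or OmniaKey: the function name check_claude_code_env is hypothetical, and it only inspects the two variables this page sets.

```shell
# Hypothetical pre-flight check for the two variables used on this page.
# Returns 0 when both look usable, 1 otherwise, and explains why on stderr.
check_claude_code_env() {
  local rc=0

  # The key must be present; its format is not validated here.
  if [ -z "${ANTHROPIC_AUTH_TOKEN:-}" ]; then
    echo "ANTHROPIC_AUTH_TOKEN is not set" >&2
    rc=1
  fi

  # The base URL must be present and must not carry a /v1 suffix,
  # because Claude Code appends /v1/messages itself.
  case "${ANTHROPIC_BASE_URL:-}" in
    "")
      echo "ANTHROPIC_BASE_URL is not set" >&2
      rc=1
      ;;
    */v1|*/v1/)
      echo "ANTHROPIC_BASE_URL must not end in /v1; use the bare host" >&2
      rc=1
      ;;
  esac

  return "$rc"
}
```

One way to use it is to chain the launch behind the check, so that claude starts only when check_claude_code_env succeeds.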

Pick a model

Switch models inside the session with /model:

```text
/model claude-opus-4-8
```

Claude ids on OmniaKey include claude-opus-4-8 (flagship, 1M context), claude-sonnet-5 (balanced), and claude-haiku-4-5 (fast, 200K context). The id you pass is the id that runs: OmniaKey never silently swaps, quantizes, or distills it.
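To start a session on a particular model instead of switching after launch, one option is to set a default before running claude. This is a sketch under an assumption: it relies on Claude Code reading the ANTHROPIC_MODEL environment variable at startup, which this page does not itself document, and it uses one of the ids listed above purely as an example.

```shell
# Assumption: Claude Code picks up ANTHROPIC_MODEL as its default model at startup.
# The id below is one of those listed on this page; substitute the one you need.
export ANTHROPIC_MODEL="claude-sonnet-5"

# Echo the choice so it is visible in the shell history before launching.
echo "Claude Code will start on: $ANTHROPIC_MODEL"
```

Launching claude afterwards should begin on that model, and /model can still switch it mid-session.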

GPT and Gemini: a protocol boundary, not a limit

Claude Code drives Claude only. That is a property of the Anthropic protocol, not of OmniaKey: no gateway can make Claude Code run GPT or Gemini without breaking the tool. To use those families in an agent, point an OpenAI-compatible tool (Cursor, Cline, aider, Codex, OpenClaw) at OmniaKey's OpenAI-compatible endpoint, the /v1 URL listed below. There only the model id changes: gpt-5.5 and gemini-3.1-pro-preview are requested exactly like a Claude id.

OpenAI-compatible: https://api.omniakey.com/v1

Anthropic-native: https://api.omniakey.com

Gemini-native: https://api.omniakey.com/v1beta

OmniaKey exposes all three surfaces from one key. Claude Code uses the Anthropic-native bare host; other coding tools use the OpenAI-compatible /v1; the Gemini-native /v1beta is there for Google's own Gemini SDK. The coding agents guide has the per-tool setup.
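Because the three surfaces differ only in their path suffix, it is easy to paste the wrong one into a tool's config. A small lookup keeps them straight. This is an illustrative sketch: the function name omniakey_base_url and the surface keywords are made up for this example, and OPENAI_BASE_URL is an assumption about which variable your tool reads, so check that tool's own documentation.

```shell
# Hypothetical helper: map a protocol surface to the OmniaKey base URL from this page.
omniakey_base_url() {
  case "$1" in
    openai)    printf '%s\n' "https://api.omniakey.com/v1" ;;
    anthropic) printf '%s\n' "https://api.omniakey.com" ;;
    gemini)    printf '%s\n' "https://api.omniakey.com/v1beta" ;;
    *)
      # Unknown surface: report it and fail rather than guess.
      printf 'unknown surface: %s\n' "$1" >&2
      return 1
      ;;
  esac
}

# Claude Code takes the Anthropic-native bare host.
export ANTHROPIC_BASE_URL="$(omniakey_base_url anthropic)"

# Assumption: the OpenAI-compatible tool you use reads OPENAI_BASE_URL.
export OPENAI_BASE_URL="$(omniakey_base_url openai)"
```

The same key is used on every surface, so only the base URL differs between tools.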

Why predictability matters in an agent

A gateway that silently reroutes to a "cheaper equivalent" under load is especially dangerous in a coding agent: a different model formats tool calls differently and reasons about your codebase differently, so a mid-task swap can corrupt a multi-file edit. With OmniaKey, if an upstream is unavailable you get the error and pick another model yourself — never a substitution behind your back.
