Hermes Agent + OmniaKey: a custom OpenAI-compatible endpoint
Point Nous Research's Hermes Agent at OmniaKey with one custom endpoint — `hermes model` or a few lines of config.yaml, and Claude, GPT, and Gemini all answer to one key.
Hermes Agent (from Nous Research) ships with built-in providers, but it's designed to talk to any OpenAI-compatible endpoint. That makes OmniaKey a clean fit: one custom endpoint, and the same key reaches Claude, GPT, and Gemini — you switch model by id, not by reconfiguring.
The fast path: hermes model
The quickest setup is the interactive picker:
hermes model
Choose Custom endpoint (self-hosted / VLLM / etc.), then enter:
- Base URL:
https://api.omniakey.com/v1 - API key: your OmniaKey key
- Model: e.g.
claude-opus-4-8
End the base URL at /v1. Hermes appends /chat/completions itself, so a URL that already includes the full path — or a trailing slash — is the usual cause of a 404.
Or by hand: config.yaml
Prefer to edit it directly, or want a persistent multi-model setup? Put the same thing in ~/.hermes/config.yaml instead:
model:
provider: custom
base_url: https://api.omniakey.com/v1
api_key: your-omniakey-api-key
default: claude-opus-4-8
models:
- claude-opus-4-8
- gpt-5.5
- gemini-3.1-pro-preview
provider: custom is what tells Hermes to call your endpoint directly with the key above, instead of one of its built-in providers. The models: list is what populates the /model picker — restart Hermes once after editing, and you can switch between claude-opus-4-8, gpt-5.5, and gemini-3.1-pro-preview without leaving the session.
One key, three families
Because OmniaKey routes by model id over the OpenAI-compatible surface, a single custom endpoint covers all three families — no second provider block, no juggling base URLs. Billing is per token against one prepaid balance, with no monthly plan. And the model id you set is the one that runs: no silent fallback to a cheaper "equivalent" that would reason about your codebase differently mid-task.
The coding agents guide covers the other tools.