Limited time · same models — GPT 95% off, Claude 70% off
Blog
Guide

Hermes Agent + OmniaKey: a custom OpenAI-compatible endpoint

Point Nous Research's Hermes Agent at OmniaKey with one custom endpoint — `hermes model` or a few lines of config.yaml, and Claude, GPT, and Gemini all answer to one key.

4 min readOmniaKey
HermesNous Researchcoding agentsetup

Hermes Agent (from Nous Research) ships with built-in providers, but it's designed to talk to any OpenAI-compatible endpoint. That makes OmniaKey a clean fit: one custom endpoint, and the same key reaches Claude, GPT, and Gemini — you switch model by id, not by reconfiguring.

The fast path: hermes model

The quickest setup is the interactive picker:

bash
hermes model

Choose Custom endpoint (self-hosted / VLLM / etc.), then enter:

  • Base URL: https://api.omniakey.com/v1
  • API key: your OmniaKey key
  • Model: e.g. claude-opus-4-8

End the base URL at /v1. Hermes appends /chat/completions itself, so a URL that already includes the full path — or a trailing slash — is the usual cause of a 404.

Or by hand: config.yaml

Prefer to edit it directly, or want a persistent multi-model setup? Put the same thing in ~/.hermes/config.yaml instead:

yaml
model:
  provider: custom
  base_url: https://api.omniakey.com/v1
  api_key: your-omniakey-api-key
  default: claude-opus-4-8
models:
  - claude-opus-4-8
  - gpt-5.5
  - gemini-3.1-pro-preview

provider: custom is what tells Hermes to call your endpoint directly with the key above, instead of one of its built-in providers. The models: list is what populates the /model picker — restart Hermes once after editing, and you can switch between claude-opus-4-8, gpt-5.5, and gemini-3.1-pro-preview without leaving the session.

One key, three families

Because OmniaKey routes by model id over the OpenAI-compatible surface, a single custom endpoint covers all three families — no second provider block, no juggling base URLs. Billing is per token against one prepaid balance, with no monthly plan. And the model id you set is the one that runs: no silent fallback to a cheaper "equivalent" that would reason about your codebase differently mid-task.

OpenAI-compatible
https://api.omniakey.com/v1
Anthropic-native
https://api.omniakey.com
Gemini-native
https://api.omniakey.com/v1beta

The coding agents guide covers the other tools.