documentation · BYOK setup

Bring your own LLM key

repryntt does not mark up tokens. You connect provider keys; we route each task to the right model. You pay the provider directly.

Why BYOK

The $19/mo Pro subscription covers the platform: hosted dashboard, workbenches, integrations, support. Inference cost stays between you and the provider — the same dollar of OpenAI credit buys you the same number of tokens whether you call them from repryntt or directly.

If a provider raises prices, you see it immediately. If they cut prices, you see that too. No surprise platform invoices.

Connect a provider

1. Go to Dashboard → Settings → LLM Providers.
2.Click "Add provider" and pick one from the list.
3. Paste your API key. We encrypt-at-rest before persisting. You can rotate or revoke it anytime.
4.Optional: set a per-provider monthly budget. We'll halt routing to that provider when the budget is hit.

Supported providers

Provider	Env var (OSS)	Where to get a key
Anthropic	ANTHROPIC_API_KEY	console.anthropic.com/settings/keys
OpenAI	OPENAI_API_KEY	platform.openai.com/api-keys
xAI	XAI_API_KEY	console.x.ai
Google	GOOGLE_API_KEY	aistudio.google.com/apikey
NVIDIA NIM	NVIDIA_API_KEY	build.nvidia.com
Local (Ollama / vLLM)	REPRYNTT_LOCAL_LLM_URL	self-hosted (e.g. http://localhost:11434/v1)

Provider notes

Anthropic

claude-fable-5, claude-opus-4-8, claude-opus-4-7, claude-sonnet-4-6, claude-haiku-4-5

Recommended default for reasoning-heavy agents. Workbench critics use Opus by default.

OpenAI

gpt-4o, gpt-4o-mini, o1, o3

Strong general-purpose router target. Cheapest tier for high-volume routine jobs.

xAI

grok-2, grok-2-mini, grok-vision

Optional. Useful for long-context and real-time-data agents.

Google

gemini-2.5-pro, gemini-2.5-flash

Gemini 2.5 Flash is the cheapest frontier vision model — used for tiered vision in robotics.

NVIDIA NIM

Llama 3.x, Mixtral, Nemotron

Hosted open models. Often the cheapest path for high-throughput producer jobs.

Local (Ollama / vLLM)

any OpenAI-compatible local server

Fully offline. Set the env var to your local OpenAI-compatible endpoint and pick a model in Settings.

Multi-provider routing

Connect more than one provider and the router picks per task: cheap models for cheap work, frontier for hard. You can pin specific agents to specific providers in Settings → Routing.

The Coherence Cloud critic always uses a frontier judge regardless of producer routing — that's the whole point of an independent reviewer.

OSS / self-host

Running the OSS framework locally? Drop the env vars in your shell or in ~/.repryntt/brain/ai_config.json:

{
  "providers": {
    "anthropic": { "api_key": "sk-ant-..." },
    "openai":    { "api_key": "sk-..." },
    "local":     { "base_url": "http://localhost:11434/v1", "model": "llama3" }
  }
}

Security

Keys are encrypted-at-rest with a per-account KMS key. Plaintext only exists in memory during a request.
You can revoke a key from the dashboard at any time; future requests fail immediately.
Set per-provider budgets to cap blast radius if a key leaks.
Self-hosting: keys never leave your machine. Self-hosting guide →