documentation · BYOK setup
Bring your own LLM key
repryntt does not mark up tokens. You connect provider keys; we route each task to the right model. You pay the provider directly.
Why BYOK
The $29/mo Pro subscription covers the platform: hosted dashboard, workbenches, integrations, support. Inference cost stays between you and the provider — the same dollar of OpenAI credit buys you the same number of tokens whether you call them from repryntt or directly.
If a provider raises prices, you see it immediately. If they cut prices, you see that too. No surprise platform invoices.
Connect a provider
- 1. Go to Dashboard → Settings → LLM Providers.
- 2.Click "Add provider" and pick one from the list.
- 3. Paste your API key. We encrypt-at-rest before persisting. You can rotate or revoke it anytime.
- 4.Optional: set a per-provider monthly budget. We'll halt routing to that provider when the budget is hit.
Supported providers
| Provider | Env var (OSS) | Where to get a key |
|---|---|---|
| Anthropic | ANTHROPIC_API_KEY | console.anthropic.com/settings/keys |
| OpenAI | OPENAI_API_KEY | platform.openai.com/api-keys |
| xAI | XAI_API_KEY | console.x.ai |
| GOOGLE_API_KEY | aistudio.google.com/apikey | |
| NVIDIA NIM | NVIDIA_API_KEY | build.nvidia.com |
| Local (Ollama / vLLM) | REPRYNTT_LOCAL_LLM_URL | self-hosted (e.g. http://localhost:11434/v1) |
Provider notes
Anthropic
claude-opus-4-7, claude-sonnet-4-6, claude-haiku-4-5
Recommended default for reasoning-heavy agents. Workbench critics use Opus by default.
OpenAI
gpt-4o, gpt-4o-mini, o1, o3
Strong general-purpose router target. Cheapest tier for high-volume routine jobs.
xAI
grok-2, grok-2-mini, grok-vision
Optional. Useful for long-context and real-time-data agents.
gemini-2.5-pro, gemini-2.5-flash
Gemini 2.5 Flash is the cheapest frontier vision model — used for tiered vision in robotics.
NVIDIA NIM
Llama 3.x, Mixtral, Nemotron
Hosted open models. Often the cheapest path for high-throughput producer jobs.
Local (Ollama / vLLM)
any OpenAI-compatible local server
Fully offline. Set the env var to your local OpenAI-compatible endpoint and pick a model in Settings.
Multi-provider routing
Connect more than one provider and the router picks per task: cheap models for cheap work, frontier for hard. You can pin specific agents to specific providers in Settings → Routing.
The Coherence Cloud critic always uses a frontier judge regardless of producer routing — that's the whole point of an independent reviewer.
OSS / self-host
Running the OSS framework locally? Drop the env vars in your shell or in ~/.repryntt/brain/ai_config.json:
{
"providers": {
"anthropic": { "api_key": "sk-ant-..." },
"openai": { "api_key": "sk-..." },
"local": { "base_url": "http://localhost:11434/v1", "model": "llama3" }
}
}Security
- Keys are encrypted-at-rest with a per-account KMS key. Plaintext only exists in memory during a request.
- You can revoke a key from the dashboard at any time; future requests fail immediately.
- Set per-provider budgets to cap blast radius if a key leaks.
- Self-hosting: keys never leave your machine. Self-hosting guide →