repryntt

documentation · BYOK setup

Bring your own LLM key

repryntt does not mark up tokens. You connect provider keys; we route each task to the right model. You pay the provider directly.

Why BYOK

The $29/mo Pro subscription covers the platform: hosted dashboard, workbenches, integrations, support. Inference cost stays between you and the provider — the same dollar of OpenAI credit buys you the same number of tokens whether you call them from repryntt or directly.

If a provider raises prices, you see it immediately. If they cut prices, you see that too. No surprise platform invoices.

Connect a provider

  1. 1. Go to Dashboard → Settings → LLM Providers.
  2. 2.Click "Add provider" and pick one from the list.
  3. 3. Paste your API key. We encrypt-at-rest before persisting. You can rotate or revoke it anytime.
  4. 4.Optional: set a per-provider monthly budget. We'll halt routing to that provider when the budget is hit.

Supported providers

ProviderEnv var (OSS)Where to get a key
AnthropicANTHROPIC_API_KEYconsole.anthropic.com/settings/keys
OpenAIOPENAI_API_KEYplatform.openai.com/api-keys
xAIXAI_API_KEYconsole.x.ai
GoogleGOOGLE_API_KEYaistudio.google.com/apikey
NVIDIA NIMNVIDIA_API_KEYbuild.nvidia.com
Local (Ollama / vLLM)REPRYNTT_LOCAL_LLM_URLself-hosted (e.g. http://localhost:11434/v1)

Provider notes

Anthropic

claude-opus-4-7, claude-sonnet-4-6, claude-haiku-4-5

Recommended default for reasoning-heavy agents. Workbench critics use Opus by default.

OpenAI

gpt-4o, gpt-4o-mini, o1, o3

Strong general-purpose router target. Cheapest tier for high-volume routine jobs.

xAI

grok-2, grok-2-mini, grok-vision

Optional. Useful for long-context and real-time-data agents.

Google

gemini-2.5-pro, gemini-2.5-flash

Gemini 2.5 Flash is the cheapest frontier vision model — used for tiered vision in robotics.

NVIDIA NIM

Llama 3.x, Mixtral, Nemotron

Hosted open models. Often the cheapest path for high-throughput producer jobs.

Local (Ollama / vLLM)

any OpenAI-compatible local server

Fully offline. Set the env var to your local OpenAI-compatible endpoint and pick a model in Settings.

Multi-provider routing

Connect more than one provider and the router picks per task: cheap models for cheap work, frontier for hard. You can pin specific agents to specific providers in Settings → Routing.

The Coherence Cloud critic always uses a frontier judge regardless of producer routing — that's the whole point of an independent reviewer.

OSS / self-host

Running the OSS framework locally? Drop the env vars in your shell or in ~/.repryntt/brain/ai_config.json:

{
  "providers": {
    "anthropic": { "api_key": "sk-ant-..." },
    "openai":    { "api_key": "sk-..." },
    "local":     { "base_url": "http://localhost:11434/v1", "model": "llama3" }
  }
}

Security

  • Keys are encrypted-at-rest with a per-account KMS key. Plaintext only exists in memory during a request.
  • You can revoke a key from the dashboard at any time; future requests fail immediately.
  • Set per-provider budgets to cap blast radius if a key leaks.
  • Self-hosting: keys never leave your machine. Self-hosting guide →