Kraken API documentation
Last updated: 16 April 2026
Overview
Kraken is a managed LLM API gateway. One OpenAI-compatible endpoint routes your requests to every major LLM provider with a task-aware router that picks the best-fit model for every prompt.
If you already use OpenAI or any OpenAI-compatible SDK, switching to Kraken is a one-line change — update the base_url, keep everything else the same.
Base URL: https://gammainfra.com · Dashboard: dashboard.gammainfra.com · Status: status.gammainfra.com · Sign up: POST /v1/signup
Quickstart
1. Get an API key
curl -s -X POST https://gammainfra.com/v1/signup \
-H "Content-Type: application/json" \
-d '{"email": "you@example.com", "name": "Your Name"}'
{
"api_key": "sk-kraken-...",
"email": "you@example.com",
"balance_usd": 1.0,
"message": "Welcome to Kraken. Store your API key — it will not be shown again."
}
Store the key. It is shown once. Signup seeds $1.00 of free credit so you can start calling immediately.
2. Make your first call
curl -s -X POST https://gammainfra.com/v1/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer sk-kraken-..." \
-d '{
"model": "kraken/auto",
"messages": [{"role": "user", "content": "Explain transformers in one paragraph."}]
}'
The response uses the same JSON format as OpenAI’s chat completions endpoint. kraken/auto lets the router pick the best model for your prompt.
3. Drop-in replacement
from openai import OpenAI

client = OpenAI(
    api_key="sk-kraken-...",
    base_url="https://gammainfra.com/v1",
)
response = client.chat.completions.create(
    model="kraken/auto",
    messages=[{"role": "user", "content": "Hello"}],
)
print(response.choices[0].message.content)
The same pattern works with LangChain, LlamaIndex, and any other OpenAI-compatible library.
Authentication
Every request (except /v1/signup, /v1/models, and /v1/status) requires a Bearer token:
Authorization: Bearer sk-kraken-...
Keys are prefixed sk-kraken-. The plaintext is only returned on creation — Kraken stores a bcrypt hash. Create additional keys or revoke old ones from the dashboard or via POST /v1/keys / DELETE /v1/keys/{id}.
| Status | Meaning |
|---|---|
| 401 | Missing or invalid API key |
| 402 | Insufficient credits; top up and retry |
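As a minimal sketch of the header described above, a client might build and sanity-check its request headers as follows. The helper name `auth_headers` is hypothetical, and the prefix check is only a local convenience: the server remains the authority on whether a key is valid.

```python
def auth_headers(api_key: str) -> dict[str, str]:
    """Build the request headers for an authenticated Kraken call.

    Hypothetical helper, not part of any Kraken SDK.
    """
    # Kraken keys are documented as starting with "sk-kraken-"; failing early
    # catches the common mistake of pasting a different provider's key.
    if not api_key.startswith("sk-kraken-"):
        raise ValueError("expected a key starting with 'sk-kraken-'")
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
```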
Smart routing
Send model: "kraken/auto" and the router classifies your prompt into one of 10 task types, then dispatches to the best-fit model for that type. If the primary model is unavailable, Kraken falls back through a chain of 3–4 models automatically.
| Task type | When it fires |
|---|---|
| tool_use | Request includes tools or function calls |
| multimodal | A message contains an image |
| code_gen | Keywords: function, code, implement, debug, refactor, regex, unit test |
| math | Keywords: solve, calculate, equation, integral, prove, probability |
| reasoning | Keywords: explain why, analyse, compare, evaluate, strategy, root cause |
| creative | Keywords: poem, story, essay, brainstorm, lyrics, rewrite |
| translation | Keywords: translate, localize, in/to spanish/french/japanese/… |
| extraction | Keywords: extract, parse, classify, format as json, list all, sentiment |
| summarisation | Keywords: summarise, tldr, key points, brief, condense |
| chat | Default when nothing else matches |
Set X-Kraken-Preference to quality (default), cost, or latency to bias the router. To opt out, send X-Kraken-Routing: off and Kraken will route straight to the exact model you named in model.
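To make the table concrete, here is a toy keyword classifier in the spirit of the rules above. It is an illustration only, not Kraken's router: the rule order, the plain substring matching, and the omission of the language-name patterns for translation are all assumptions, and the function names are hypothetical.

```python
# Illustrative only: keyword lists are copied from the table above, but the
# evaluation order and substring matching are assumptions, not Kraken's logic.
KEYWORD_RULES = [
    ("code_gen", ["function", "code", "implement", "debug", "refactor", "regex", "unit test"]),
    ("math", ["solve", "calculate", "equation", "integral", "prove", "probability"]),
    ("reasoning", ["explain why", "analyse", "compare", "evaluate", "strategy", "root cause"]),
    ("creative", ["poem", "story", "essay", "brainstorm", "lyrics", "rewrite"]),
    ("translation", ["translate", "localize"]),  # language-name patterns omitted
    ("extraction", ["extract", "parse", "classify", "format as json", "list all", "sentiment"]),
    ("summarisation", ["summarise", "tldr", "key points", "brief", "condense"]),
]


def _has_image(message: dict) -> bool:
    """True if the message uses OpenAI-style content parts with an image."""
    content = message.get("content")
    return isinstance(content, list) and any(
        isinstance(part, dict) and part.get("type") == "image_url" for part in content
    )


def _text_of(message: dict) -> str:
    """Collect the plain text of a message, whichever content form it uses."""
    content = message.get("content")
    if isinstance(content, str):
        return content
    if isinstance(content, list):
        return " ".join(p.get("text", "") for p in content if isinstance(p, dict))
    return ""


def classify(request: dict) -> str:
    """Return one of the ten task types for a chat completions request body."""
    if request.get("tools") or request.get("functions"):
        return "tool_use"
    messages = request.get("messages", [])
    if any(_has_image(m) for m in messages):
        return "multimodal"
    text = " ".join(_text_of(m) for m in messages).lower()
    for task, keywords in KEYWORD_RULES:
        if any(keyword in text for keyword in keywords):
            return task
    return "chat"  # default when nothing else matches
```

Substring matching is deliberately naive here (for example, "code" also matches "decode"), which is one reason a pinned model plus X-Kraken-Routing: off can be preferable when the task type is already known.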
Model names
Smart aliases (recommended)
| Model name | Behaviour |
|---|---|
| kraken/auto | Picks the best-fit model for your prompt type |
| kraken/fast | Optimises for lowest latency (equivalent to X-Kraken-Preference: latency) |
| kraken/cheap | Optimises for lowest cost (equivalent to X-Kraken-Preference: cost) |
Pin a specific model
Prefix any model with its provider slug:
openai/gpt-5.4
openai/gpt-5.4-mini
openai/gpt-5.4-nano
openai/gpt-5-mini
anthropic/claude-opus-4-6
anthropic/claude-sonnet-4-6
anthropic/claude-haiku-4-5
google/gemini-3.1-pro-preview
google/gemini-3-flash-preview
google/gemini-2.5-pro
google/gemini-2.5-flash
mistral/mistral-large-2512
mistral/mistral-small-2603
mistral/codestral-2508
mistral/devstral-2512
groq/llama-3.3-70b-versatile
groq/llama-3.1-8b-instant
groq/gpt-oss-120b
deepseek/deepseek-chat
deepseek/deepseek-reasoner
grok/grok-4
grok/grok-4-fast
grok/grok-code-fast-1
For the full, authoritative list:
curl -s https://gammainfra.com/v1/models | jq .
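Because every model name follows the provider/model pattern, client code can split a name into its two parts, for instance to compare the requested provider with the X-Kraken-Provider response header. The sketch below assumes the slug never contains a slash; `split_model` is a hypothetical helper.

```python
def split_model(name: str) -> tuple[str, str]:
    """Split 'provider/model' into (provider, model).

    Assumes the provider slug is everything before the first slash.
    """
    provider, sep, model = name.partition("/")
    if not sep or not provider or not model:
        raise ValueError(f"expected 'provider/model', got {name!r}")
    return provider, model


def is_smart_alias(name: str) -> bool:
    """True for the kraken/* aliases, which leave model choice to the router."""
    return split_model(name)[0] == "kraken"
```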
Streaming
Streaming works exactly like OpenAI — set stream: true and read Server-Sent Events. All providers are normalised to the OpenAI SSE format, so your existing code works unchanged.
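# `client` is the OpenAI client created in the Quickstart.
stream = client.chat.completions.create(
    model="kraken/auto",
    messages=[{"role": "user", "content": "Write a haiku about distributed systems."}],
    stream=True,
)
for chunk in stream:
    # Each chunk carries an incremental delta; content can be None on some chunks.
    print(chunk.choices[0].delta.content or "", end="", flush=True)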
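If you are not using an SDK, the OpenAI SSE format can be read by hand. The sketch below assumes the usual OpenAI wire shape (lines of the form `data: {json}` terminated by `data: [DONE]`); the function name `iter_content` is hypothetical and the parser ignores SSE features the format does not normally use, such as multi-line data fields.

```python
import json
from typing import Iterable, Iterator


def iter_content(lines: Iterable[str]) -> Iterator[str]:
    """Yield text fragments from OpenAI-style SSE lines."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # blank separators and SSE comments carry no payload
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:
                yield text
```

Feeding it the decoded lines of a streaming HTTP response and joining the yielded fragments reproduces the full message text.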
Headers
Request headers
| Header | Value | Purpose |
|---|---|---|
| Authorization | Bearer sk-kraken-… | Required |
| Content-Type | application/json | Required |
| X-Kraken-Routing | off | Disable smart routing; use the exact model you named |
| X-Kraken-Preference | quality (default) / cost / latency | Bias the router when using kraken/auto |
Response headers
| Header | Meaning |
|---|---|
| X-Kraken-Request-Id | Correlation ID; include when filing a support request |
| X-Kraken-Provider | Which provider served the response (e.g. openai, anthropic) |
| X-Kraken-Router-Version | v1 today; v2 once the ML router goes live |
| X-Kraken-Logical-Model | Router v2 only: the logical bucket the router picked |
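For logging, it can help to pull the Kraken-specific headers out of a response in one step. This is a small sketch with a hypothetical helper name; it relies only on the fact that HTTP header names are case-insensitive.

```python
from typing import Mapping


def kraken_metadata(headers: Mapping[str, str]) -> dict[str, str]:
    """Return the X-Kraken-* response headers, keyed by lowercase name."""
    return {
        name.lower(): value
        for name, value in headers.items()
        if name.lower().startswith("x-kraken-")
    }
```

With the OpenAI Python SDK, the raw headers should be reachable through `client.chat.completions.with_raw_response.create(...)`, whose result exposes `.headers` and a `.parse()` method for the usual completion object.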
Credits & pricing
- Credits are USD-denominated. $1.00 = 100 credits.
- New accounts start with $1.00 of free credit.
- Kraken charges provider cost + 2%.
- No subscription. Credits never expire.
Approximate cost per 1M tokens
| Model | Input | Output |
|---|---|---|
| kraken/auto (chat default) | ~$0.10 | ~$0.60 |
| openai/gpt-5.4 | $2.00 | $8.00 |
| openai/gpt-5.4-mini | $0.40 | $1.60 |
| anthropic/claude-opus-4-6 | $5.00 | $25.00 |
| anthropic/claude-sonnet-4-6 | $3.00 | $15.00 |
| google/gemini-3.1-pro-preview | $1.25 | $5.00 |
| google/gemini-3-flash-preview | $0.30 | $2.50 |
| deepseek/deepseek-chat | $0.28 | $0.42 |
| groq/llama-3.1-8b-instant | $0.06 | $0.08 |
Costs above are provider list prices. You pay those plus the 2% Kraken fee. For the full cost table and legal terms, see the Terms of Service.
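The pricing rule can be worked through in a few lines. This sketch assumes the 2% fee is applied multiplicatively to the provider cost and ignores any rounding or minimum charge Kraken may apply; the function names are hypothetical.

```python
KRAKEN_FEE = 0.02          # "provider cost + 2%"
CREDITS_PER_USD = 100      # "$1.00 = 100 credits"


def estimate_cost_usd(
    input_tokens: int,
    output_tokens: int,
    input_price_per_m: float,
    output_price_per_m: float,
) -> float:
    """Estimate the charge for one request, given per-1M-token list prices."""
    provider_cost = (
        input_tokens * input_price_per_m + output_tokens * output_price_per_m
    ) / 1_000_000
    return provider_cost * (1 + KRAKEN_FEE)


def usd_to_credits(usd: float) -> float:
    """Convert a USD amount to Kraken credits."""
    return usd * CREDITS_PER_USD
```

For example, 10,000 input tokens and 2,000 output tokens on anthropic/claude-sonnet-4-6 ($3.00 / $15.00) cost $0.06 at list price, so `estimate_cost_usd(10_000, 2_000, 3.00, 15.00)` comes to about $0.0612, or roughly 6.12 credits.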
Check your balance
curl -s https://gammainfra.com/v1/billing/balance \
-H "Authorization: Bearer sk-kraken-..."
{"balance_usd": 0.97, "customer_id": "..."}
Top up
Top up your balance with Stripe. Card data is handled by Stripe — Kraken never sees it.
From the dashboard
Go to dashboard.gammainfra.com → Top up and choose an amount. You’ll be redirected to Stripe’s hosted checkout and back to your dashboard once payment clears.
From the API
curl -s -X POST https://gammainfra.com/v1/billing/checkout \
-H "Authorization: Bearer sk-kraken-..." \
-H "Content-Type: application/json" \
-d '{"amount_usd": 25.0}'
{
"checkout_url": "https://checkout.stripe.com/c/pay/cs_live_...",
"session_id": "cs_live_...",
"amount_usd": 25.0
}
Open checkout_url in a browser and pay. Amount range: $5 – $1000. Your balance updates within seconds of Stripe’s confirmation.
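A client can validate the amount locally before calling the endpoint. The sketch below builds the request with the standard library without sending it; it assumes the $5 and $1000 bounds are inclusive, and `build_checkout_request` is a hypothetical helper.

```python
import json
import urllib.request

MIN_TOPUP_USD = 5.0      # documented lower bound, assumed inclusive
MAX_TOPUP_USD = 1000.0   # documented upper bound, assumed inclusive


def build_checkout_request(api_key: str, amount_usd: float) -> urllib.request.Request:
    """Build (but do not send) a POST /v1/billing/checkout request."""
    if not MIN_TOPUP_USD <= amount_usd <= MAX_TOPUP_USD:
        raise ValueError(
            f"amount_usd must be between {MIN_TOPUP_USD} and {MAX_TOPUP_USD}"
        )
    body = json.dumps({"amount_usd": float(amount_usd)}).encode("utf-8")
    return urllib.request.Request(
        "https://gammainfra.com/v1/billing/checkout",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen` would send it; the response body should then contain the `checkout_url` to open in a browser.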
Bring your own key (BYOK)
Optional. By default Kraken uses its own provider API keys on your behalf — one Kraken key, every model. If you already have a direct relationship with a provider, add your own key and Kraken will route requests to that provider through your key instead.
- Your key is stored encrypted at rest (Fernet).
- You pay that provider directly for requests that use your key; the Kraken fee still applies.
- Revoking or deleting your key falls back to managed routing (if the provider offers managed access) or skips that provider.
- Supported providers: openai, anthropic, google, mistral, groq, deepseek, grok.
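The selection rules in the list above can be summarised as a small decision function. This is only a sketch of the documented behaviour, not Kraken's implementation; the function name, its arguments, and the three return labels are hypothetical.

```python
def credential_source(
    provider: str,
    user_keys: dict[str, str],
    managed_providers: set[str],
) -> str:
    """Decide which credentials serve a request to `provider`.

    Returns "byok" when the user has stored a key for the provider,
    "managed" when Kraken offers managed access, and "skip" otherwise.
    """
    if provider in user_keys:
        return "byok"
    if provider in managed_providers:
        return "managed"
    return "skip"
```

Deleting a stored key corresponds to removing the provider from `user_keys`, after which the same call returns "managed" or "skip".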
Add a key
curl -s -X POST https://gammainfra.com/v1/provider-keys \
-H "Authorization: Bearer sk-kraken-..." \
-H "Content-Type: application/json" \
-d '{"provider_name": "openai", "api_key": "sk-..."}'
List your keys
curl -s https://gammainfra.com/v1/provider-keys \
-H "Authorization: Bearer sk-kraken-..."
Delete a key
curl -s -X DELETE https://gammainfra.com/v1/provider-keys/openai \
-H "Authorization: Bearer sk-kraken-..."
Or manage all of this from dashboard.gammainfra.com → Provider Keys.
Error codes
Error responses use a consistent JSON shape:
{
"error": {
"message": "Human-readable description",
"type": "error_type",
"code": "machine_readable_code",
"request_id": "uuid"
}
}
| Status | Code | Meaning |
|---|---|---|
| 401 | (none) | Missing or invalid API key |
| 402 | insufficient_credits | Account balance can’t cover the request |
| 409 | email_exists | Email already registered (signup) |
| 422 | (none) | Invalid request body |
| 429 | signup_rate_limited | Too many signup attempts (limit: 3 per hour per IP) |
| 503 | providers_down | All providers in the fallback chain failed |
Include the X-Kraken-Request-Id from the response headers if you file a support ticket.
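Since every error shares one JSON shape, a client can map it onto a single exception type. The class and function below are hypothetical names for a minimal sketch; they fall back to the raw body when the response is not the documented JSON.

```python
import json
from typing import Optional


class KrakenAPIError(Exception):
    """Hypothetical exception carrying the fields of Kraken's error object."""

    def __init__(
        self,
        status: int,
        message: str,
        code: Optional[str] = None,
        request_id: Optional[str] = None,
    ) -> None:
        super().__init__(f"HTTP {status}: {message}")
        self.status = status
        self.message = message
        self.code = code
        self.request_id = request_id


def parse_error(status: int, body: str) -> KrakenAPIError:
    """Turn an error response body into a KrakenAPIError."""
    try:
        payload = json.loads(body)
    except ValueError:
        payload = None  # not JSON, e.g. a proxy's plain-text error page
    error = payload.get("error") if isinstance(payload, dict) else None
    if not isinstance(error, dict):
        error = {}
    return KrakenAPIError(
        status,
        error.get("message", body),
        error.get("code"),
        error.get("request_id"),
    )
```

Logging `err.request_id` alongside `err.code` gives support the correlation ID without a second look at the response.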
Rate limits
- Signup: 3 attempts per IP per hour. The 4th attempt returns 429 signup_rate_limited.
- Chat completions: no hard Kraken-side limit today. Provider-side rate limits pass through as 429 responses with any Retry-After header the provider returned.
- If you expect high-volume traffic, email hello@gammainfra.com ahead of time.
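When a 429 passes through from a provider, a client can honour Retry-After and fall back to exponential backoff when the header is missing. This is a sketch under stated assumptions: the helper name, base delay, and cap are illustrative, and only the delay-in-seconds form of Retry-After is parsed (the HTTP-date form falls through to backoff).

```python
from typing import Mapping


def retry_delay(
    headers: Mapping[str, str],
    attempt: int,
    base: float = 1.0,
    cap: float = 30.0,
) -> float:
    """Seconds to wait before retry number `attempt` (0-based)."""
    for name, value in headers.items():
        if name.lower() == "retry-after":
            try:
                return max(0.0, float(value))
            except ValueError:
                break  # HTTP-date form: not handled in this sketch
    # No usable header: exponential backoff, capped.
    return min(cap, base * 2 ** attempt)
```

A retry loop would call `time.sleep(retry_delay(response_headers, attempt))` between attempts and give up after a fixed number of tries.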
Status
Live per-provider uptime, latency, and error counts are published:
- status.gammainfra.com: human-readable HTML dashboard, auto-refreshes every 30 s
- GET /v1/status: same data as JSON, safe to poll from your own monitoring
Both endpoints are public (no auth). Each provider is marked operational, degraded, or outage based on the rolling 24 h request log plus a live health-check ping.
Dashboard
The dashboard at dashboard.gammainfra.com is a single-page app with four tabs:
- Balance — current credit balance, ledger history, top-up button.
- API keys — list, create, and revoke keys.
- Usage — requests/day, requests/model, requests/task-type aggregations.
- Provider keys — add/update/remove BYOK provider API keys.
Support
Email hello@gammainfra.com. Include the X-Kraken-Request-Id response header from any failing request — it lets us trace the exact path the request took through the router.