Kraken API documentation

Last updated: 16 April 2026

Overview

Kraken is a managed LLM API gateway. A single OpenAI-compatible endpoint gives you access to every major LLM provider, and a task-aware router picks the best-fit model for each prompt.

If you already use OpenAI or any OpenAI-compatible SDK, switching to Kraken is a one-line change — update the base_url, keep everything else the same.

Base URL: https://gammainfra.com
Dashboard: dashboard.gammainfra.com · Status: status.gammainfra.com · Sign up: POST /v1/signup

Quickstart

1. Get an API key

curl -s -X POST https://gammainfra.com/v1/signup \
  -H "Content-Type: application/json" \
  -d '{"email": "you@example.com", "name": "Your Name"}'
{
  "api_key": "sk-kraken-...",
  "email": "you@example.com",
  "balance_usd": 1.0,
  "message": "Welcome to Kraken. Store your API key — it will not be shown again."
}

Store the key. It is shown once. Signup seeds $1.00 of free credit so you can start calling immediately.
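The same signup call can be made from Python. This is a minimal sketch using only the standard library; the helper names build_signup_request and signup are illustrative and not part of any Kraken SDK, and the response field api_key is the one shown in the example response above.

```python
import json
import urllib.request

BASE_URL = "https://gammainfra.com"  # base URL from this page


def build_signup_request(email: str, name: str) -> urllib.request.Request:
    """Build (but do not send) the POST /v1/signup request."""
    body = json.dumps({"email": email, "name": name}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/v1/signup",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def signup(email: str, name: str) -> str:
    """Send the signup request and return the API key.

    The key is shown only once, so the caller should store it immediately.
    """
    with urllib.request.urlopen(build_signup_request(email, name)) as resp:
        return json.load(resp)["api_key"]
```

Calling signup("you@example.com", "Your Name") performs the network request; build_signup_request alone is handy for inspecting what will be sent.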

2. Make your first call

curl -s -X POST https://gammainfra.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-kraken-..." \
  -d '{
    "model": "kraken/auto",
    "messages": [{"role": "user", "content": "Explain transformers in one paragraph."}]
  }'

The response follows OpenAI's chat completions format. kraken/auto lets the router pick the best model for your prompt.

3. Drop-in replacement

from openai import OpenAI

client = OpenAI(
    api_key="sk-kraken-...",
    base_url="https://gammainfra.com/v1",
)

response = client.chat.completions.create(
    model="kraken/auto",
    messages=[{"role": "user", "content": "Hello"}],
)
print(response.choices[0].message.content)

The same pattern works with LangChain, LlamaIndex, and any other OpenAI-compatible library.

Authentication

Every request (except /v1/signup, /v1/models, and /v1/status) requires a Bearer token:

Authorization: Bearer sk-kraken-...

Keys are prefixed with sk-kraken-. The plaintext is returned only at creation; Kraken stores just a bcrypt hash. Create additional keys or revoke old ones from the dashboard, or use POST /v1/keys to create a key and DELETE /v1/keys/{id} to revoke one.

Status | Meaning
401 | Missing or invalid API key
402 | Insufficient credits; top up and retry

Smart routing

Send model: "kraken/auto" and the router classifies your prompt into one of 10 task types, then dispatches to the best-fit model for that type. If the primary model is unavailable, Kraken falls back through a chain of 3–4 models automatically.

Task type | When it fires
tool_use | Request includes tools or function calls
multimodal | A message contains an image
code_gen | Keywords: function, code, implement, debug, refactor, regex, unit test
math | Keywords: solve, calculate, equation, integral, prove, probability
reasoning | Keywords: explain why, analyse, compare, evaluate, strategy, root cause
creative | Keywords: poem, story, essay, brainstorm, lyrics, rewrite
translation | Keywords: translate, localize, in/to spanish/french/japanese/…
extraction | Keywords: extract, parse, classify, format as json, list all, sentiment
summarisation | Keywords: summarise, tldr, key points, brief, condense
chat | Default when nothing else matches

Prefer a trade-off? Send X-Kraken-Preference: quality (default), cost, or latency to bias the router.
Want to opt out? Send X-Kraken-Routing: off and Kraken will route straight to the exact model you named in model.
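One way to read the task-type table is as a first-match keyword classifier. The sketch below only illustrates that reading and is not Kraken's router: the precedence (table order), plain substring matching, the image_url check for multimodal content, and the function name classify are all assumptions, and the translation rule covers only the three languages the table names.

```python
# Illustrative only: a first-match classifier mirroring the task-type table.
TRANSLATION_LANGS = ("spanish", "french", "japanese")  # the table's list continues ("…")

KEYWORDS = [  # checked in table order (an assumption)
    ("code_gen", ("function", "code", "implement", "debug", "refactor", "regex", "unit test")),
    ("math", ("solve", "calculate", "equation", "integral", "prove", "probability")),
    ("reasoning", ("explain why", "analyse", "compare", "evaluate", "strategy", "root cause")),
    ("creative", ("poem", "story", "essay", "brainstorm", "lyrics", "rewrite")),
    ("translation", ("translate", "localize")
        + tuple(f"{prep} {lang}" for prep in ("in", "to") for lang in TRANSLATION_LANGS)),
    ("extraction", ("extract", "parse", "classify", "format as json", "list all", "sentiment")),
    ("summarisation", ("summarise", "tldr", "key points", "brief", "condense")),
]


def _text_of(message: dict) -> str:
    """Return the text of a message whose content is a string or a list of parts."""
    content = message.get("content") or ""
    if isinstance(content, str):
        return content
    return " ".join(part.get("text", "") for part in content if isinstance(part, dict))


def _has_image(message: dict) -> bool:
    """True if the message carries an OpenAI-style image_url content part."""
    content = message.get("content")
    return isinstance(content, list) and any(
        isinstance(part, dict) and part.get("type") == "image_url" for part in content
    )


def classify(request: dict) -> str:
    """Guess the task type for a chat completions request body."""
    if request.get("tools") or request.get("functions"):
        return "tool_use"
    messages = request.get("messages", [])
    if any(_has_image(m) for m in messages):
        return "multimodal"
    text = " ".join(_text_of(m) for m in messages).lower()
    for task_type, keywords in KEYWORDS:
        if any(keyword in text for keyword in keywords):
            return task_type
    return "chat"  # default when nothing else matches
```

Running your own prompts through a local sketch like this can help you anticipate which bucket a request is likely to land in, but the X-Kraken-Provider response header remains the only reliable record of what actually served it.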

Model names

Smart aliases (recommended)

Model name | Behaviour
kraken/auto | Picks the best-fit model for your prompt type
kraken/fast | Optimises for lowest latency (equivalent to X-Kraken-Preference: latency)
kraken/cheap | Optimises for lowest cost (equivalent to X-Kraken-Preference: cost)

Pin a specific model

Prefix any model with its provider slug:

openai/gpt-5.4
openai/gpt-5.4-mini
openai/gpt-5.4-nano
openai/gpt-5-mini
anthropic/claude-opus-4-6
anthropic/claude-sonnet-4-6
anthropic/claude-haiku-4-5
google/gemini-3.1-pro-preview
google/gemini-3-flash-preview
google/gemini-2.5-pro
google/gemini-2.5-flash
mistral/mistral-large-2512
mistral/mistral-small-2603
mistral/codestral-2508
mistral/devstral-2512
groq/llama-3.3-70b-versatile
groq/llama-3.1-8b-instant
groq/gpt-oss-120b
deepseek/deepseek-chat
deepseek/deepseek-reasoner
grok/grok-4
grok/grok-4-fast
grok/grok-code-fast-1

For the full, authoritative list:

curl -s https://gammainfra.com/v1/models | jq .
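Because every pinned name has the form provider/model, a list of names is easy to split and group by provider. A small sketch; split_model_name and group_by_provider are hypothetical helpers, and the names you pass in would come from the list above or from the /v1/models response, whose exact JSON shape this page does not specify.

```python
from collections import defaultdict


def split_model_name(name: str) -> tuple[str, str]:
    """Split 'provider/model' into (provider, model)."""
    provider, sep, model = name.partition("/")
    if not sep or not provider or not model:
        raise ValueError(f"expected 'provider/model', got {name!r}")
    return provider, model


def group_by_provider(names: list[str]) -> dict[str, list[str]]:
    """Group model names by provider slug, keeping the input order."""
    groups: dict[str, list[str]] = defaultdict(list)
    for name in names:
        provider, model = split_model_name(name)
        groups[provider].append(model)
    return dict(groups)
```

For example, group_by_provider(["openai/gpt-5.4", "groq/gpt-oss-120b"]) keeps the two models under separate provider keys even though both model names start with gpt.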

Streaming

Streaming works exactly like OpenAI — set stream: true and read Server-Sent Events. All providers are normalised to the OpenAI SSE format, so your existing code works unchanged.

stream = client.chat.completions.create(
    model="kraken/auto",
    messages=[{"role": "user", "content": "Write a haiku about distributed systems."}],
    stream=True,
)
for chunk in stream:
    print(chunk.choices[0].delta.content or "", end="", flush=True)
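If you need the complete text as well as live output, accumulate the deltas as they arrive. This sketch assumes the chunk shape used in the loop above (choices[0].delta.content, which may be None); skipping chunks with an empty choices list is a defensive assumption, and collect_stream is a hypothetical helper.

```python
from typing import Iterable


def collect_stream(chunks: Iterable, echo: bool = True) -> str:
    """Consume a stream of chat completion chunks and return the full text."""
    parts = []
    for chunk in chunks:
        if not chunk.choices:  # some chunks may carry no choices at all
            continue
        text = chunk.choices[0].delta.content or ""
        if echo:
            print(text, end="", flush=True)
        parts.append(text)
    return "".join(parts)
```

Pass the stream object returned by client.chat.completions.create(..., stream=True) as chunks.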

Headers

Request headers

Header | Value | Purpose
Authorization | Bearer sk-kraken-… | Required
Content-Type | application/json | Required
X-Kraken-Routing | off | Disable smart routing; use the exact model you named
X-Kraken-Preference | quality (default) / cost / latency | Bias the router when using kraken/auto
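The two optional headers can be built by a small helper and attached to a single request. A minimal sketch: kraken_headers is a hypothetical function, and the commented usage relies on the OpenAI Python SDK's per-request extra_headers argument.

```python
from typing import Optional

VALID_PREFERENCES = {"quality", "cost", "latency"}


def kraken_headers(preference: Optional[str] = None, routing_off: bool = False) -> dict[str, str]:
    """Return the optional Kraken request headers for one call."""
    headers: dict[str, str] = {}
    if preference is not None:
        if preference not in VALID_PREFERENCES:
            raise ValueError(f"preference must be one of {sorted(VALID_PREFERENCES)}")
        headers["X-Kraken-Preference"] = preference
    if routing_off:
        headers["X-Kraken-Routing"] = "off"
    return headers


# Usage with the client from the quickstart:
# response = client.chat.completions.create(
#     model="kraken/auto",
#     messages=[{"role": "user", "content": "Hello"}],
#     extra_headers=kraken_headers(preference="cost"),
# )
```

Authorization and Content-Type are left out on purpose, since the SDK sets them from api_key and the request body.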

Response headers

Header | Meaning
X-Kraken-Request-Id | Correlation ID; include it when filing a support request
X-Kraken-Provider | Which provider served the response (e.g. openai, anthropic)
X-Kraken-Router-Version | v1 today; v2 once the ML router goes live
X-Kraken-Logical-Model | Router v2 only: the logical bucket the router picked
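To log these values next to each call, pull them out of the response headers. A sketch under the assumption that you can get at the raw headers as a mapping; kraken_metadata is a hypothetical helper, and the commented usage relies on the OpenAI Python SDK's with_raw_response accessor.

```python
from typing import Mapping, Optional

KRAKEN_RESPONSE_HEADERS = (
    "X-Kraken-Request-Id",
    "X-Kraken-Provider",
    "X-Kraken-Router-Version",
    "X-Kraken-Logical-Model",
)


def kraken_metadata(headers: Mapping[str, str]) -> dict[str, Optional[str]]:
    """Extract the Kraken response headers; missing ones map to None."""
    lowered = {key.lower(): value for key, value in headers.items()}
    return {name: lowered.get(name.lower()) for name in KRAKEN_RESPONSE_HEADERS}


# Usage with the client from the quickstart:
# raw = client.chat.completions.with_raw_response.create(
#     model="kraken/auto",
#     messages=[{"role": "user", "content": "Hello"}],
# )
# print(kraken_metadata(raw.headers))
# response = raw.parse()  # the usual chat completion object
```

The lookup is case-insensitive because HTTP header names are, so it works whether a client preserves or lowercases them.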

Credits & pricing

Approximate cost per 1M tokens

Model | Input | Output
kraken/auto (chat default) | ~$0.10 | ~$0.60
openai/gpt-5.4 | $2.00 | $8.00
openai/gpt-5.4-mini | $0.40 | $1.60
anthropic/claude-opus-4-6 | $5.00 | $25.00
anthropic/claude-sonnet-4-6 | $3.00 | $15.00
google/gemini-3.1-pro-preview | $1.25 | $5.00
google/gemini-3-flash-preview | $0.30 | $2.50
deepseek/deepseek-chat | $0.28 | $0.42
groq/llama-3.1-8b-instant | $0.06 | $0.08

Costs above are provider list prices. You pay those plus the 2% Kraken fee. For the full cost table and legal terms, see the Terms of Service.
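As a rough worked example, the cost of one request can be estimated from its token counts. The sketch assumes the 2% fee is applied on top of the provider list price and copies three rows from the table above; estimate_cost_usd is a hypothetical helper, and real charges may differ since the table is approximate.

```python
# (input, output) USD per 1M tokens, copied from the pricing table above.
PRICES_PER_MILLION = {
    "openai/gpt-5.4": (2.00, 8.00),
    "anthropic/claude-sonnet-4-6": (3.00, 15.00),
    "deepseek/deepseek-chat": (0.28, 0.42),
}
KRAKEN_FEE = 0.02  # assumption: 2% added on top of the provider cost


def estimate_cost_usd(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the charge for one request, including the Kraken fee."""
    input_price, output_price = PRICES_PER_MILLION[model]
    provider_cost = (input_tokens * input_price + output_tokens * output_price) / 1_000_000
    return provider_cost * (1 + KRAKEN_FEE)
```

Under these assumptions, 1,000 input tokens and 500 output tokens on openai/gpt-5.4 cost $0.002 + $0.004 = $0.006 at list price, or about $0.00612 with the fee.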

Check your balance

curl -s https://gammainfra.com/v1/billing/balance \
  -H "Authorization: Bearer sk-kraken-..."
{"balance_usd": 0.97, "customer_id": "..."}

Top up

Top up your balance with Stripe. Card data is handled by Stripe — Kraken never sees it.

From the dashboard

Go to dashboard.gammainfra.com, open Top up, and choose an amount. You'll be redirected to Stripe's hosted checkout and back to your dashboard once payment clears.

From the API

curl -s -X POST https://gammainfra.com/v1/billing/checkout \
  -H "Authorization: Bearer sk-kraken-..." \
  -H "Content-Type: application/json" \
  -d '{"amount_usd": 25.0}'
{
  "checkout_url": "https://checkout.stripe.com/c/pay/cs_live_...",
  "session_id": "cs_live_...",
  "amount_usd": 25.0
}

Open checkout_url in a browser and pay. Amount range: $5 – $1000. Your balance updates within seconds of Stripe’s confirmation.
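Validating the amount client-side avoids a wasted round trip. A minimal standard-library sketch: build_checkout_request and create_checkout are hypothetical helpers, and treating the $5 to $1000 range as inclusive at both ends is an assumption.

```python
import json
import urllib.request

BASE_URL = "https://gammainfra.com"
MIN_TOP_UP_USD = 5.0     # documented lower bound (assumed inclusive)
MAX_TOP_UP_USD = 1000.0  # documented upper bound (assumed inclusive)


def build_checkout_request(api_key: str, amount_usd: float) -> urllib.request.Request:
    """Build (but do not send) the POST /v1/billing/checkout request."""
    if not MIN_TOP_UP_USD <= amount_usd <= MAX_TOP_UP_USD:
        raise ValueError(
            f"amount_usd must be between {MIN_TOP_UP_USD} and {MAX_TOP_UP_USD}"
        )
    body = json.dumps({"amount_usd": amount_usd}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/v1/billing/checkout",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def create_checkout(api_key: str, amount_usd: float) -> str:
    """Create a checkout session and return the URL to open in a browser."""
    with urllib.request.urlopen(build_checkout_request(api_key, amount_usd)) as resp:
        return json.load(resp)["checkout_url"]
```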

Bring your own key (BYOK)

Optional. By default Kraken uses its own provider API keys on your behalf — one Kraken key, every model. If you already have a direct relationship with a provider, add your own key and Kraken will route requests to that provider through your key instead.

Add a key

curl -s -X POST https://gammainfra.com/v1/provider-keys \
  -H "Authorization: Bearer sk-kraken-..." \
  -H "Content-Type: application/json" \
  -d '{"provider_name": "openai", "api_key": "sk-..."}'

List your keys

curl -s https://gammainfra.com/v1/provider-keys \
  -H "Authorization: Bearer sk-kraken-..."

Delete a key

curl -s -X DELETE https://gammainfra.com/v1/provider-keys/openai \
  -H "Authorization: Bearer sk-kraken-..."

Or manage all of this from the Provider Keys section of dashboard.gammainfra.com.
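The three provider-key calls differ only in HTTP method, path, and body, so one helper can cover them. A sketch using the standard library; provider_key_request is a hypothetical name, and the request shapes are taken from the curl examples above.

```python
import json
import urllib.request
from typing import Optional

BASE_URL = "https://gammainfra.com"


def provider_key_request(
    api_key: str,
    method: str,
    provider_name: Optional[str] = None,
    provider_api_key: Optional[str] = None,
) -> urllib.request.Request:
    """Build a request to add (POST), list (GET), or delete (DELETE) a provider key."""
    url = f"{BASE_URL}/v1/provider-keys"
    headers = {"Authorization": f"Bearer {api_key}"}
    data = None
    if method == "POST":
        if not provider_name or not provider_api_key:
            raise ValueError("POST needs provider_name and provider_api_key")
        headers["Content-Type"] = "application/json"
        data = json.dumps(
            {"provider_name": provider_name, "api_key": provider_api_key}
        ).encode("utf-8")
    elif method == "DELETE":
        if not provider_name:
            raise ValueError("DELETE needs provider_name")
        url = f"{url}/{provider_name}"
    elif method != "GET":
        raise ValueError(f"unsupported method: {method}")
    return urllib.request.Request(url, data=data, headers=headers, method=method)
```

Send the result with urllib.request.urlopen(...). Avoid logging the request body for POST, since it contains your provider key in plaintext.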

Error codes

Error responses use a consistent JSON shape:

{
  "error": {
    "message": "Human-readable description",
    "type": "error_type",
    "code": "machine_readable_code",
    "request_id": "uuid"
  }
}
Status | Code | Meaning
401 |  | Missing or invalid API key
402 | insufficient_credits | Account balance can't cover the request
409 | email_exists | Email already registered (signup)
422 |  | Invalid request body
429 | signup_rate_limited | Too many signup attempts (limit: 3 per hour per IP)
503 | providers_down | All providers in the fallback chain failed

Got a 503? Every model in the fallback chain for that task type failed at the same time, which is usually transient. Retry with exponential backoff. Include X-Kraken-Request-Id from the response headers if you file a support ticket.
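Retrying with exponential backoff can be sketched as follows. Everything here is illustrative: ProvidersDown is a hypothetical exception your own code would raise on a 503 (with the OpenAI SDK, for instance, when a caught openai.APIStatusError has status_code 503), and the attempt count, delays, and jitter are arbitrary defaults rather than Kraken recommendations.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class ProvidersDown(Exception):
    """Hypothetical: raised by the caller's code when Kraken returns 503."""


def retry_on_503(
    call: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 0.5,
    max_delay: float = 8.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run call(), retrying on ProvidersDown with exponential backoff and jitter."""
    for attempt in range(max_attempts):
        try:
            return call()
        except ProvidersDown:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Add up to 50% random jitter so many clients do not retry in lockstep.
            sleep(delay + random.uniform(0, delay / 2))
    raise ValueError("max_attempts must be at least 1")
```

Only the 503 case is retried here. A 401, 402, or 422 will not succeed on a second attempt, so those are better surfaced immediately.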

Rate limits

Status

Live per-provider uptime, latency, and error counts are published:

Both endpoints are public (no auth). Each provider is marked operational, degraded, or outage based on the rolling 24 h request log plus a live health-check ping.

Dashboard

The dashboard at dashboard.gammainfra.com is a single-page app with four tabs:

Support

Email hello@gammainfra.com. Include the X-Kraken-Request-Id response header from any failing request — it lets us trace the exact path the request took through the router.

For policy and billing terms, see Terms and Privacy.