Rate limits
Voxy applies two independent limits to every key: a burst budget that smooths short spikes, and a daily quota that caps total volume. Both return 429 Too Many Requests when exhausted. Pair retries with idempotency keys so that a retried request never duplicates a side effect.
Per-key budget
Each API key gets 600 requests per minute by default, shared across all endpoints. Bursts above the limit are rejected immediately — Voxy does not queue.
Need more? Enterprise plans lift the per-minute ceiling on request. The per-workspace dialing concurrency cap is separate; it's governed by your plan's channel allowance, not the API budget.
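
Because requests above the limit are rejected rather than queued, it can help to throttle on the client before sending. The sketch below is an illustration only: `createWindowLimiter` is a hypothetical helper, not part of Voxy or its SDK, and it assumes the default budget of 600 requests per 60 seconds. The server's window may not line up with the client's clock, so handling 429 responses is still necessary.

```javascript
// Hypothetical client-side limiter (not part of any Voxy SDK).
// Tracks request timestamps in a sliding window and tells the caller
// how many milliseconds to wait before sending the next request.
function createWindowLimiter(limit = 600, windowMs = 60_000) {
  const sent = []; // timestamps (ms) of requests inside the current window

  return {
    // Returns 0 and records the request if it may go now;
    // otherwise returns the wait in milliseconds and records nothing.
    reserve(now = Date.now()) {
      // Drop timestamps that have aged out of the window.
      while (sent.length > 0 && now - sent[0] >= windowMs) sent.shift();
      if (sent.length < limit) {
        sent.push(now);
        return 0;
      }
      // Window is full: the oldest request frees a slot once it ages out.
      return sent[0] + windowMs - now;
    },
  };
}
```

Call `reserve()` before each request; if it returns a positive number, wait that long and call it again. A single limiter instance should be shared by all callers that use the same API key, since the budget is shared across endpoints.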
Response headers
Every API response (success and failure) advertises the current budget via the IETF RateLimit-* headers:
```http
HTTP/1.1 200 OK
RateLimit-Limit: 600
RateLimit-Remaining: 593
RateLimit-Reset: 42
RateLimit-Policy: 600;w=60
```

| Header | Meaning |
|---|---|
| RateLimit-Limit | The budget cap for the current window. |
| RateLimit-Remaining | Calls left in the current window. |
| RateLimit-Reset | Seconds until the window resets to full. |
| RateLimit-Policy | Machine-readable cap + window (e.g. 600;w=60 = 600 req / 60 s). |
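
As a rough sketch of how a client might read these headers, the function below turns them into numbers. `parseRateLimit` is a hypothetical name, and the parsing of RateLimit-Policy assumes the `600;w=60` shape shown above; other policy parameters, if Voxy ever sends them, are ignored.

```javascript
// Parse the RateLimit-* headers into numbers.
// `headers` is anything with a get(name) method, such as the Headers
// object on a fetch Response. Missing headers come back as null.
function parseRateLimit(headers) {
  const num = (name) => {
    const raw = headers.get(name);
    if (raw === null || raw === undefined || raw.trim() === '') return null;
    return Number(raw);
  };

  // RateLimit-Policy looks like "600;w=60": the cap, then the window in seconds.
  const policy = headers.get('RateLimit-Policy') ?? '';
  const match = /^\s*(\d+)\s*;\s*w=(\d+)/.exec(policy);

  return {
    limit: num('RateLimit-Limit'),
    remaining: num('RateLimit-Remaining'),
    resetSeconds: num('RateLimit-Reset'),
    policy: match
      ? { limit: Number(match[1]), windowSeconds: Number(match[2]) }
      : null,
  };
}
```

A caller could pause for `resetSeconds` when `remaining` reaches 0, which avoids spending a request just to receive a 429.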
Daily quotas
On top of the per-minute budget, each key can carry an optional daily quota, set at key creation and unlimited by default. Quotas count completed requests (2xx and 4xx responses) against a window that resets at UTC midnight; 5xx responses do not consume quota. Once the quota is exhausted, requests return 429 QUOTA_EXCEEDED until the next UTC midnight.
Quota counters are visible in the workspace UI under Settings → API keys; the same data is queryable via the upcoming /v1/keys/:id/usage endpoint (scope usage:read).
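
For scheduling work around the daily window, the time left until the reset can be computed locally. This is a small sketch that assumes the window resets at exactly 00:00:00 UTC, as described above; `msUntilUtcMidnight` is a hypothetical helper name, and the Retry-After header on a 429 remains the authoritative signal.

```javascript
// Milliseconds from `now` until the next UTC midnight,
// which is when the daily quota window is assumed to reset.
function msUntilUtcMidnight(now = new Date()) {
  const nextMidnight = Date.UTC(
    now.getUTCFullYear(),
    now.getUTCMonth(),
    now.getUTCDate() + 1, // Date.UTC normalizes day overflow into the next month
  );
  return nextMidnight - now.getTime();
}
```

A batch job that receives QUOTA_EXCEEDED could use this value to reschedule itself instead of polling.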
Handling 429
A 429 always carries a Retry-After header (in seconds). Honor it: a request sent before that interval elapses will be throttled again.
```http
HTTP/1.1 429 Too Many Requests
Retry-After: 12
Content-Type: application/json

{
  "success": false,
  "code": "RATE_LIMITED",
  "error": "Rate limit exceeded. Retry in 12 seconds.",
  "details": { "limit": 600, "windowSeconds": 60 }
}
```

The minimal Node retry loop:
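```javascript
// Retry on 429, honoring Retry-After. After maxAttempts tries, the last
// response is returned as-is so the caller can inspect it.
async function callWithRetry(fn, maxAttempts = 5) {
  for (let attempt = 1; ; attempt++) {
    const res = await fn();
    if (res.status !== 429 || attempt >= maxAttempts) return res;
    // Retry-After is in seconds; fall back to 1 s if it is missing or malformed.
    const seconds = Number(res.headers.get('retry-after'));
    const retryAfter = Number.isFinite(seconds) && seconds > 0 ? seconds : 1;
    await new Promise((r) => setTimeout(r, retryAfter * 1000));
  }
}
```

Distinguish the two 429 shapes by the code field: RATE_LIMITED means slow down, and the same key will work again within seconds; QUOTA_EXCEEDED means wait until the next UTC midnight or upgrade the plan. The official TypeScript SDK implements this retry policy out of the box.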
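
The distinction between the two codes could be expressed in client code roughly as follows. This is a sketch, not the SDK's actual implementation: `planFor429` is a hypothetical helper that takes the parsed JSON body and the raw Retry-After value, and treating an unknown code as retryable is an assumption made here.

```javascript
// Hypothetical helper: decide what to do with a 429 based on the parsed JSON
// body and the Retry-After header value (a string of seconds, or null).
function planFor429(body, retryAfterHeader) {
  const seconds = Number(retryAfterHeader);
  // Fall back to 1 s if the header is missing or malformed.
  const delayMs = Number.isFinite(seconds) && seconds > 0 ? seconds * 1000 : 1000;

  switch (body?.code) {
    case 'RATE_LIMITED':
      // Burst budget exhausted: the same key works again shortly.
      return { action: 'retry', delayMs };
    case 'QUOTA_EXCEEDED':
      // Daily quota exhausted: stop and surface the error instead of sleeping in-process.
      return { action: 'stop', delayMs };
    default:
      // Unknown or missing code (assumption): treat as transient and retry.
      return { action: 'retry', delayMs };
  }
}
```

A retry loop would call this after reading the response body and only sleep when `action` is `retry`; on `stop` it can report the quota error to the caller along with the suggested delay.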