Rate limits

Per-token request quotas on the Qyvo REST API and the MCP server, and how to handle 429 responses.

Qyvo applies a per-token request quota on the REST API and the MCP server. The default is generous for normal use; high-volume integrations should design for backoff.

Quota

Surface      Default limit      Window         Headers exposed
/api/v1/*    120 req / token    60 s rolling   X-RateLimit-Limit, X-RateLimit-Remaining, Retry-After (on 429)
/mcp         300 ops / token    60 s rolling   Same as /api/v1/*

The quota counts every authenticated request, including no-op pings.
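
To stay under a rolling-window quota, a client can keep its own count of recent requests. The following is a minimal sketch, not part of Qyvo's API: the class name RollingWindowLimiter and its methods are hypothetical, and it assumes the server counts requests over the trailing 60 seconds. The server's exact accounting may differ, so treat this as a courtesy throttle rather than a guarantee.

```python
import time
from collections import deque


class RollingWindowLimiter:
    """Client-side approximation of a rolling-window quota (hypothetical helper)."""

    def __init__(self, limit=120, window=60.0, clock=time.monotonic):
        self.limit = limit      # requests allowed per window (120 is the REST default)
        self.window = window    # window length in seconds
        self.clock = clock      # injectable clock, which makes the class easy to test
        self._sent = deque()    # timestamps of requests still inside the window

    def wait_time(self):
        """Return how many seconds to wait before the next request fits."""
        now = self.clock()
        # Drop timestamps that have aged out of the window.
        while self._sent and now - self._sent[0] >= self.window:
            self._sent.popleft()
        if len(self._sent) < self.limit:
            return 0.0
        # The window is full: wait until the oldest request ages out.
        return self.window - (now - self._sent[0])

    def record(self):
        """Note that a request was just sent."""
        self._sent.append(self.clock())
```

Before each request, sleep for wait_time() seconds, send the request, then call record(). Pings count too, so record those as well.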

Limits can be raised on a per-account basis for production integrations. Email [email protected] with your token name and expected throughput.

Reading the headers

Every successful response includes:

X-RateLimit-Limit: 120
X-RateLimit-Remaining: 117

When X-RateLimit-Remaining reaches 0, the next request within the same window returns 429 Too Many Requests:

HTTP/1.1 429 Too Many Requests
Retry-After: 23

Retry-After is the number of seconds to wait before retrying.
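
The headers can be read into a small summary before deciding whether to send the next request. This is an illustrative sketch only: read_rate_limit is a hypothetical helper, and it assumes the header values are plain integers as in the examples above. Any header that is missing or not an integer comes back as None.

```python
def read_rate_limit(status_code, headers):
    """Summarize rate-limit state from a response's status code and headers.

    `headers` is any mapping of header name to string value. Pass a
    case-insensitive mapping (such as httpx's response.headers) if the
    header casing might vary.
    """
    def to_int(name):
        value = headers.get(name)
        if value is not None and value.strip().isdigit():
            return int(value)
        return None

    return {
        "limit": to_int("X-RateLimit-Limit"),
        "remaining": to_int("X-RateLimit-Remaining"),
        # Retry-After is only meaningful on a 429 response.
        "retry_after": to_int("Retry-After") if status_code == 429 else None,
    }
```

A caller might slow down when "remaining" gets low, and must wait "retry_after" seconds when it is set.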

Handling 429 in code

A small retry wrapper handles transient throttling: wait for the Retry-After interval when the header is present, fall back to a short delay otherwise, and give up after five attempts.

JavaScript:

async function callQyvo(url, init = {}, attempt = 0) {
  const res = await fetch(url, {
    ...init,
    headers: {
      Authorization: `Bearer ${process.env.QYVO_TOKEN}`,
      ...(init.headers || {}),
    },
  });

  if (res.status === 429 && attempt < 5) {
    // Honor Retry-After (seconds); if it is missing or not a number,
    // back off exponentially: 2 s, 4 s, 8 s, ...
    const retryAfter = Number(res.headers.get('Retry-After'));
    const wait = (retryAfter > 0 ? retryAfter : 2 ** (attempt + 1)) * 1000;
    await new Promise((r) => setTimeout(r, wait));
    return callQyvo(url, init, attempt + 1);
  }

  return res;
}

PHP (Laravel):

use Illuminate\Support\Facades\Http;

$response = Http::withToken(env('QYVO_TOKEN'))
    ->retry(5, 0, function (\Throwable $e, $request) {
        if ($e instanceof \Illuminate\Http\Client\RequestException
            && $e->response->status() === 429) {
            // header() returns '' when the header is absent, so use ?: (not ??)
            // to fall back to 2 seconds.
            $wait = (int) $e->response->header('Retry-After') ?: 2;
            sleep($wait);
            return true;
        }
        return false;
    })
    ->get('https://www.qyvo.io/api/v1/me');
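
Python:

import os
import time

import httpx

def call_qyvo(method, url, **kwargs):
    # Copy the caller's headers so their dict is not mutated.
    headers = dict(kwargs.pop('headers', None) or {})
    headers['Authorization'] = f"Bearer {os.environ['QYVO_TOKEN']}"
    for attempt in range(5):
        r = httpx.request(method, url, headers=headers, **kwargs)
        if r.status_code != 429:
            return r
        # Honor Retry-After (seconds); otherwise back off 2 s, 4 s, 8 s, ...
        retry_after = r.headers.get('Retry-After', '')
        time.sleep(int(retry_after) if retry_after.isdigit() else 2 ** (attempt + 1))
    # Still throttled after five attempts: raise httpx.HTTPStatusError.
    r.raise_for_status()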

Bulk send guidance

To message thousands of contacts, use the Broadcasts flow inside Qyvo rather than raw send-template-message calls. Broadcasts are dispatched through an internal queue that respects Meta's per-number throughput tier and avoids rate-limit storms. Create them from the dashboard or with the MCP create_broadcast tool.

If you really need to send a high volume from your own code:

  • Cap concurrency at 5 simultaneous requests per token
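  • Pace at most 30 sends per second (Meta's per-number tier may be lower; the default quota of 120 requests per 60 s averages only 2 per second, so a higher pace assumes a raised limit)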
  • Use exponential backoff on 429 and 5xx
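
The concurrency cap and pacing rule above can be combined in one small asyncio helper. This is a sketch under assumptions, not a Qyvo-provided client: send_all and its parameters are hypothetical names, and send stands for any coroutine you supply that performs one request and applies its own backoff on 429 and 5xx.

```python
import asyncio


async def send_all(items, send, max_concurrency=5, max_per_second=30):
    """Run send(item) for every item, capped in concurrency and start rate."""
    semaphore = asyncio.Semaphore(max_concurrency)
    interval = 1.0 / max_per_second
    loop = asyncio.get_running_loop()
    next_slot = loop.time()  # earliest time the next send may start

    async def worker(item):
        nonlocal next_slot
        async with semaphore:  # at most max_concurrency sends in flight
            # Reserve the next start slot. Slots are spaced `interval` apart,
            # so no more than max_per_second sends start in any second.
            now = loop.time()
            start_at = max(now, next_slot)
            next_slot = start_at + interval
            await asyncio.sleep(start_at - now)
            return await send(item)

    # gather() returns results in the same order as `items`.
    return await asyncio.gather(*(worker(item) for item in items))
```

Each send reserves its start slot only after acquiring the semaphore, so a backlog of waiting sends cannot burst past the pacing limit when slow requests finish together. Lower max_per_second to match your token's quota or Meta's per-number tier, whichever is stricter.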

Limits we don't enforce yet

  • No daily quota on tokens
  • No request body size cap beyond Meta's own template limits
  • No bandwidth cap on inbox reads

These may be added later — we'll announce them in the changelog at least 30 days before activation.