Rate limits
Per-token request quotas on the Qyvo REST API and the MCP server, and how to handle 429 responses.
Qyvo applies a per-token request quota on the REST API and the MCP server. The defaults (120 requests per minute on the REST API, 300 operations per minute on MCP) are enough for normal use; high-volume integrations should be designed to back off when throttled.
Quota
| Surface | Default limit | Window | Headers exposed |
|---|---|---|---|
| /api/v1/* | 120 req / token | 60 s rolling | X-RateLimit-Limit, X-RateLimit-Remaining, Retry-After (on 429) |
| /mcp | 300 ops / token | 60 s rolling | Same |
The quota counts every authenticated request, including no-op pings.
Limits can be raised on a per-account basis for production integrations. Email [email protected] with your token name and expected throughput.
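To stay under the quota before the server has to throttle you, a client can track its own request timestamps. The sketch below mirrors the default REST quota from the table (120 requests per 60 s rolling window). `RollingWindowLimiter` is a hypothetical helper, not part of any Qyvo SDK, and the server's exact window accounting is an assumption, so 429 handling is still needed.

```python
import time
from collections import deque


class RollingWindowLimiter:
    """Client-side guard approximating a rolling-window quota (hypothetical helper)."""

    def __init__(self, limit=120, window=60.0, clock=time.monotonic):
        self.limit = limit      # max requests allowed inside one window
        self.window = window    # window length in seconds
        self.clock = clock      # injectable clock, handy for tests
        self._sent = deque()    # timestamps of requests still inside the window

    def wait_time(self):
        """Return seconds to wait until one more request fits in the window."""
        now = self.clock()
        # Drop timestamps that have aged out of the window.
        while self._sent and now - self._sent[0] >= self.window:
            self._sent.popleft()
        if len(self._sent) < self.limit:
            return 0.0
        # Window is full: wait until the oldest request ages out.
        return self.window - (now - self._sent[0])

    def record(self):
        """Note that a request was just sent."""
        self._sent.append(self.clock())
```

Before each call, sleep for `wait_time()` seconds, send the request, then call `record()`. Use a separate limiter per token, since the quota is per token.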
Reading the headers
Every successful response includes:
X-RateLimit-Limit: 120
X-RateLimit-Remaining: 117
When X-RateLimit-Remaining reaches 0, the next request returns 429 Too Many Requests:
HTTP/1.1 429 Too Many Requests
Retry-After: 23
Retry-After is the number of seconds to wait before retrying.
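As a rough illustration of reading these headers in client code, the sketch below parses them into a small record. `RateLimitState` and `read_rate_limit` are hypothetical names; the only assumptions about the API are the three header names shown above and that Retry-After carries whole seconds.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class RateLimitState:
    limit: Optional[int]        # X-RateLimit-Limit, if present
    remaining: Optional[int]    # X-RateLimit-Remaining, if present
    retry_after: Optional[int]  # seconds to wait; only read on 429 responses


def read_rate_limit(status_code: int, headers: Mapping[str, str]) -> RateLimitState:
    """Extract quota information from a response's status and headers."""

    def as_int(name: str) -> Optional[int]:
        value = headers.get(name)
        # Ignore missing or non-numeric values rather than raising.
        if value is not None and value.strip().isdigit():
            return int(value)
        return None

    return RateLimitState(
        limit=as_int("X-RateLimit-Limit"),
        remaining=as_int("X-RateLimit-Remaining"),
        retry_after=as_int("Retry-After") if status_code == 429 else None,
    )
```

A plain dict is case-sensitive, so pass the header mapping from your HTTP client (for example `response.headers` in httpx, which is case-insensitive) rather than a hand-built dict with different casing.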
Handling 429 in code
A simple exponential-backoff wrapper handles transient throttling cleanly. Always honor Retry-After if present.
// Node.js 18+ (built-in fetch)
async function callQyvo(url, init = {}, attempt = 0) {
  const res = await fetch(url, {
    ...init,
    headers: {
      Authorization: `Bearer ${process.env.QYVO_TOKEN}`,
      ...(init.headers || {}),
    },
  });
  if (res.status === 429 && attempt < 5) {
    // Honor Retry-After (seconds) when present; otherwise back off exponentially.
    const retryAfter = Number(res.headers.get('Retry-After'));
    const wait = (retryAfter > 0 ? retryAfter : 2 ** attempt) * 1000;
    await new Promise((r) => setTimeout(r, wait));
    return callQyvo(url, init, attempt + 1);
  }
  return res;
}
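// PHP (Laravel HTTP client)
use Illuminate\Http\Client\RequestException;
use Illuminate\Support\Facades\Http;

$response = Http::withToken(env('QYVO_TOKEN'))
    ->retry(5, 0, function (\Throwable $e, $request) {
        // Retry only on 429, sleeping for Retry-After seconds first.
        if ($e instanceof RequestException && $e->response->status() === 429) {
            // header() returns '' when the header is absent, so ?: supplies the 2 s fallback.
            $wait = (int) ($e->response->header('Retry-After') ?: 2);
            sleep($wait);
            return true;
        }
        return false;
    })
    ->get('https://www.qyvo.io/api/v1/me');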
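# Python (httpx)
import os
import time

import httpx


def call_qyvo(method, url, **kwargs):
    # Copy the caller's headers so their dict is not mutated.
    headers = dict(kwargs.pop('headers', {}))
    headers['Authorization'] = f"Bearer {os.environ['QYVO_TOKEN']}"
    for attempt in range(5):
        r = httpx.request(method, url, headers=headers, **kwargs)
        if r.status_code != 429:
            return r
        # Honor Retry-After (seconds) when present; otherwise back off exponentially.
        time.sleep(int(r.headers.get('Retry-After', 2 ** attempt)))
    # Still throttled after five attempts: raise httpx.HTTPStatusError.
    r.raise_for_status()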
Bulk send guidance
Broadcasts to thousands of contacts should go through the Broadcasts flow inside Qyvo, not raw send-template-message calls. Broadcasts are dispatched through an internal queue that respects Meta's per-number throughput tier and avoids rate-limit storms. Create them from the dashboard or with the MCP create_broadcast tool.
If you really need to send a high volume from your own code:
- Cap concurrency at 5 simultaneous requests per token
- Pace at most 30 sends per second (Meta's per-number tier may be lower)
- Use exponential backoff on 429 and 5xx
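The concurrency cap and pacing from the bullets above can be combined in one small asyncio dispatcher. This is a minimal sketch, not Qyvo's own client: `send_all` and the `send` coroutine you pass in are hypothetical, and the defaults (5 concurrent, 30 per second) are simply the figures listed above. Backoff on 429 and 5xx would live inside `send`.

```python
import asyncio


async def send_all(items, send, max_concurrency=5, max_per_second=30):
    """Await `send(item)` for every item, limiting concurrency and launch rate."""
    semaphore = asyncio.Semaphore(max_concurrency)
    interval = 1.0 / max_per_second  # minimum gap between launches

    async def run(item):
        try:
            return await send(item)
        finally:
            semaphore.release()  # free the slot even if send() raised

    tasks = []
    for item in items:
        await semaphore.acquire()  # wait for a free concurrency slot
        tasks.append(asyncio.create_task(run(item)))
        await asyncio.sleep(interval)  # keep launches at or below the pace
    # Results come back in the same order as `items`.
    return await asyncio.gather(*tasks)
```

Acquiring the semaphore before launching each task, rather than inside it, keeps queued work from bursting out all at once when slots free up. Lower `max_per_second` if your Meta per-number tier or your token's quota is below 30 sends per second.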
Limits we don't enforce yet
- No daily quota on tokens
- No request body size cap beyond Meta's own template limits
- No bandwidth cap on inbox reads
These may be added later; we'll announce them in the changelog at least 30 days before activation.
