Rate Limits

Effective Date: June 1, 2023

A rate limit is the number of API calls that can be made within a given time window. Actions such as rapidly updating configuration settings, aggressive polling, or making highly concurrent API calls may result in your account being rate limited.

All API requests are subject to rate limits. Real-time rate limit usage statistics are described in headers that are included with most API responses. When requests get blocked, responses with HTTP status 429 (Too Many Requests) are returned.

Each API endpoint, or group of API endpoints, has its own rate limit policy.

Messaging API Rate Limits

Endpoint                                         Rate Limit Policy
POST /v2/emails                                  200 rps, per account
POST /v2/sms                                     200 rps, per account
POST /v2/verify/verifications                    200 rps, per account
POST /v2/verify/verificationChecks               200 rps, per account
POST /v2/voices                                  200 rps, per account
POST /v2/whatsapp/messages                       200 rps, per sender
POST /v2/whatsapp/messages/sendDirectly          80 rps, per sender

Most endpoints are rate limited to 200 rps (requests per second) per account.

Usage of the two WhatsApp messaging endpoints is measured separately, but you should not call both of them at a high rate at the same time. The /v2/whatsapp/messages endpoint accepts requests at up to 200 rps, while YCloud submits these messages to the Meta API at a lower rate of 80 rps. If you also send messages through the /v2/whatsapp/messages/sendDirectly endpoint with the same sender (business phone number) at the same time, messages may fail because of the Meta API rate limit.
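One way to stay under a per-sender limit is to throttle on the client before a request is sent. The following is a minimal sketch of a per-sender token bucket, not part of any YCloud SDK: the class names, the injectable clock, and the sample sender number are all hypothetical, and the rate of 80 is taken from the sendDirectly row in the table above.

```python
import time
from typing import Callable, Dict


class TokenBucket:
    """Client-side token bucket: refills `rate` tokens per second up to `capacity`."""

    def __init__(self, rate: float, capacity: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        self.tokens = capacity
        self.updated = clock()

    def try_acquire(self) -> bool:
        """Take one token if available; return False when the caller should wait."""
        now = self.clock()
        elapsed = now - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


class PerSenderLimiter:
    """Keeps one bucket per sender (business phone number)."""

    def __init__(self, rate: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.rate = rate
        self.clock = clock
        self.buckets: Dict[str, TokenBucket] = {}

    def try_acquire(self, sender: str) -> bool:
        bucket = self.buckets.get(sender)
        if bucket is None:
            bucket = TokenBucket(self.rate, self.rate, self.clock)
            self.buckets[sender] = bucket
        return bucket.try_acquire()


# Hypothetical usage: share one limiter across both WhatsApp endpoints.
limiter = PerSenderLimiter(rate=80)
if limiter.try_acquire("+15550100"):
    pass  # send the request here
```

Sharing a single limiter across both WhatsApp endpoints for the same sender keeps their combined rate bounded, which is the situation the paragraph above warns about.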

Management API Rate Limits

Unless documented otherwise, most APIs are classified as Management APIs. They follow the same rate limit policy and share one quota, which means requests made to these endpoints are counted together.

The rate limit policy is 200 rps and 10000 rph (requests per hour), per account. It applies to the following endpoints:

  • /v2/balance
  • /v2/webhookEndpoints/*
  • /v2/whatsapp/businessAccounts/*
  • /v2/whatsapp/phoneNumbers/*
  • /v2/whatsapp/templates/*
  • and others that are not documented

Retry-After Header

When you experience throttling, YCloud returns an HTTP 429 response with a Retry-After header indicating how long in seconds you should wait before retrying or making a new request.

We recommend that you at least respond to the Retry-After header. If you detect a Retry-After header in any response, wait until that amount of time has passed before sending another request. Doing so helps you experience fewer enforced delays. Keep in mind that when the header arrives on a response with status 200, the request itself succeeded, so you don't need to retry that request.
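The recommendation above can be sketched as a small helper that reads the header and decides whether the same request needs to be sent again. This is an illustrative sketch only: the function names are hypothetical, and it assumes Retry-After carries a number of seconds, as described in this section.

```python
from typing import Mapping, Optional, Tuple


def retry_after_seconds(headers: Mapping[str, str]) -> Optional[float]:
    """Return the Retry-After delay in seconds, or None if absent or unparseable."""
    for name, value in headers.items():
        if name.lower() == "retry-after":
            try:
                return max(0.0, float(value.strip()))
            except ValueError:
                return None
    return None


def plan_next_step(status: int, headers: Mapping[str, str]) -> Tuple[float, bool]:
    """Return (seconds_to_wait, resend_same_request).

    The wait applies to whatever request is sent next. Only a throttled
    request (HTTP 429) needs to be sent again.
    """
    delay = retry_after_seconds(headers) or 0.0
    resend = status == 429
    return delay, resend
```

A caller would sleep for the returned delay before its next request, and put the original request back in its queue only when the second value is true.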

If possible, we further recommend that you monitor RateLimit-Remaining and RateLimit-Limit headers as described below.

Rate Limit Headers

In addition to the Retry-After header in the response of throttled requests, YCloud also returns the IETF RateLimit headers in certain conditions to help you manage rate limiting.

Header name           Description
RateLimit-Limit       A non-negative integer indicating the maximum amount of activity that the server is willing to accept from what it identifies as the client (e.g., based upon your account or sender) during a time window.
RateLimit-Policy      Indicates the quota policies currently associated with the client. Its value is informative. For example, a quota policy of 100 quota units per minute is expressed as 100;w=60. Other parameters are allowed and can be regarded as comments.
RateLimit-Remaining   The remaining quota units associated with the RateLimit-Limit.
RateLimit-Reset       The number of seconds until the quota associated with the RateLimit-Limit resets.

Note: These headers are currently in beta and subject to change. When the headers were adopted, the IETF specification was still a draft. The current implementation is based on draft-06 of that specification. The specification may change before it is final, and we will adapt to those changes in the future.
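As an illustration of monitoring these headers, the sketch below slows a client down once the remaining quota runs low, instead of waiting for a 429. The function name and the 10% threshold are assumptions chosen for the example, not values recommended by YCloud, and the headers are treated as optional because they are only returned in certain conditions.

```python
from typing import Mapping


def pacing_delay(headers: Mapping[str, str], low_water: float = 0.1) -> float:
    """Suggest how many seconds to pause before the next request.

    Returns 0.0 while more than `low_water` of the quota is left, or when
    the RateLimit headers are missing or cannot be parsed.
    """
    lowered = {name.lower(): value for name, value in headers.items()}
    try:
        limit = int(lowered["ratelimit-limit"])
        remaining = int(lowered["ratelimit-remaining"])
        reset = float(lowered["ratelimit-reset"])
    except (KeyError, ValueError):
        return 0.0
    if limit <= 0 or remaining > limit * low_water:
        return 0.0
    if remaining <= 0:
        # Quota is exhausted: wait for the window to reset.
        return reset
    # Spread the remaining requests evenly over the rest of the window.
    return reset / remaining
```

Spreading the remaining quota over the time left in the window is only one possible policy; a stricter client could stop sending as soon as the threshold is crossed.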

Example for throttled responses:

HTTP/2 429
Content-Type: application/json
Retry-After: 1800
RateLimit-Limit: 10000
RateLimit-Policy: 200;w=1;burst=200;algorithm=token_bucket;level=account;scope=management_api, 10000;w=3600;algorithm=fixed_window;level=account;scope=management_api
RateLimit-Remaining: 0
RateLimit-Reset: 1800

In this case, the endpoint is rate limited to 200 requests per second and 10000 requests per hour (i.e., 3600 seconds), per account. The request is rejected because it exceeds the hourly limit. The limit resets in 1800 seconds, and you should not try again until that amount of time has passed. The rate limit policies apply to Management APIs, so requests to other endpoints classified as Management APIs count against the same quota.
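To make the policy string in the example easier to read programmatically, the sketch below splits it into quota values and parameters. It is a deliberately naive parser with a hypothetical function name: it assumes no parameter value contains a comma or semicolon, which holds for the example above but is not guaranteed by the IETF draft, and since the header value is informative it should be used for logging or diagnostics rather than enforcement.

```python
from typing import Dict, List, Tuple


def parse_rate_limit_policy(value: str) -> List[Tuple[int, Dict[str, str]]]:
    """Split a RateLimit-Policy value into (quota, parameters) pairs.

    Assumption: parameter values are plain tokens without "," or ";".
    """
    policies: List[Tuple[int, Dict[str, str]]] = []
    for item in value.split(","):
        parts = [part.strip() for part in item.strip().split(";")]
        if not parts[0]:
            continue  # skip empty list members
        quota = int(parts[0])
        params: Dict[str, str] = {}
        for part in parts[1:]:
            key, _, val = part.partition("=")
            params[key.strip()] = val.strip()
        policies.append((quota, params))
    return policies


# The policy value from the throttled response shown above.
EXAMPLE_POLICY = (
    "200;w=1;burst=200;algorithm=token_bucket;level=account;scope=management_api, "
    "10000;w=3600;algorithm=fixed_window;level=account;scope=management_api"
)

for quota, params in parse_rate_limit_policy(EXAMPLE_POLICY):
    print(f"{quota} requests per {params['w']} s ({params.get('algorithm', 'unknown')})")
```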