Rate Limits
Documentation on API rate limits, response header interpretation, and best practices for building resilient clients.
To ensure stability, prevent abuse, and provide fair usage for all customers, rate limits are applied to certain API groups. When a limit is exceeded, the API responds with an HTTP 429 Too Many Requests error.
This guide explains how rate limits work, how headers in responses are interpreted, and best practices for handling retries.
Rate limits are applied at different scopes (e.g., per organisation, per user, per IP address). The "Scope" column in the table below is the definitive source for which entity a specific limit is applied to.
For example, a limit with a scope of "Per organisation" means that all requests from the entire organisation (tenant) share a single request pool for that API group.
Note: In the context of "Per organisation" limits, "organisation" and "tenant" are used interchangeably.
Current Rate Limits
If an API group is not listed here, it is not currently rate-limited. The "Scope" column defines what the limit is applied to.
| API group | Rate Limit | Scope |
|---|---|---|
| User Admin API | 600 req/minute | Per organisation |
| Legacy Public Post API | 300 req/minute | Per organisation |
Rate Limit Headers
Responses from a rate-limited API group include HTTP headers that indicate the current limit status. The implementation relies on the IETF draft "RateLimit header fields for HTTP".
When a 429 Too Many Requests error is received, these headers indicate when to retry.
Which Limit is Active?
These headers define the current quota and remaining balance.
- RateLimit-Limit: The quota(s) for the current time window(s).
  - Example: RateLimit-Limit: 10;w=1, 300;w=60 means the limit is 10 requests per second and 300 requests per minute.
- RateLimit-Remaining: The number of requests remaining in the window that is closest to being exhausted.
- RateLimit-Reset: The number of seconds until the window resets.
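The multi-window format of RateLimit-Limit can be split into individual quotas on the client side. The following is a minimal sketch, not an official parser: the function name parse_ratelimit_limit is hypothetical, and it assumes the value always follows the quota;w=seconds pattern shown in the example above.

```python
from typing import List, Optional, Tuple


def parse_ratelimit_limit(value: str) -> List[Tuple[int, Optional[int]]]:
    """Split a RateLimit-Limit value into (quota, window_seconds) pairs.

    Assumes the "quota;w=seconds" pattern from the example above.
    The window is None when an item carries no w parameter.
    """
    policies = []
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue
        quota_text, *params = [part.strip() for part in item.split(";")]
        window = None
        for param in params:
            name, _, raw = param.partition("=")
            if name.strip() == "w":
                window = int(raw)
        policies.append((int(quota_text), window))
    return policies


# The example value from this section: 10 per second and 300 per minute.
print(parse_ratelimit_limit("10;w=1, 300;w=60"))  # [(10, 1), (300, 60)]
```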
Determining Retry Timing
When rate-limited, the headers indicate when the next request can be sent.
Golden Rule: The RateLimit-Remaining header should be checked. If it is 0, the number of seconds specified in RateLimit-Reset must elapse before making a new request.
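The rule above can be expressed as a small helper. This is an illustrative sketch only: the function name seconds_until_next_request is hypothetical, headers are assumed to arrive as a plain string-to-string mapping, and a response without both headers is treated as not rate-limited.

```python
from typing import Mapping


def seconds_until_next_request(headers: Mapping[str, str]) -> int:
    """Return how long to wait before the next request, in seconds.

    Returns 0 when quota remains or when the headers are absent
    (assumption: a missing header means the group is not rate-limited).
    """
    # HTTP header names are case-insensitive, so normalise before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    remaining = lowered.get("ratelimit-remaining")
    reset = lowered.get("ratelimit-reset")
    if remaining is None or reset is None:
        return 0
    if int(remaining) > 0:
        return 0
    return max(0, int(reset))


# Quota exhausted: wait for the number of seconds given by RateLimit-Reset.
print(seconds_until_next_request(
    {"RateLimit-Remaining": "0", "RateLimit-Reset": "20"}
))  # 20
```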
Header Examples
Example: A successful 200 OK response
This example shows 7 requests remaining in the current 60-second window.
Example: A 429 Too Many Requests response
This example shows 0 requests remaining. The client must wait 20 seconds (from RateLimit-Reset) before retrying.
Best Practices
Proper handling of 429 errors is critical for client stability. Retrying immediately will lead to additional errors.
The Recommended Retry Strategy: Exponential Backoff + Jitter
Simply waiting for RateLimit-Reset is insufficient. If all clients retry at the exact moment the window resets, they will be rate-limited again (this is known as a "thundering herd").
The recommended solution is exponential backoff (waiting progressively longer after each failure) combined with jitter (adding a small, random delay to break up synchronized retries).
Client-side retry logic should implement the following:
- Check Headers: When a 429 is received, the RateLimit-Reset header should be read. This represents the minimum server-required wait time (ServerWaitSeconds).
- Calculate Client Backoff: On each consecutive failure, the client's own wait time should be doubled (e.g., 1s, 2s, 4s, 8s...). This is the "exponential backoff."
- Add Jitter: A small, random amount of time (e.g., +/- 20%) should be added to this backoff period. This prevents all clients from retrying at the same microsecond.
- Wait and Retry: The client should wait for the longer of ServerWaitSeconds and (ClientBackoff + Jitter).
- Cap Retries: Retries should stop after a maximum number of attempts (e.g., 10) to avoid infinite loops.
- Reset: After a successful request, the retry attempt counter should be reset back to zero.
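The steps above can be sketched as follows. This is one possible implementation under stated assumptions, not a reference client: compute_wait and send_with_retries are hypothetical names, the send callable is assumed to return a (status code, headers) pair, and the 1-second base, 20% jitter, and 10-retry cap are the example values from the list.

```python
import random
import time
from typing import Callable, Mapping, Tuple

# Hypothetical response shape: (status code, response headers).
Response = Tuple[int, Mapping[str, str]]


def compute_wait(failures: int, server_wait: float, base: float = 1.0,
                 jitter_ratio: float = 0.2,
                 rng: Callable[[], float] = random.random) -> float:
    """Return the longer of the server wait and the jittered client backoff."""
    backoff = base * (2 ** failures)                   # 1s, 2s, 4s, 8s, ...
    jitter = backoff * jitter_ratio * (2 * rng() - 1)  # within +/- 20% of backoff
    return max(server_wait, backoff + jitter)


def send_with_retries(send: Callable[[], Response], max_retries: int = 10,
                      sleep: Callable[[float], None] = time.sleep) -> Response:
    """Call send() until it stops returning 429 or the retry cap is reached."""
    # The counter is local, so it starts at zero again for every new request.
    failures = 0
    while True:
        status, headers = send()
        if status != 429:
            return status, headers
        if failures >= max_retries:
            raise RuntimeError("rate limited: retry budget exhausted")
        # Header names are case-insensitive; a missing header counts as 0 seconds.
        lowered = {name.lower(): value for name, value in headers.items()}
        server_wait = float(lowered.get("ratelimit-reset", 0))
        sleep(compute_wait(failures, server_wait))
        failures += 1
```

Passing sleep and rng as parameters is a design choice for this sketch: it lets the waiting behaviour be checked in tests without real delays.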
Be Proactive, Not Just Reactive: Clients should not only react to 429s. The RateLimit-Remaining header should be read on successful 200 OK responses. If the remaining count is low (e.g., < 10%), the request rate should be proactively slowed down.
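Proactive slowing can take many forms. The sketch below shows one of them, spreading the remaining quota evenly over the time left in the window. The function name proactive_delay and the even-spreading policy are assumptions made for illustration, as is the requirement that limit is the quota of the same window that RateLimit-Remaining refers to. The 10% threshold is the example value from the paragraph above.

```python
def proactive_delay(remaining: int, limit: int, reset_seconds: float,
                    threshold: float = 0.10) -> float:
    """Return the seconds to pause before the next request.

    Returns 0.0 while the remaining share of the quota is at or above
    the threshold. Below it, the remaining requests are spread evenly
    across the seconds left in the window.
    """
    if limit <= 0 or remaining / limit >= threshold:
        return 0.0
    if remaining <= 0:
        # Nothing left: wait for the window to reset.
        return float(reset_seconds)
    return reset_seconds / remaining


# 20 of 300 requests left with 40 seconds to go: pause 2 seconds per request.
print(proactive_delay(remaining=20, limit=300, reset_seconds=40))  # 2.0
```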