Rate Limits
Lumyx AI implements rate limiting to ensure fair usage and maintain service quality. Learn how rate limits work and how to handle them in your applications.
How Rate Limits Work
Lumyx AI uses two types of rate limits to control API usage:
Per-Minute Limit
Maximum number of requests allowed per minute. Resets every 60 seconds.
Daily Limit
Maximum number of requests allowed per calendar day. Resets at midnight UTC.
You can configure custom rate limits when creating API keys. Limits can be set between 1 and 1,000 requests per minute and between 1 and 10,000 requests per day.
In addition to rate limits, API usage is limited by your account balance. If your balance reaches zero, API requests will fail with a 402 Payment Required error regardless of your rate limit status.
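Because balance exhaustion and rate limiting surface as distinct HTTP status codes (402 and 429), client code can branch on them to decide whether a retry is worthwhile. A minimal sketch in Python — the helper name and category strings are our own, not part of any Lumyx AI SDK:

```python
def classify_api_error(status_code):
    """Map an HTTP status code from the API to a coarse action category."""
    if status_code == 402:
        # Balance exhausted: retrying won't help until the account is topped up.
        return "insufficient_balance"
    if status_code == 429:
        # Rate limited: safe to retry after backing off.
        return "rate_limited"
    if status_code >= 500:
        # Server-side error: usually transient, retry with backoff.
        return "server_error"
    return "other"
```

Only the `rate_limited` and `server_error` cases are sensible retry candidates; a 402 should stop the retry loop and surface a billing message to the user.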
Rate Limit Headers
Every API response includes headers that inform you about your current rate limit status:
Response Headers
X-RateLimit-Limit-Minute
integer
The maximum number of requests allowed per minute for your API key.
X-RateLimit-Remaining-Minute
integer
The number of requests remaining in the current minute window.
X-RateLimit-Reset-Minute
integer
Unix timestamp when the minute rate limit window resets.
X-RateLimit-Limit-Day
integer
The maximum number of requests allowed per day for your API key.
X-RateLimit-Remaining-Day
integer
The number of requests remaining in the current day.
X-RateLimit-Reset-Day
integer
Unix timestamp when the daily rate limit resets (midnight UTC).
```http
HTTP/1.1 200 OK
Content-Type: application/json
X-RateLimit-Limit-Minute: 60
X-RateLimit-Remaining-Minute: 45
X-RateLimit-Reset-Minute: 1699896976
X-RateLimit-Limit-Day: 1000
X-RateLimit-Remaining-Day: 847
X-RateLimit-Reset-Day: 1699920000
```
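These headers let a client pause proactively instead of waiting for a 429. A small sketch in Python, assuming the headers are available as a dict-like object (the function name is illustrative):

```python
import time

def seconds_until_minute_reset(headers, now=None):
    """Return how long to sleep before the per-minute window resets,
    or 0.0 if requests remain in the current window."""
    now = time.time() if now is None else now
    remaining = int(headers.get("X-RateLimit-Remaining-Minute", "1"))
    reset_at = int(headers.get("X-RateLimit-Reset-Minute", "0"))
    if remaining > 0:
        return 0.0
    # Window exhausted: wait until the Unix timestamp in the reset header.
    return max(0.0, reset_at - now)
```

Calling `time.sleep(seconds_until_minute_reset(response.headers))` after each request keeps a client just under its per-minute limit without ever triggering a 429.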
When Rate Limits Are Exceeded
When you exceed your rate limit, the API returns a 429 Too Many Requests status code:
```http
HTTP/1.1 429 Too Many Requests
Content-Type: application/json
Retry-After: 23

{
  "error": {
    "message": "Rate limit exceeded. Too many requests in the current minute.",
    "type": "rate_limit_exceeded",
    "param": null,
    "code": "rate_limit_exceeded"
  }
}
```
Additional Headers on Rate Limit Exceeded
Retry-After
integer
Number of seconds to wait before making another request.
Handling Rate Limits in Your Code
JavaScript (with exponential backoff)
```javascript
async function makeAPIRequest(url, options, maxRetries = 3) {
  for (let attempt = 1; attempt <= maxRetries; attempt++) {
    const response = await fetch(url, options);

    // Check rate limit headers
    const remaining = response.headers.get('X-RateLimit-Remaining-Minute');
    console.log(`Requests remaining: ${remaining}`);

    if (response.status === 429) {
      const retryAfter = response.headers.get('Retry-After');
      const waitTime = retryAfter ? parseInt(retryAfter, 10) * 1000 : Math.pow(2, attempt) * 1000;
      console.log(`Rate limited. Waiting ${waitTime}ms before retry...`);

      if (attempt < maxRetries) {
        await new Promise(resolve => setTimeout(resolve, waitTime));
        continue;
      }
      throw new Error('Rate limit exceeded. Max retries reached.');
    }

    return await response.json();
  }
}
```
Python (with requests)
```python
import requests
import time

def make_api_request(url, headers, data, max_retries=3):
    for attempt in range(1, max_retries + 1):
        response = requests.post(url, headers=headers, json=data)

        # Check rate limit headers
        remaining = response.headers.get('X-RateLimit-Remaining-Minute')
        print(f"Requests remaining: {remaining}")

        if response.status_code == 429:
            retry_after = response.headers.get('Retry-After')
            wait_time = int(retry_after) if retry_after else 2 ** attempt
            print(f"Rate limited. Waiting {wait_time}s...")

            if attempt < max_retries:
                time.sleep(wait_time)
                continue
            else:
                raise Exception("Rate limit exceeded")

        return response.json()
```
Best Practices
Monitor Usage
- Always check rate limit headers in responses
- Use the analytics dashboard to track usage patterns
- Set up alerts when approaching limits
- Plan request timing to avoid peak usage
Implement Retry Logic
- Use exponential backoff for retries
- Respect the Retry-After header
- Add jitter to prevent thundering herd
- Limit maximum retry attempts
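The "exponential backoff with jitter" pattern above can be sketched in a few lines. This uses "full jitter" — a random delay between zero and the exponential cap — and the `base` and `cap` constants are illustrative defaults, not values prescribed by the API:

```python
import random

def backoff_delay(attempt, base=1.0, cap=30.0):
    """Full-jitter exponential backoff: pick a random delay in
    [0, min(cap, base * 2**attempt)] seconds for the given retry attempt."""
    return random.uniform(0, min(cap, base * (2 ** attempt)))
```

Randomizing the delay spreads out retries from many clients that were rate-limited at the same moment, which is what prevents the thundering-herd effect.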
Optimize Requests
- Batch multiple operations when possible
- Cache responses when appropriate
- Use conversation_id for multi-turn chats
- Choose appropriate models for your use case
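Response caching can be as simple as an in-memory store with a time-to-live. A minimal sketch — the class name and default TTL are our own choices, not part of any Lumyx AI SDK:

```python
import time

class TTLCache:
    """Tiny in-memory cache: entries expire `ttl` seconds after insertion."""

    def __init__(self, ttl=60.0):
        self.ttl = ttl
        self._store = {}

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.time() >= expires_at:
            # Entry expired; drop it and report a miss.
            del self._store[key]
            return None
        return value

    def set(self, key, value):
        self._store[key] = (value, time.time() + self.ttl)
```

Checking the cache before issuing a request means repeated identical prompts don't consume your rate limit or your balance.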
Handle Errors Gracefully
- Provide user-friendly error messages
- Log rate limit information for debugging
- Implement fallback mechanisms
- Don't retry indefinitely
Custom Rate Limits
When creating API keys, you can configure custom rate limits based on your needs:
| Use Case | Per Minute | Per Day | Recommended For |
|---|---|---|---|
| Development | 10-30 | 100-500 | Testing, prototyping, personal projects |
| Production (Small) | 60-100 | 1,000-2,500 | Small applications, MVP launches |
| Production (Medium) | 200-500 | 5,000-7,500 | Growing businesses, moderate traffic |
| Production (Large) | 750-1,000 | 10,000 | Enterprise applications, high traffic |
If you need higher rate limits than the maximum configurable values, please contact our support team to discuss enterprise options.