
Rate Limits

Overview

The Pontotel API implements rate limiting to ensure availability and performance for all clients.

What is Rate Limiting?

Rate limiting restricts the number of requests a client can make within a given time window, so that no single client can exhaust shared capacity.

Current Limits

Environment   Limit            Window   Burst
Sandbox       1000 requests    1 hour   100/min
Production    500 requests     1 hour   50/min
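For a steady workload, both windows matter: 500 requests/hour averages out to one request every 7.2 s, while the 50/min burst allows one every 1.2 s. A minimal client-side pacing sketch (a hypothetical helper, not part of the API) picks the slower of the two:

```python
def min_interval(hourly_limit: int, burst_per_min: int) -> float:
    """Seconds to wait between requests so a steady stream
    stays under both the hourly window and the per-minute burst."""
    per_hour = 3600 / hourly_limit    # spacing implied by the hourly window
    per_minute = 60 / burst_per_min   # spacing implied by the burst window
    return max(per_hour, per_minute)

# Production: 500/hour with a 50/min burst -> one request every 7.2 s
interval = min_interval(500, 50)
```

Sleeping `min_interval` seconds between calls keeps a long-running job under both limits without ever seeing a 429.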

Rate Limit Headers

Each API response includes informational headers:

HTTP
X-RateLimit-Limit: 500
X-RateLimit-Remaining: 487
X-RateLimit-Reset: 1642089600
Header                  Description
X-RateLimit-Limit       Total request limit
X-RateLimit-Remaining   Remaining requests
X-RateLimit-Reset       Unix timestamp when the limit resets
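Since `X-RateLimit-Reset` is a Unix timestamp, a small helper (illustrative, not part of any SDK) can convert it into a wait time:

```python
import time

def seconds_until_reset(reset_epoch: int) -> float:
    """Seconds left until the X-RateLimit-Reset timestamp (never negative)."""
    return max(0.0, reset_epoch - time.time())
```

Comparing against `time.time()` assumes the client clock is reasonably in sync with the server; clamping at zero handles timestamps already in the past.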

429 Response (Too Many Requests)

When the limit is exceeded:

JSON
{
  "error": "rate_limit_exceeded",
  "message": "You have exceeded the request limit",
  "limit": 500,
  "remaining": 0,
  "reset_at": "2025-02-09T15:00:00Z",
  "retry_after": 3600
}

Status Code: 429 Too Many Requests

Additional header:

HTTP
Retry-After: 3600
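A client can pick the wait time from the `Retry-After` header, falling back to the `retry_after` field in the JSON body. A minimal sketch (the fallback order is an assumption, not mandated by the API):

```python
import json

def wait_seconds(headers: dict, body: str) -> int:
    """Prefer the Retry-After header; fall back to the JSON
    retry_after field, then to a default of 60 seconds."""
    payload = json.loads(body)
    return int(headers.get("Retry-After", payload.get("retry_after", 60)))

example_body = '{"error": "rate_limit_exceeded", "retry_after": 3600}'
```

With the example response above, both the header and the body yield a 3600-second wait.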

Best Practices

1. Monitor Headers

Python
import requests

def make_request_with_monitoring(url, headers):
    response = requests.get(url, headers=headers)

    # Check rate limit
    limit = int(response.headers.get('X-RateLimit-Limit', 0))
    remaining = int(response.headers.get('X-RateLimit-Remaining', 0))

    print(f"Rate Limit: {remaining}/{limit} requests remaining")

    # Alert when close to limit
    if remaining < limit * 0.1:  # Less than 10%
        print("⚠️ WARNING: Close to rate limit!")

    return response
JavaScript
async function makeRequestWithMonitoring(url, headers) {
  const response = await fetch(url, { headers });

  const limit = parseInt(response.headers.get('X-RateLimit-Limit') || '0');
  const remaining = parseInt(response.headers.get('X-RateLimit-Remaining') || '0');

  console.log(`Rate Limit: ${remaining}/${limit} requests remaining`);

  if (remaining < limit * 0.1) {
    console.warn('⚠️ WARNING: Close to rate limit!');
  }

  return response;
}

2. Implement Retry with Backoff

Python
import time
from datetime import datetime

import requests

def request_with_retry(url, headers, max_retries=3):
    for attempt in range(max_retries):
        response = requests.get(url, headers=headers)

        if response.status_code == 429:
            retry_after = int(response.headers.get('Retry-After', 60))
            reset_time = datetime.fromtimestamp(
                int(response.headers.get('X-RateLimit-Reset', 0))
            )

            print(f"⏳ Rate limit reached. Waiting {retry_after}s...")
            print(f"   Expected reset: {reset_time}")

            time.sleep(retry_after)
            continue

        return response

    raise RuntimeError("Maximum retries exceeded")
JavaScript
async function requestWithRetry(url, headers, maxRetries = 3) {
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    const response = await fetch(url, { headers });

    if (response.status === 429) {
      const retryAfter = parseInt(response.headers.get('Retry-After') || '60');
      const resetTime = new Date(
        parseInt(response.headers.get('X-RateLimit-Reset') || '0') * 1000
      );

      console.log(`⏳ Rate limit reached. Waiting ${retryAfter}s...`);
      console.log(`   Expected reset: ${resetTime.toLocaleString()}`);

      await new Promise(resolve => setTimeout(resolve, retryAfter * 1000));
      continue;
    }

    return response;
  }

  throw new Error('Maximum retries exceeded');
}
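The examples above wait exactly `Retry-After` seconds, which can synchronize many clients into retrying at the same instant. A common refinement (a general pattern, not something the API requires) is full-jitter exponential backoff:

```python
import random

def backoff_delay(attempt: int, base: float = 1.0, cap: float = 60.0) -> float:
    """Full-jitter backoff: uniform random delay in
    [0, min(cap, base * 2**attempt)], so retries spread out."""
    return random.uniform(0, min(cap, base * 2 ** attempt))
```

Using `max(backoff_delay(attempt), retry_after)` as the sleep respects the server's hint while desynchronizing concurrent clients.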

3. Response Caching

Reduce requests by caching responses:

Python
from datetime import datetime, timedelta

import requests

class PontotelClient:
    def __init__(self):
        self.cache = {}
        self.cache_duration = timedelta(minutes=5)

    def get_with_cache(self, url, headers):
        # Check cache
        if url in self.cache:
            cached_data, cached_time = self.cache[url]
            if datetime.now() - cached_time < self.cache_duration:
                print("✅ Returning from cache")
                return cached_data

        # Make request
        response = requests.get(url, headers=headers)

        # Store in cache
        self.cache[url] = (response.json(), datetime.now())

        return response.json()

4. Batch Requests

Group multiple operations when possible:

Python
# ❌ DON'T DO: Multiple individual requests
for user_id in user_ids:
    get_user(user_id)  # 100 requests!

# ✅ DO: One request with filters
get_users(ids=",".join(user_ids))  # 1 request

5. Efficient Pagination

Use pagination to avoid unnecessary requests:

Python
import requests

def list_all_users(base_url, headers):
    all_users = []
    url = f"{base_url}/usuarios/"

    while url:
        response = requests.get(url, headers=headers)
        data = response.json()

        all_users.extend(data['results'])
        url = data['next']  # Next page or None

        print(f"Processed: {len(all_users)}/{data['count']}")

    return all_users

Increase Limits

If you need higher limits:

  1. Contact commercial support
  2. Describe your use case
  3. Consider a plan upgrade

Enterprise Plans

Enterprise plans offer:

  • Custom rate limits
  • Higher burst limits
  • Guaranteed SLA
  • Priority support

Monitoring

Monitor important metrics:

  • Requests per hour
  • 429 error rate
  • Average response time
  • Rate limit usage percentage
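Two of these metrics — 429 error rate and usage percentage — fall directly out of the response status and headers. A hypothetical tracking helper (not part of any Pontotel SDK) might look like:

```python
def usage_percent(limit: int, remaining: int) -> float:
    """Share of the rate-limit window already consumed,
    from the X-RateLimit-* headers."""
    return 100 * (limit - remaining) / limit if limit else 0.0

class RateLimitMonitor:
    """Counts requests and 429 responses to derive an error rate."""

    def __init__(self):
        self.total = 0
        self.throttled = 0

    def record(self, status_code: int) -> None:
        self.total += 1
        if status_code == 429:
            self.throttled += 1

    @property
    def error_rate(self) -> float:
        return self.throttled / self.total if self.total else 0.0
```

Feeding every response's status code into `record` gives the 429 rate to alert on; `usage_percent` gives the threshold for the 80% alert described below.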

Alerts

Configure alerts for:

  • 80% of rate limit reached
  • Consecutive 429 errors
  • Response time > 2s

Next Steps