API Rate Limit Guide

How to handle rate limits and optimize API usage

Exchange Rate Limits

Binance

  • 1200 requests/minute
  • 10 orders/second
  • 100,000 orders/day

Coinbase

  • 10 requests/second
  • 30 requests/minute (advanced)

Kraken

  • 15 requests/second
  • Varies by endpoint

OKX

  • 45 requests/2 seconds
  • 20 orders/second
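
The request limits above use different time windows, so it can help to convert them to a common unit before comparing them or configuring a client. The following is a minimal sketch of that arithmetic; the REQUEST_LIMITS dictionary and the min_interval helper are illustrative names, not part of any exchange SDK, and the figures are simply copied from the lists above.

```python
# Request limits copied from the lists above as (requests, window_seconds).
# REQUEST_LIMITS and min_interval are illustrative names (assumptions).
REQUEST_LIMITS = {
    "binance": (1200, 60),
    "coinbase": (10, 1),
    "kraken": (15, 1),
    "okx": (45, 2),
}


def min_interval(exchange):
    """Smallest gap, in seconds, between evenly spaced requests."""
    requests_allowed, window = REQUEST_LIMITS[exchange]
    return window / requests_allowed


for name in REQUEST_LIMITS:
    gap = min_interval(name)
    print(f"{name}: {1 / gap:.1f} requests/second, {gap * 1000:.0f} ms apart")
```

Spacing requests evenly at this interval stays under the limit without relying on bursts, at the cost of some latency when only a few requests are pending.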

Exponential Backoff

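import random
import time


class RateLimitError(Exception):
    """Placeholder for the error your API client raises on HTTP 429."""


def make_request_with_retry(func, max_retries=5):
    for attempt in range(max_retries):
        try:
            return func()
        except RateLimitError:
            if attempt == max_retries - 1:
                break  # no retries left, so don't sleep again
            # Double the delay on each attempt (1s, 2s, 4s, ...) and add
            # jitter so parallel clients don't all retry at the same moment.
            wait_time = (2 ** attempt) + random.uniform(0, 1)
            print(f"Rate limited. Waiting {wait_time:.2f}s")
            time.sleep(wait_time)

    raise RuntimeError("Max retries exceeded")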

Request Caching

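import json
import time
from functools import lru_cache

import redis
import requests

cache_duration = 60  # seconds


@lru_cache(maxsize=1000)
def _cached_request(url, time_bucket):
    # lru_cache never expires entries on its own. time_bucket changes every
    # cache_duration seconds, so older entries simply stop being hit.
    response = requests.get(url, timeout=10)
    response.raise_for_status()
    return response.json()


def cached_request(url):
    # Cache responses for up to 60 seconds
    return _cached_request(url, int(time.time() // cache_duration))


# Or use Redis
r = redis.Redis()


def get_with_cache(key, func, ttl=60):
    cached = r.get(key)
    if cached is not None:
        return json.loads(cached)

    result = func()
    r.setex(key, ttl, json.dumps(result))
    return result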

💡 Optimization Tips

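  • Batch requests: Prefer endpoints that return many items in one call over one call per item
  • WebSocket over REST: For real-time data, subscribe to a WebSocket stream instead of polling REST
  • Cache aggressively: Store prices and order books locally and reuse them until they are stale
  • Rate limiting client: Implement your own client-side rate limiter so requests are throttled before the exchange rejects them
  • Prioritize critical data: Don't poll everything equally; poll time-sensitive data more often than data that rarely changes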
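
One way to act on the last tip is to give each kind of data its own polling interval and let a priority queue decide what is due next. The sketch below only builds the schedule, without making any requests; build_poll_schedule and the stream names and intervals are hypothetical and chosen purely for illustration.

```python
import heapq


def build_poll_schedule(intervals, horizon):
    """Return (due_time, name) pairs, in order, for polls due before `horizon` seconds."""
    # Every stream is due immediately at t=0; ties are broken by name.
    queue = [(0.0, name) for name in sorted(intervals)]
    heapq.heapify(queue)
    schedule = []
    while queue and queue[0][0] < horizon:
        due, name = heapq.heappop(queue)
        schedule.append((due, name))
        # Re-queue the stream for its next poll.
        heapq.heappush(queue, (due + intervals[name], name))
    return schedule


# Hypothetical intervals in seconds: critical data is polled more often.
intervals = {"open_orders": 1, "ticker": 2, "symbol_info": 5}

for due, name in build_poll_schedule(intervals, horizon=5):
    print(f"t={due:.0f}s poll {name}")
```

In a real client, each scheduled poll would still pass through the rate limiter, so the schedule sets relative priority rather than guaranteeing timing.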

Token Bucket Algorithm

import threading
import time


class RateLimiter:
    def __init__(self, rate, capacity):
        self.rate = rate          # tokens added per second
        self.capacity = capacity  # maximum burst size
        self.tokens = capacity
        # monotonic() is unaffected by system clock adjustments
        self.last_update = time.monotonic()
        self.lock = threading.Lock()

    def acquire(self):
        with self.lock:
            # Refill tokens for the time elapsed since the last call
            now = time.monotonic()
            elapsed = now - self.last_update
            self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
            self.last_update = now

            if self.tokens >= 1:
                self.tokens -= 1
                return True
            return False

    def wait_for_token(self):
        while not self.acquire():
            # Sleep roughly the time it takes to refill one token
            time.sleep(1 / self.rate)


def make_api_request():
    pass  # placeholder: replace with a real API call


# Usage: 10 requests per second
limiter = RateLimiter(rate=10, capacity=10)

for _ in range(100):
    limiter.wait_for_token()
    make_api_request()

⚠️ Common Mistakes

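  • Retrying immediately and repeatedly after being rate limited (this can get you banned)
  • Not checking for HTTP 429 (Too Many Requests) responses
  • Sharing one API key across multiple applications, so they all draw on the same quota
  • Ignoring rate limit headers, such as Retry-After, that tell you how long to wait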
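
Several of these mistakes come down to not reading what the server says in a 429 response. The sketch below parses the standard HTTP Retry-After header, which may hold either a number of seconds or an HTTP date. The helper name retry_after_seconds and its default value are assumptions made for illustration, and whether a given exchange sends this header at all should be checked against its documentation.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def retry_after_seconds(headers, default=1.0, now=None):
    """Seconds to wait per the Retry-After header, or `default` if absent or unparseable."""
    # `headers` is treated as a plain dict here, so the lookup is case-sensitive.
    value = headers.get("Retry-After")
    if value is None:
        return default
    value = value.strip()
    if value.isdigit():
        return float(value)  # delay given directly in seconds
    try:
        when = parsedate_to_datetime(value)  # delay given as an HTTP date
    except (TypeError, ValueError):
        return default
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return max(0.0, (when - now).total_seconds())


print(retry_after_seconds({"Retry-After": "30"}))
print(retry_after_seconds({}))
```

The returned value can replace the computed wait time in a retry loop whenever the server supplies one, with exponential backoff as the fallback.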