What is a Rate Limiter

A rate limiter restricts the number of incoming HTTP requests/REST API calls from a client to a server, so that excessive traffic cannot overload the server.
Eg: only 2 HTTP POSTs/sec, 10 account creations allowed from the same IP address, 300 tweets/3 hours.
AWS API Gateway is a fully managed service that supports rate limiting, SSL termination, authentication, IP whitelisting, serving static content, etc.
Benefits of Rate Limiting
Prevents resource starvation caused by Denial of Service (DoS) attacks
Limiting excess requests means fewer servers (cost reduction)

1. Requirements

Functional

CRUD (Create=POST, Read=GET, Update=PUT, Delete=DELETE)
1. Limit excessive requests (criteria: UserId, IP address, etc.)
2. Inform the user when their request is dropped

Non Functional

S3: (Scalable, Secure, SOA), L3: (Latency, Load, Logging), A3: (Accurate, Available, Authenticated), C2: (Cache), R2: (Reliable, Redundant), F2: (Fast, Fault Tolerant)
1. Latency should be low, i.e., a throttled user should be informed quickly
2. Low memory usage
3. Highly reliable (i.e., fault tolerant)

2. HLD

Rate limiting can be implemented using different algorithms, and each of them has distinct pros and cons.

Rate Limiting Algorithms

Token Bucket (used by Amazon, Stripe)
- Description: Tokens are added to a bucket at a fixed refill rate; each incoming request consumes one token, and requests are dropped when the bucket is empty.
- Advantages: Easy to implement, memory efficient, allows a burst of traffic for short periods, i.e., requests go through while tokens are present, then requests stop until the bucket refills.
- Disadvantages: The two parameters of the algorithm, bucket size and token refill rate, might be challenging to tune.

Leaky Bucket (used by Shopify)
- Description: Requests are put into a fixed-size FIFO queue and processed at a constant outflow rate; requests arriving at a full queue are dropped.
- Advantages: Memory efficient given the limited queue size; requests are processed at a fixed rate, which smooths out bursty traffic.
- Disadvantages: The two parameters of the algorithm, queue size and outflow rate, might not be easy to tune properly.
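
A minimal single-process sketch of the token bucket idea in Python; the class name and the capacity/refill values are illustrative assumptions, not from the source.

import time

class TokenBucket:
    def __init__(self, capacity, refill_rate):
        self.capacity = capacity          # max tokens the bucket holds (burst size)
        self.refill_rate = refill_rate    # tokens added per second
        self.tokens = capacity
        self.last_refill = time.monotonic()

    def allow(self):
        now = time.monotonic()
        # Refill tokens in proportion to elapsed time, capped at the bucket size.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last_refill) * self.refill_rate)
        self.last_refill = now
        if self.tokens >= 1:
            self.tokens -= 1              # consume one token for this request
            return True
        return False                      # bucket empty: request is rate limited

# Usage: roughly 2 requests/sec on average, with bursts of up to 4.
bucket = TokenBucket(capacity=4, refill_rate=2)
print(bucket.allow())                     # True while tokens remain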

Req1. Limit excessive requests (criteria: UserId, IP address, etc.)

Rate limiting is done using one of the rate limiting algorithms above.
The rate limiter stores the rules on disk. Workers frequently pull rules from the disk and store them in the cache.
Counters
Counters keep track of how many requests are sent from the same user, IP address, etc. If the counter is larger than the limit, the request is disallowed.

For 1 million users:
  5 million counters, if each user is rate limited on 5 parameters

UserId      TCP SYNs received/sec (connect requests)   HTTP GETs received
  1                     10                                   5
  5                     1000                                 5000      // pumping huge traffic
          

Where to store counters? Redis
Option 1: File on hard disk (slow)
Option 2: In-memory store (Redis)
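
As a sketch of option 2, a per-user counter can live in Redis; the key format, limit, and window values below are illustrative assumptions, and the example uses the redis-py client.

import redis

r = redis.Redis(host="localhost", port=6379)

def is_allowed(user_id, limit=5, window_seconds=60):
    key = f"rl:{user_id}"
    count = r.incr(key)                 # atomic increment of the per-user counter
    if count == 1:
        r.expire(key, window_seconds)   # start the time window on the first request
    return count <= limit               # reject once the counter exceeds the limit

print(is_allowed("user-1"))             # True until the limit is exceeded in the window
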
Rate Limiting Rules
Rules define how many requests are allowed for a particular category.
Rules are stored on the disk. Workers frequently pull rules from the disk and store them in the cache.

Examples:

domain: messaging
descriptors:
  - key: message_type
    value: marketing                # Only 5 marketing messages/day.
    rate_limit:
      unit: day
      requests_per_unit: 5

domain: auth
descriptors:
  - key: auth_type
    value: login                    # Only 5 login requests/minute
    rate_limit:
      unit: minute
      requests_per_unit: 5      
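
A sketch of how a worker might load such a rule file into its cache and look up the limit for a descriptor; the file name auth_rules.yaml is hypothetical, and the example assumes PyYAML (pip install pyyaml) with field names matching the examples above.

import yaml

with open("auth_rules.yaml") as f:        # hypothetical file holding the auth example above
    rules = yaml.safe_load(f)

def find_limit(rules, key, value):
    # Return the rate_limit block for a matching descriptor, or None if no rule applies.
    for d in rules["descriptors"]:
        if d["key"] == key and d["value"] == value:
            return d["rate_limit"]        # e.g. {"unit": "minute", "requests_per_unit": 5}
    return None

print(find_limit(rules, "auth_type", "login"))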
        
Architecture & Flow
Flow:
1. The client sends a request; the request goes to the rate limiter middleware first.
2. The rate limiter middleware loads rules from the cache, and fetches the counter and last request timestamp from the Redis cache.
3. Based on the counter, the rate limiter decides:
  Forward the request to the API server, if it is not rate limited
  Return 429 Too Many Requests, if the request is rate limited
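
A sketch of the middleware decision in step 3, reusing the is_allowed() counter check sketched earlier; the request/response shapes and the forwarding stub are illustrative assumptions.

def forward_to_api_server(request):
    # Stand-in for proxying the request to the real API servers.
    return {"status": 200, "body": "handled " + request["path"]}

def rate_limit_middleware(request):
    if is_allowed(request["user_id"]):                    # counter below the limit
        return forward_to_api_server(request)             # not rate limited
    return {"status": 429, "body": "Too Many Requests"}   # rate limited

print(rate_limit_middleware({"user_id": "user-1", "path": "/tweets"}))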

Req2. Inform the user when their request is dropped

The user is informed via HTTP response headers:
- X-Ratelimit-Remaining: The remaining number of allowed requests within the window.
- X-Ratelimit-Limit: How many calls the client can make per time window.
- X-Ratelimit-Retry-After: The number of seconds to wait before making another request without being throttled.
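
A sketch of a 429 response carrying these headers; the header names follow the list above, while the limit/window values and the response shape are illustrative assumptions.

import time

def too_many_requests_response(limit, window_seconds, window_start):
    retry_after = max(int(window_start + window_seconds - time.time()), 0)
    return {
        "status": 429,
        "headers": {
            "X-Ratelimit-Limit": str(limit),              # allowed calls per window
            "X-Ratelimit-Remaining": "0",                 # nothing left in this window
            "X-Ratelimit-Retry-After": str(retry_after),  # seconds until the window resets
        },
        "body": "Too Many Requests",
    }

print(too_many_requests_response(limit=5, window_seconds=60, window_start=time.time()))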

Issues

1. Race condition

Race condition when reading the counter from Redis:
• Read the counter value from Redis.
• Check if (counter + 1) exceeds the threshold.
• If not, increment the counter value by 1 in Redis.
- Assume the counter value in Redis is 3. If two requests read the counter concurrently before either writes the value back, each increments its copy by one and writes it back without seeing the other's update.
- Both requests (threads) believe they have the correct counter value, 4. However, the correct counter value should be 5.
How to avoid the race condition?
Lua scripts and the sorted-set data structure in Redis; both let the read-check-increment sequence run atomically.
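
A sketch of the Lua-script approach using redis-py; the script body, key format, limit, and window are illustrative assumptions. The point is that EVAL runs the whole script atomically on the Redis server, so two concurrent requests cannot both act on a stale counter value.

import redis

r = redis.Redis(host="localhost", port=6379)

# Read, check, and increment the counter in one atomic server-side step.
LUA_INCR_IF_BELOW_LIMIT = """
local current = tonumber(redis.call('GET', KEYS[1]) or '0')
if current + 1 > tonumber(ARGV[1]) then
  return 0                               -- over the limit: reject
end
current = redis.call('INCR', KEYS[1])
if current == 1 then
  redis.call('EXPIRE', KEYS[1], ARGV[2]) -- start the window on the first request
end
return 1                                 -- allowed
"""

def is_allowed_atomic(user_id, limit=5, window_seconds=60):
    allowed = r.eval(LUA_INCR_IF_BELOW_LIMIT, 1, f"rl:{user_id}", limit, window_seconds)
    return allowed == 1

print(is_allowed_atomic("user-1"))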

2. Synchronization issue