Throttling Web API Calls

Sign outside Wallington, England (CC-BY by anemoneprojectors)

From Amazon to Zillow, thousands of sites provide access to data via an API. At FilterPlay we use lots of e-commerce APIs to retrieve product data and update prices used in our comparison engine. Our back-end system updates millions of items every day, and fortunately many of these API calls and updates can be parallelized. Most APIs impose rate limits to help ensure the service remains available and responsive. It’s important to throttle your API calls to stay within the limits defined by each service.

Using a Semaphore to Limit Concurrency

Most programming languages provide synchronization primitives to control access to a resource shared by multiple threads. For example, the lock keyword in C# restricts execution of a block of code to a single thread at a time. A semaphore generalizes this by giving a fixed number of threads concurrent access to a resource. However, most web API limits also include a time window. For example, the BestBuy e-commerce API specifies that developers make no more than 5 calls per second. Since a web request can finish in under a second, it’s not enough to limit the number of concurrent calls using a semaphore. The following example illustrates a semaphore which is set to allow only 5 concurrent workers. We’ll create 6 worker threads which perform 300ms of “work” after entering the semaphore:

static void DoWork(int taskId)
{
    DateTime started = DateTime.Now;
    Thread.Sleep(300);  // simulate work
    Console.WriteLine(
        "Task {0} started {1}, completed {2}",
        taskId, started.ToString("ss.fff"), DateTime.Now.ToString("ss.fff"));
}

static void StandardSemaphoreTest()
{
    using (SemaphoreSlim pool = new SemaphoreSlim(5))
    {
        for (int i = 1; i <= 6; i++)
        {
            Thread t = new Thread(new ParameterizedThreadStart((taskId) =>
            {
                pool.Wait();  // blocks while 5 workers hold the semaphore
                try { DoWork((int)taskId); }
                finally { pool.Release(); }
            }));
            t.Start(i);
        }
        Thread.Sleep(2000); // give all the threads a chance to finish
    }
}

// Task 1 started 51.229, completed 51.540
// Task 2 started 51.229, completed 51.540
// Task 3 started 51.258, completed 51.558
// Task 4 started 51.258, completed 51.558
// Task 5 started 51.260, completed 51.560
// Task 6 started 51.540, completed 51.840

Note that Task 6 starts immediately after Task 1 completes and exits the semaphore. The simulated work only takes 300ms, so all six workers easily finish in under a second, exceeding our limit of 5 per second. One solution would be to sleep for a second after every request. However, blocking a worker after it’s done using the resource isn’t a good idea. In our simple example that’s not obvious because the thread exits after performing its work on the shared resource. In a real scenario, however, you’ll call a web API to obtain some data and then process the results. It’s important that you don’t do the post-processing while holding a lock. We also shouldn’t block that work just to ensure a subsequent caller doesn’t exceed our limit.

The solution is to couple a semaphore with a time span which must elapse before the caller can acquire a lease on the resource. I created a TimeSpanSemaphore class which internally uses a queue of time stamps to remember when previous workers finished.
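To make the idea concrete, here is a minimal sketch of a semaphore coupled with a time window. It is not the TimeSpanSemaphore source: the class name WindowThrottle, its fields, and the seeding trick are assumptions made for illustration. It keeps one completion time stamp per free slot, and a caller who acquires a slot waits until a full window has passed since that stamp before running.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical sketch; NOT the TimeSpanSemaphore source from the post.
public sealed class WindowThrottle : IDisposable
{
    private readonly SemaphoreSlim _pool;
    private readonly TimeSpan _window;
    private readonly object _gate = new object();
    // One completion time stamp per free slot.
    private readonly Queue<DateTime> _completed = new Queue<DateTime>();

    public WindowThrottle(int maxCount, TimeSpan window)
    {
        _pool = new SemaphoreSlim(maxCount, maxCount);
        _window = window;
        // Seed with "long ago" so the first maxCount callers start immediately.
        for (int i = 0; i < maxCount; i++)
            _completed.Enqueue(DateTime.MinValue);
    }

    public void Run(Action action, CancellationToken token)
    {
        _pool.Wait(token);  // cap on concurrent workers
        try
        {
            DateTime oldest;
            lock (_gate) { oldest = _completed.Dequeue(); }

            // Wait until a full window has passed since that worker finished.
            TimeSpan remaining = oldest + _window - DateTime.UtcNow;
            if (remaining > TimeSpan.Zero && token.WaitHandle.WaitOne(remaining))
                token.ThrowIfCancellationRequested();

            action();
        }
        finally
        {
            // Stamp the completion time and free the slot, even if action threw.
            lock (_gate) { _completed.Enqueue(DateTime.UtcNow); }
            _pool.Release();
        }
    }

    public void Dispose()
    {
        _pool.Dispose();
    }
}
```

Because the time stamp is recorded in the finally block, the caller is free to post-process its results after Run returns without holding a slot, and the next caller is delayed only as long as the window requires.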

Don’t Forget the Transit Time

It’s important to explain why we need to track time stamps from the moment when each action completes. My initial implementation simply reset a lock pool after each time period had elapsed. That may work perfectly for some throttling scenarios, but for web APIs we have to remember that we’re trying to obey a limit that is enforced on a remote server. BestBuy, Twitter, or Amazon can’t see how many requests per second you send; they can only observe how many requests per second they receive from your application. The variable time it takes a request to arrive at the remote server can cause you to violate the limits if you only use the time when requests are sent. Here’s an example:

Time (sec)  Event
0.000       Requests 1-5 are sent to the server
0.700       The requests all arrive at the server
0.800       All requests return with data
1.000       1 second has elapsed so request 6 is sent to the server
1.100       Request 6 arrives much faster than the first 5 requests

The remote server sees 6 requests arrive between 0.700 and 1.100, a span of only 400 ms, which violates the limit of 5 per second.
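The same timeline can be checked with a few lines of code. The sketch below is only an illustration: the ArrivalWindow class and MaxInWindow helper are hypothetical names, and the numbers are the send and arrival times from the table above. It counts the largest number of events that fall inside any one-second window, first from the client’s point of view and then from the server’s.

```csharp
using System;
using System.Linq;

public static class ArrivalWindow
{
    // Largest number of events inside any half-open window [t, t + window).
    public static int MaxInWindow(double[] times, double window)
    {
        double[] sorted = times.OrderBy(t => t).ToArray();
        int best = 0, start = 0;
        for (int end = 0; end < sorted.Length; end++)
        {
            // Drop events that are a full window (or more) older than this one.
            while (sorted[end] - sorted[start] >= window) start++;
            best = Math.Max(best, end - start + 1);
        }
        return best;
    }

    public static void Main()
    {
        double[] sent    = { 0.0, 0.0, 0.0, 0.0, 0.0, 1.0 };  // when we sent each request
        double[] arrived = { 0.7, 0.7, 0.7, 0.7, 0.7, 1.1 };  // when the server received it

        Console.WriteLine("Client view: {0} per second", MaxInWindow(sent, 1.0));     // 5
        Console.WriteLine("Server view: {0} per second", MaxInWindow(arrived, 1.0));  // 6
    }
}
```

Measured by send time the client never exceeds 5 per second, yet the server counts 6, which is why the throttle has to be conservative about transit time.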

Using the TimeSpanSemaphore

Instead of exposing the Wait() and Release() methods publicly, our TimeSpanSemaphore class provides a Run method which accepts an Action delegate and supports cancellation tokens. We also ensure that the lock is released if an exception occurs. Here is our previous example using the TimeSpanSemaphore class instead of the standard semaphore:

using (TimeSpanSemaphore throttle = new TimeSpanSemaphore(5, TimeSpan.FromSeconds(1)))
{
    for (int i = 1; i <= 6; i++)
    {
        Thread t = new Thread(new ParameterizedThreadStart((taskId) =>
        {
            throttle.Run(
                () => DoWork((int)taskId),
                CancellationToken.None);
        }));
        t.Start(i);
    }
    Thread.Sleep(2000); // give all the threads a chance to finish
}

// Task 2 started 53.276, completed 53.576
// Task 1 started 53.276, completed 53.576
// Task 3 started 53.276, completed 53.576
// Task 4 started 53.278, completed 53.579
// Task 5 started 53.279, completed 53.579
// Task 6 started 54.598, completed 54.898

You can see that, like the first example, the first 5 requests all start and complete around the same time. However, task 6 waits a full second after the completion of the first task before starting. In theory we wouldn’t have to wait a full second if we knew how the request lifetime was spent (travel time to and from the server, plus server processing). However, we don’t know exactly when the remote server will decide to count the request. It’s safer to assume that the entire lifetime of the request was spent travelling to the server and that the next request might arrive instantly. This means you can’t use the API to max capacity, but it’s better to err on the side of caution than to exceed the limits. The effect will be larger if the concurrent worker count or time span is small. If you really need every last API call, you could adjust the count and/or time span to account for this delay.

One tip for those making many concurrent API calls: by default .NET only allows 2 concurrent connections per hostname. You can set the DefaultConnectionLimit of the ServicePointManager when you initialize your program (it only needs to be set once) or in the .config file.
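As a rough illustration of the programmatic option, the sketch below raises the limit once at startup. The ConnectionSetup class and the value 20 are assumptions chosen for the example, and it assumes the .NET Framework networking stack that this post targets.

```csharp
using System;
using System.Net;

public static class ConnectionSetup
{
    // Call once at startup, before the first request is issued.
    public static void Configure()
    {
        // 20 is an arbitrary illustrative value, not a figure from the post.
        ServicePointManager.DefaultConnectionLimit = 20;
    }
}
```

The limit should be at least as large as the number of concurrent workers your throttle allows; otherwise requests will queue behind the connection limit before the throttle ever comes into play.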

Show Me the Code

The source code for the TimeSpanSemaphore class is available in my GitHub repository JFLibrary. Over time I’m planning to add more utility code that I frequently use. I’d love to hear your feedback, bug reports or a different solution you’ve used to rate limit API calls.


15 Responses to Throttling Web API Calls

  1. Alex Tran says:

    Appreciate this post. Was looking for a better way to throttle outgoing API calls…

    This points me in the right direction.

  2. Jim says:

    I’m a little late to the party, but great article nonetheless!

  3. Pingback: Throttling web requests | Sagui Itay

  4. Chris P says:


    I am looking for similar solution but I think even more complicated as I posted here:

    Basically I am looking to control API calls which are executed by users against many different API providers at the same time. The queue needs to be shared among all users executing these calls in parallel. A user goes to a web page and executes a search which triggers multiple parallel API calls, and other users do the same.

    I was thinking that I need to use a message queuing service like RabbitMQ.. Any thoughts would be appreciated if you think there could be a simpler way that still is pretty fast (trying to avoid DB for writing queue information and store it in memory).


  5. Hoang says:

    I’m not sure how you get the TimeSpanSemaphore working because I tried your code and it threw an exception {“The semaphore has been disposed.”}

  6. Hoang says:

    OK, so the Using statement was causing the problem. Not sure why. Take the TimeSpanSemaphore out of Using statement and it works.

    • joelfillmore says:

      Oops, the call in the test loop to sleep for 2 seconds should have been inside the using statement. I fixed the test code above and also checked in the unit test.

      In actual use, you probably wouldn’t wrap the TimeSpanSemaphore in a using statement. I only did that for the test loop to make it obvious that it should be properly disposed once you are done with it.

      Underneath the covers, the TimeSpanSemaphore is using the framework’s SemaphoreSlim class, which is what is throwing the exception you saw. It’s likely they added the disposal check in the past few years since I originally wrote the post.

      Thanks for catching this!

      • Hoang says:

        Not sure I understand the need to sleep for 2 seconds and its implication.

        The other issue I saw is that if I increase the resetSpan to larger number, the same exception will happen.

        using (TimeSpanSemaphore throttle = new TimeSpanSemaphore(5, TimeSpan.FromSeconds(60)))

      • joelfillmore says:

        The sleep for 2 seconds is just to wait for the threads to complete before ending the test. In production scenarios the throttles will often be retained for long periods of time, so you wouldn’t use the sleep or the using statement. You just need to dispose the throttle once you are done with it and all workers have completed.

  7. Andy says:

    I think this might be a solution to my problems for 2 days now, but I’m not sure how to use it. Essentially I have to loop through an API (too many request), each api url will have a different ID then I save them in an object. For example, this is how my GetParticipanAsync(meetingList) behaves:
    foreach(var n in meetinglist)
    var res = client.GetAsyncString(“/api/meeting/n.ID/participants)
    var str = await res.ConfigureAwait();

    Do I call GetParticipanAsync like this?
    using (SemaphoreUtil.TimeSpanSemaphore throttle = new SemaphoreUtil.TimeSpanSemaphore(5, TimeSpan.FromSeconds(1)))
    for(int i=1;i
    throttle.Run(() => GetParticipantAsync(meetingList).Wait(), CancellationToken.None);



    Thank you Joel

    • joelfillmore says:

      Hey Andy, a lot has changed since I originally wrote the TimeSpanSemaphore class! With the Task Parallel Library and async / await semantics I probably would have done things differently. My use case was also a little different: I had a variable number of threads which could be trying to make API calls concurrently.

      You certainly could use the TimeSpanSemaphore for looping through a list of API calls, but I might be tempted to do something much simpler, like adding a call to “await Task.Delay(1000)” to wait 1 second at the end of each loop iteration in GetParticipantAsync(). That should get you unblocked, and then you could look at other solutions if you really needed to eke out every last potential API call.

      Hope this helps!

      • Andy says:

        I just realized that this was an old post. Thank you so much for this post got me started, you helped a lot of souls haha. I will try the suggested tweak. More power!

  8. Pingback: Throttling web requests - Sagui Itay - Unity Assets and software development
