stereopsis : How to do async loads

How to do Asynchronous Loads

Michael Herf
February 2010

If your application fetches data from a source that is slow, like a network or disk, you will likely make the operation asynchronous someday.

There are a lot of ways to do this, and of course some of them work better than others. This article is about the high-level choices you can make, and leaves all the low-level details and language-specifics (async disk and network reads, AJAX, threads and signaling, etc.) up to you.

What to do about multiple events?

Let's say your user likes to click, a lot.

"Find this data. Load that. View these 50 pictures." Your server is overloaded, or your laptop's disk is defragging, and your "async" code isn't coming close to keeping up with all these clicks.

Handling this case gracefully is a question you should know how to answer.

What you'll probably do first (and shouldn't keep doing)

Wait for user event
Spawn thread / make async network request
Wait for callback

Many years ago, I was amazed to see a version of Windows XP spawn a thread per thumbnail in the "thumbnail" Explorer view, and lock up my entire computer. The reason you don't want to scale the number of async requests with the number of "events" is that it is inefficient.

Users don't usually want all their data delivered exactly at the same time. Disks don't work so well retrieving 50 files at a time.

In a browser, you're naturally limited by the browser's "max connections per server" setting, but if you make 50 requests at once, you'll still end up queuing requests behind each other, overwhelming your server, and waiting a long time for the one result you actually wanted.

You can probably avoid a bunch of this work if you are a little smarter.

A quick hack you shouldn't do

If you want to fix this problem in a few seconds because your server is falling over, you'll usually do something like this:

While user is doing something, wait.
After user is idle for a second, fire last-seen request.

This adds a second of latency to your UI (for no good reason), makes it seem like your app is lazy, and the worst case (user clicks every 1.1 seconds) is still bad. My advice is: just don't mess with random timers. Only do this if it's the only thing you can push live in 2 minutes while your service is falling over.

The Limited Queue

A considerably better approach is to limit the number of outstanding requests to a fixed number, like 1 or 2. Here's how this works:

On user event -> push event into an n-element queue.
If the number of outstanding requests is <n, fire request immediately.
When an outstanding request works or fails, fire a pending queued request if one is waiting.
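
The three steps above can be sketched roughly like this. It is only an illustration: the names LimitedQueue, submit, and start are made up for this example, and dropping the oldest waiting item when the queue is full is an assumption (the steps don't say what to do with overflow).

```python
from collections import deque
from typing import Callable, Deque, Generic, TypeVar

T = TypeVar("T")


class LimitedQueue(Generic[T]):
    """At most `limit` requests in flight and at most `limit` waiting."""

    def __init__(self, limit: int,
                 start: Callable[[T, Callable[[], None]], None]) -> None:
        self.limit = limit
        # start(item, done) begins an async request and must call done()
        # exactly once, whether the request works or fails.
        self.start = start
        self.outstanding = 0
        # Assumption: when the waiting queue is full, the oldest waiting
        # item is silently dropped (deque with maxlen does this on append).
        self.pending: Deque[T] = deque(maxlen=limit)

    def submit(self, item: T) -> None:
        """Call this on every user event."""
        if self.outstanding < self.limit:
            self._fire(item)
        else:
            self.pending.append(item)

    def _fire(self, item: T) -> None:
        self.outstanding += 1
        self.start(item, self._done)

    def _done(self) -> None:
        self.outstanding -= 1
        if self.pending:
            self._fire(self.pending.popleft())
```

With limit set to 1 this behaves like the photo viewer case: one request in flight, and only the latest click waiting behind it.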

The "limited queue" approach makes your app respond nicely all the time. The worst case increases latency for a series of user events to 2x the latency for a single request, but that's all.

Depending on your UI, a 1-element queue (plus an outstanding request) may be all you need. Most photo viewers work well like this, for instance.

Some applications require that all events are actually processed, and for these you may find that a "request stack" is preferable to a queue, since the user will see the most recent content requested, first. I've used this for displaying lists of thumbnails that a user scrolls through very quickly.
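
A "request stack" differs from the queue only in which waiting item goes next. Here is a minimal sketch, again with hypothetical names (RequestStack, submit, start), and assuming nothing is ever dropped:

```python
from typing import Callable, List


class RequestStack:
    """Limits requests in flight; serves the newest waiting request first."""

    def __init__(self, limit: int,
                 start: Callable[[str, Callable[[], None]], None]) -> None:
        self.limit = limit
        # start(item, done) begins an async request and must call done()
        # once, on success or failure.
        self.start = start
        self.outstanding = 0
        self.pending: List[str] = []  # unbounded: every event gets processed

    def submit(self, item: str) -> None:
        if self.outstanding < self.limit:
            self._fire(item)
        else:
            self.pending.append(item)

    def _fire(self, item: str) -> None:
        self.outstanding += 1
        self.start(item, self._done)

    def _done(self) -> None:
        self.outstanding -= 1
        if self.pending:
            # pop() takes from the end of the list, so the most recently
            # requested item is loaded before older ones.
            self._fire(self.pending.pop())
```

For a thumbnail list, this means the thumbnails the user just scrolled to load before the ones that have already scrolled out of view.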

How do you do better than 2x latency? Cancel.

If you have tuned your queue properly (e.g., to use 'n' cores or 'n' connections or 'n' disk spindles) you still will have a worst-case latency of 2x single-request latency under certain input conditions, because you'll have 'n' requests waiting on 'n' resources.

A solution that works in threaded environments (but not as much in the AJAX world) is to allow an asynchronous request to be cancelled.

For instance, if you are loading images, you might interrupt a thread (politely) and ask it to stop working, freeing up CPU or disk for new work.

This approach is very important, and you should implement an interrupt/cancel feature for all your threaded workers. Measure how quickly this "cancel" request responds, and insist that the number is low.
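
Here is one way a polite cancel could look with a plain Python thread. This is a sketch under assumptions: CancellableLoad and its methods are invented for the example, and time.sleep stands in for one slice of real work (a partial read or decode). The worker checks a flag between slices, so the cancel latency should be roughly the length of one slice.

```python
import threading
import time


class CancellableLoad:
    """A worker that does its job in small slices and checks for cancel."""

    def __init__(self, chunks: int, chunk_seconds: float) -> None:
        self.chunks = chunks
        self.chunk_seconds = chunk_seconds
        self.chunks_done = 0
        self.cancelled = False
        self._cancel_requested = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self) -> None:
        self._thread.start()

    def _run(self) -> None:
        for _ in range(self.chunks):
            if self._cancel_requested.is_set():
                self.cancelled = True
                return
            # Stand-in for one bounded slice of real work.
            time.sleep(self.chunk_seconds)
            self.chunks_done += 1

    def cancel_and_wait(self) -> float:
        """Ask the worker to stop; return seconds until it actually did."""
        t0 = time.perf_counter()
        self._cancel_requested.set()
        self._thread.join()
        return time.perf_counter() - t0
```

The number returned by cancel_and_wait is the one to watch. If it creeps up, the slices of work between checks are too big.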

This kind of implementation is currently a bit more difficult with a remote server. You can use an AJAX abort() but it probably won't help your server too much, unless you know some magic way (which I don't).

How do you do better than 1x latency? Prefetch.

If you can predict what a user is likely to do next (e.g., hit the "next" button 5 times) you can probably fetch this data in advance.

Here's a nice way to do that:

Make a second "low priority" queue of items to prefetch. Make an interface to cancel these requests as well.
When your "limited queue" of high-priority items is empty, fill in with the low-priority queue items.
Make sure you know when a new request is already being prefetched, and merge the request with outstanding requests.
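
A rough sketch of the two-queue idea follows. All names here (PrefetchScheduler, request, prefetch, cancel_prefetches, start) are hypothetical. It also makes two simplifying assumptions: cancelling only removes prefetches that haven't started yet, and there is no cache of finished loads.

```python
from collections import deque
from typing import Callable, Deque, Set


class PrefetchScheduler:
    """High-priority requests go first; prefetches fill the idle slots."""

    def __init__(self, limit: int,
                 start: Callable[[str, Callable[[], None]], None]) -> None:
        self.limit = limit
        self.start = start  # start(key, done) must call done() once
        self.high: Deque[str] = deque()
        self.low: Deque[str] = deque()
        self.in_flight: Set[str] = set()

    def request(self, key: str) -> None:
        """The user is waiting for `key` right now."""
        if key in self.in_flight or key in self.high:
            return  # merge with the request that already exists
        if key in self.low:
            self.low.remove(key)  # promote a queued prefetch
        self.high.append(key)
        self._pump()

    def prefetch(self, key: str) -> None:
        """The user will probably want `key` soon."""
        if key in self.in_flight or key in self.high or key in self.low:
            return
        self.low.append(key)
        self._pump()

    def cancel_prefetches(self) -> None:
        # Assumption: only prefetches that have not started are dropped.
        self.low.clear()

    def _pump(self) -> None:
        while len(self.in_flight) < self.limit and (self.high or self.low):
            key = self.high.popleft() if self.high else self.low.popleft()
            self.in_flight.add(key)
            self.start(key, lambda key=key: self._done(key))

    def _done(self, key: str) -> None:
        self.in_flight.discard(key)
        self._pump()
```

When the user hits "next", call request for the new item and prefetch for the few after it. A request for something already being prefetched simply waits for that load instead of starting a second one.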

If you're on a desktop machine, you can also tweak thread priorities: lower priority when prefetching, higher priority when a user is waiting for data.

Fast async jobs

Some developers avoid async work because some operations are already very fast. What if a job usually takes a very tiny amount of time, like 10ms? But every once in a while takes 1000ms?

A good approach is to set a deadline for your "UI thread", like this:

Fire all "fast async" requests
Wait 20ms, see what's done
If stuff is done, display it!
Otherwise display a spinner or loading message.
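
The deadline steps above map fairly directly onto Python's concurrent.futures. This is a minimal sketch: load_with_deadline, the jobs dictionary, and the "loading..." placeholder are all made up for the example, and a job that raises will re-raise from result().

```python
from concurrent.futures import Future, ThreadPoolExecutor, wait
from typing import Callable, Dict, Tuple


def load_with_deadline(
    executor: ThreadPoolExecutor,
    jobs: Dict[str, Callable[[], str]],
    deadline_seconds: float = 0.02,
) -> Tuple[Dict[str, str], Dict[str, Future]]:
    """Fire every job, wait up to the deadline, report what is ready."""
    # Fire all "fast async" requests at once.
    futures = {name: executor.submit(fn) for name, fn in jobs.items()}

    # Block the caller (the "UI thread") for at most the deadline.
    done, _not_done = wait(list(futures.values()), timeout=deadline_seconds)

    view: Dict[str, str] = {}
    for name, fut in futures.items():
        if fut in done:
            view[name] = fut.result()   # finished in time: show it
        else:
            view[name] = "loading..."   # still running: show a spinner
    # The futures are returned so the caller can update the view when the
    # slow jobs eventually finish (e.g. with add_done_callback).
    return view, futures
```

The caller draws whatever is in view right away, then replaces each "loading..." entry as its future completes.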

This "timeout" keeps you from drawing a spinner/etc. for one frame, then immediately updating with loaded data, and it also keeps you from blocking a UI on an operation that takes a long time every so often. You can make up other rules for your application, but an approach like this gives you the best of the "async" and "blocking" worlds without blocking too long in the worst case.