Serving GZip Compressed Content from the Azure CDN


GZip Compression

There are two big reasons why you should compress your content: time & money. Less data transferred over the wire means your site will be faster for users and bandwidth costs will be lower for you. Compressing the content will require some CPU, but web servers are often IO-bound and have cycles to spare. Plus, modern web servers do a good job of caching compressed content so even this cost is minimal. It’s almost always worth it to enable compression on your web server.

IIS supports compression of both dynamic content (generated on-the-fly) and static content (files on disk). Both types of compression can yield big improvements, but this post focuses on static content, specifically style sheets (.css) and JavaScript files (.js). There are of course many other kinds of static content, such as images and videos, but compression is already built into most of those file formats, so there is little benefit in having the web server try to compress them further.

Content Delivery Networks (CDNs)

Compression reduces the total amount of data your server needs to send to a user, but it doesn’t reduce latency. Users who live far from your server will see longer response times because the data has to travel through more network hops. CDNs reduce latency by dispersing copies of the data throughout the world so it’s closer to end users. Luckily there is a wide selection of CDNs that are easy to use and don’t require any long-term commitment. At FilterPlay we use the Azure CDN since it’s integrated with Azure Blob storage, which we use extensively.

Compression and the Azure CDN

Unfortunately many CDNs do not support automatic gzip compression of content. This includes popular CDNs such as Amazon CloudFront as well as Windows Azure. We can work around this limitation by storing a secondary copy of the content that has been gzipped. It’s important to also keep the uncompressed version of the file, because a small number of users have an old browser or anti-virus software that doesn’t support compression, and others are behind a proxy server that strips the Accept-Encoding header.

Here is the code to create a gzip copy for every css and js file inside an Azure blob container:

/// <summary>
///   Finds all js and css files in a container and creates a gzip compressed
///   copy of the file with ".gzip" appended to the existing blob name
/// </summary>
public static void EnsureGzipFiles(
    CloudBlobContainer container,
    int cacheControlMaxAgeSeconds)
{
    string cacheControlHeader = "public, max-age=" + cacheControlMaxAgeSeconds.ToString();

    var blobInfos = container.ListBlobs(
        new BlobRequestOptions() { UseFlatBlobListing = true });
    Parallel.ForEach(blobInfos, (blobInfo) =>
    {
        string blobUrl = blobInfo.Uri.ToString();
        CloudBlob blob = container.GetBlobReference(blobUrl);

        // only create gzip copies for css and js files
        string extension = Path.GetExtension(blobInfo.Uri.LocalPath);
        if (extension != ".css" && extension != ".js")
            return;

        // see if the gzip version already exists
        string gzipUrl = blobUrl + ".gzip";
        CloudBlob gzipBlob = container.GetBlobReference(gzipUrl);
        if (gzipBlob.Exists())
            return;

        // create a gzip version of the file
        using (MemoryStream memoryStream = new MemoryStream())
        {
            // push the original blob into the gzip stream
            using (GZipStream gzipStream = new GZipStream(memoryStream, CompressionMode.Compress))
            using (BlobStream blobStream = blob.OpenRead())
            {
                blobStream.CopyTo(gzipStream);
            }

            // the GZipStream must be disposed to flush its final compressed block;
            // MemoryStream.ToArray() still works even after the stream is closed
            byte[] compressedBytes = memoryStream.ToArray();

            // upload the compressed bytes to the new blob
            gzipBlob.UploadByteArray(compressedBytes);

            // set the blob headers
            gzipBlob.Properties.CacheControl = cacheControlHeader;
            gzipBlob.Properties.ContentType = GetContentType(extension);
            gzipBlob.Properties.ContentEncoding = "gzip";
            gzipBlob.SetProperties();
        }
    });
}
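
As a point of reference, calling this from a deployment step might look something like the sketch below. This is written against the same v1.x StorageClient API as the snippet above; the connection string and container name are placeholders, and I’m assuming EnsureGzipFiles lives in the CloudBlobUtility class mentioned below.

// v1.x SDK sketch (Microsoft.WindowsAzure / Microsoft.WindowsAzure.StorageClient);
// the connection string and container name are placeholders
CloudStorageAccount account = CloudStorageAccount.Parse(
    "DefaultEndpointsProtocol=https;AccountName=...;AccountKey=...");
CloudBlobClient client = account.CreateCloudBlobClient();
CloudBlobContainer container = client.GetContainerReference("cdn");

// cache at the edge (and in browsers) for one year
CloudBlobUtility.EnsureGzipFiles(container, 365 * 24 * 60 * 60);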

Get the full source code for the CloudBlobUtility class, which includes the utility methods referenced in the snippet above.
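
If you just want the gist, here is a minimal sketch of what the two helpers used above, Exists() and GetContentType(), might look like against the same v1.x StorageClient API; the real implementations in CloudBlobUtility may differ.

// Exists() is not built into the v1.x CloudBlob API; a common pattern is to
// probe the blob's attributes and treat "not found" as false.
// (These would live in a static class so Exists can be an extension method.)
public static bool Exists(this CloudBlob blob)
{
    try
    {
        blob.FetchAttributes();
        return true;
    }
    catch (StorageClientException e)
    {
        if (e.ErrorCode == StorageErrorCode.ResourceNotFound)
            return false;
        throw;
    }
}

// Map the file extension to the MIME type stored on the compressed copy.
public static string GetContentType(string extension)
{
    switch (extension)
    {
        case ".css":
            return "text/css";
        case ".js":
            return "application/x-javascript";
        default:
            return "application/octet-stream";
    }
}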

You may be wondering why we didn’t use the standard .gz extension for our gzipped copies. We need to use .gzip because Safari doesn’t correctly handle files with the .gz extension. Yes, it’s strange.

In addition to telling the client how long it should cache the data, the max-age in the Cache-Control header also determines how long a CDN edge node will cache the data. I’d recommend using a large expiration and assuming that once released, the data will never change. Include a version in the filename and rev your URLs when you need to make an update.
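
A minimal sketch of that filename versioning (the helper name and version format here are illustrative, not part of the post’s code):

// Hypothetical helper: bake a version into the blob name so each release
// produces a brand-new URL and old copies can stay cached at the edge forever.
public static string VersionedFileName(string fileName, string version)
{
    string extension = Path.GetExtension(fileName);                  // ".js"
    string baseName = Path.GetFileNameWithoutExtension(fileName);    // "site"
    return baseName + "-" + version + extension;                     // "site-1.0.42.js"
}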

The max-age must fit in a 32-bit integer because that’s the largest value IE supported before version 9. Just wanted to mention that in case you were considering changing the cacheControlMaxAgeSeconds parameter type. I’ve seen cases in the wild where the max-age was even larger, probably because a developer assumed that bigger was better. Don’t worry, the max value of a 32-bit int means your content won’t expire for about 68 years.
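
For the curious, the arithmetic behind that 68-year figure:

// int.MaxValue seconds divided by the seconds in a (non-leap) year
double years = (double)int.MaxValue / (60 * 60 * 24 * 365);   // ≈ 68.1 years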

If you are using jQuery, YUI, or another popular JavaScript library, don’t serve it from your own CDN account. Instead use the copy that Google, Microsoft or Yahoo provides. Your users probably already have a copy in their cache.

Detecting GZip Support

Once you have both uncompressed and gzipped versions of your content in the cloud, you need to modify your pages to vary the URL based on whether the client supports gzip compression. Keep in mind that if you are using output caching, you’ll need to vary the cache by the Accept-Encoding header (a sketch of that follows the helper below). The following helper method checks the current request’s headers and appends the .gzip extension if compression is supported:

/// <summary>
///   Appends .gzip to the end of the url if the current request supports gzip
/// </summary>
/// <example>
///   Asp.net Razor syntax: @Cdn.GZipAwareUrl("http://cdn.domain.com/script.js")
/// </example>
public static string GZipAwareUrl(string url)
{
    HttpContext context = HttpContext.Current;
    if (context != null)
    {
        HttpRequest request = context.Request;
        if (request != null)
        {
            string encoding = request.Headers["Accept-Encoding"];
            if (encoding != null && encoding.Contains("gzip"))
            {
                return url + ".gzip";
            }
        }
    }

    return url;
}
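
On the output caching point above: if the page or action that renders these URLs is output-cached, the cached HTML must vary by the Accept-Encoding header, or clients that can’t handle gzip could be handed .gzip URLs. A minimal sketch using the ASP.NET MVC OutputCache attribute (the action and duration are placeholders):

// Hypothetical MVC action that renders a view containing GZipAwareUrl() links.
// VaryByHeader gives gzip-capable and non-gzip clients separate cache entries.
[OutputCache(Duration = 3600, VaryByParam = "none", VaryByHeader = "Accept-Encoding")]
public ActionResult Index()
{
    return View();
}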

11 Responses to Serving GZip Compressed Content from the Azure CDN

  1. Nariman says:

    Nicely done! Just wondering when you’re executing the above routine? We’re appending incremental build #s to each CDN request, meaning storage isn’t populated for a new role until the first requests come in.

  2. joelfillmore says:

    @Nariman – Before publishing my Azure web role, I run a deployment script which appends a version number to js and css files, uploads them to blob storage, and also calls EnsureGzipFiles() to create gzip versions.

    It sounds like you are using the new Azure feature where a CDN can pull content from a hosted web role instead of blob storage? That feature was released around the time I wrote this post. According to this thread on the Azure forums (http://social.msdn.microsoft.com/Forums/en/windowsazuredata/thread/8d03db3a-4649-42fd-8d88-c80360f06b1f), the cache for hosted services varies by encoding, so you shouldn’t have to do anything to get gzip served from the CDN! I’ll probably switch to that method at some point to avoid the extra deployment step.

    You can verify it works by looking at the Content-Encoding header in the response using the network view in your browser dev tools or with Fiddler (http://www.fiddler2.com/fiddler2/).

  3. Mad Pierre says:

    Hi

    I’m just trying to implement your code and get an error: “The specified resource does not exist.” at the gzipBlob.UploadByteArray(compressedBytes); line.

    Any ideas?

    • Mad Pierre says:

      Ignore that! I’ve worked it out!

      It was because I was connecting anonymously – which is fine for reading but not writing.

      As usual in Azure the error message does not even vaguely point you in the right direction!

  4. Pingback: Alexey Bokov’s weblog » Blog Archive » Windows Azure CDN

  5. Matt says:

    Seems pretty silly that you have to do this. I wonder if any CDNs now support gzip out of the box?

  6. Pingback: Why and How We Migrated babylon.js to Azure: CORS, gzip, and IndexedDB
