Varnish reordering query string

(Update: now listed as a module on the official Varnish site)
(Update: this code is being used in production without any problems in several companies I worked for)

In Varnish the URL is the key to the caching. If it recognises a previously requested URL it will look if it’s available in its cache and deliver this back.  There is a small problem with URLs which have parameters. Take a look at the following queries:

http://localhost/test?ddd=444&bbb=222&ccc=333&aaa=111
http://localhost/test?ccc=333&aaa=111&ddd=444&bbb=222
http://localhost/test?bbb=222&ccc=333&ddd=444&aaa=111

Each of them will return the same result, the parameters are the same, only the order is different.  Varnish treats each of them as a different query and will, in this case, do three separate requests to the backend and cache all of them.

To deal with this I’ve written a small bit of C code that can be embedded in the varnish configuration file which will order the parameters so URLs with unordered parameters will become the ordered and therefor have an equal cache key.

To follow our example the three URLs all get ordered like this:

http://localhost/test?aaa=111&bbb=222&ccc=333&ddd=444

How it works:
I tokenise the url, put the parameters in a binary tree and do an in order traversal to get them out again. Performance of this method is on average O(n log(n)).

Code can be found here: https://github.com/cyberroadie/varnish-urlsort
It has a FreeBSD license so please feel free to use it. One warning though: I haven’t used it in a live environment, so do take care!!

Olivier

Update: As suggested on the Varnish mailing list it now also compiles as a Varnish module and can be used like this (once installed):

import urlsort;
sub vcl_recv {
  set req.url = urlsort.sortquery(req.url);
}

Update:

Checked memory usage with valgrind, no leaks 🙂

valgrind -v --dsymutil=yes ./urlsort "http://localhost/test?ddd=444&bbb=222&ccc=333&aaa=111"
==66758== 
==66758== HEAP SUMMARY:
==66758== in use at exit: 6,241 bytes in 33 blocks
==66758== total heap usage: 39 allocs, 6 frees, 6,445 bytes allocated
==66758== 
==66758== Searching for pointers to 33 not-freed blocks
==66758== Checked 488,280 bytes
==66758== 
==66758== LEAK SUMMARY:
==66758== definitely lost: 0 bytes in 0 blocks
==66758== indirectly lost: 0 bytes in 0 blocks
==66758== possibly lost: 0 bytes in 0 blocks
==66758== still reachable: 6,241 bytes in 33 blocks
==66758== suppressed: 0 bytes in 0 blocks
==66758== Rerun with --leak-check=full to see details of leaked memory
==66758== 
==66758== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 1 from 1)
--66758-- 
--66758-- used_suppression: 1 OSX107:blah
==66758== 
==66758== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 1 from 1)
Advertisements

One thought on “Varnish reordering query string

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s