Learning from Twitter

An issue popped up on Twitter this past week that made the site generally unusable for many users. It appears that attempts to scroll were unbearably slow and left the site unresponsive.

The Twitter team investigated and determined that if they reverted the version of jQuery they used from 1.4.4 back to 1.4.2, the site would be responsive again. After more investigation they narrowed the slow code down to a contextual selector search for an item by class name, for example: $something.find(".class").

So – what happened? How did this come about? To start, nothing is inherently wrong with jQuery 1.4.4 – this particular performance regression came in jQuery 1.4.3. In 1.4.3 we switched from using the old Sizzle selector engine for contextual queries to using the browser’s native querySelectorAll method, if it exists. This change was even explicitly mentioned and highlighted in the 1.4.3 release notes as it’s a really good change. In general using querySelectorAll will result in much faster queries, especially for complicated queries and complicated documents (which there seem to be a lot of).

However, as with every performance change, while some things get way faster, other things can get slower. This was the case for some previously-optimized queries like .find(".class") (where we use getElementsByClassName, if it exists) and .find("div") (where we use getElementsByTagName). Both of those methods always end up being faster than the equivalent queries run through querySelectorAll. Whether that will always be the case is another question entirely.
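To make the difference concrete, here's a rough illustration of the two native lookups being compared (this is not jQuery's internal code; the element and class name are made up for the example):

// Hypothetical context element and class name, for illustration only.
var context = document.getElementById( "timeline" );

// The previously-optimized path: a live collection from getElementsByClassName.
var byClassName = context.getElementsByClassName( "tweet" );

// The path used for .find() as of 1.4.3: a static NodeList from the Selectors API.
var byQSA = context.querySelectorAll( ".tweet" );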

What's interesting here is that we've been using querySelectorAll as our default selector engine in jQuery for quite some time now (doing $(".class") would use querySelectorAll). The change in 1.4.3 simply filled in a gap where .find(".class") wasn't using querySelectorAll. We've not heard of any particular performance regressions regarding the use of querySelectorAll with $(".class").

This brings up an important point: just how much faster is getElementsByClassName compared to querySelectorAll? In our preliminary tests it looks like it's about 0.5-2x faster, depending upon the browser. While that is certainly nothing to scoff at, the performance hit of the difference is quite negligible. For example, the difference between searching by class name and querying via querySelectorAll in Firefox 3.6 is about 0.007s – certainly nothing capable of crippling a large application.
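If you want to get a rough feel for the numbers yourself, a quick benchmark sketch along these lines will do (the class name is a placeholder and the results will vary by browser and document):

// Quick-and-dirty timing sketch - results vary by browser and page.
var i, start;

start = new Date();
for ( i = 0; i < 1000; i++ ) {
    document.getElementsByClassName( "tweet" );
}
console.log( "getElementsByClassName:", new Date() - start, "ms" );

start = new Date();
for ( i = 0; i < 1000; i++ ) {
    document.querySelectorAll( ".tweet" );
}
console.log( "querySelectorAll:", new Date() - start, "ms" );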

That being said, we don't like performance regressions, so today we backported some shortcuts into Sizzle (from jQuery) to improve its performance for some common cases. For example: Sizzle("div"), Sizzle(".foo"), and Sizzle("#id") will all skip querySelectorAll and use the native methods provided by the browser, if they exist. (jQuery already had some of these shortcuts, namely "div" and "#id"; we've now added the ".foo" shortcut as well.)
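The shortcuts amount to a quick test of the selector string before falling through to querySelectorAll. A simplified sketch of the idea (not the actual Sizzle source) looks something like this:

// Simplified sketch of the shortcut idea - not the actual Sizzle code.
function quickQuery( selector, context ) {
    context = context || document;

    // "#id" - a lone ID selector
    if ( /^#[\w-]+$/.test( selector ) ) {
        var elem = document.getElementById( selector.slice( 1 ) );
        return elem ? [ elem ] : [];
    }

    // ".foo" - a lone class selector
    if ( /^\.[\w-]+$/.test( selector ) && context.getElementsByClassName ) {
        return context.getElementsByClassName( selector.slice( 1 ) );
    }

    // "div" - a lone tag selector
    if ( /^\w+$/.test( selector ) ) {
        return context.getElementsByTagName( selector );
    }

    // Everything else goes through the Selectors API
    return context.querySelectorAll( selector );
}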

So, if the performance hit wasn't very large, why was Twitter having so many problems? The reality is that this particular change was just the straw that broke the camel's back. Two things were happening that caused Twitter the issues it was seeing, and they can be distilled down into two general best practices:

Best Practices

It's a very, very bad idea to attach handlers to the window scroll event. Depending upon the browser, the scroll event can fire a lot, and putting code in the scroll callback will slow down any attempt to scroll the page (not a good idea). Any slowdown in the scroll handler(s) only compounds the degradation of scrolling overall. Instead it's much better to use some form of a timer that checks every X milliseconds, or to attach a scroll handler and only run your code after a delay (or even after a given number of executions – and then a delay).
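For instance, the "only run your code after a delay" variant can be done by resetting a timeout from inside the handler (a generic sketch; the 250ms delay is arbitrary):

// Run the real work only once scrolling has paused for ~250ms.
var scrollTimer;

$(window).scroll(function() {
    clearTimeout( scrollTimer );
    scrollTimer = setTimeout(function() {
        // Check your page position and then
        // load in more results
    }, 250);
});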

Always cache the selector queries that you're re-using. It's not clear why Twitter decided to re-query the DOM every single time the scroll event fired, but this does not appear to be necessary (since scrolling itself didn't change the DOM). They could've saved the result of that single query to a variable and looked it up whenever they wanted to re-use it. This would've resulted in absolutely no querying overhead (which is even better than having the faster getElementsByClassName code!).
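For illustration only (this is not Twitter's actual code), the problematic pattern boils down to a fresh DOM query inside every scroll event; the cached version of the query appears in the combined example below:

// Hypothetical anti-pattern: re-querying the DOM on every scroll event.
$(window).scroll(function() {
    $details.find(".details-pane-outer").each(function() {
        // Check positions, load in more results, etc.
    });
});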

Thus, combining these two techniques, the resulting code would look something like:

var outerPane = $details.find(".details-pane-outer"),
    didScroll = false;

$(window).scroll(function() {
    didScroll = true;
});

setInterval(function() {
    if ( didScroll ) {
        didScroll = false;
        // Check your page position and then
        // Load in more results
    }
}, 250);

Hope this helps to clear things up and provides some good advice for future developers of infinitely-scrolling pages!

Posted: January 20th, 2011

