getElementsByClassName Speed Comparison

Mark Finkle suggested that I do some speed testing, now that a native implementation of getElementsByClassName has landed in the Mozilla trunk (destined for Firefox 3).

So I dug up all of the existing implementations that I could find. Currently, implementations fall into one of three categories (with some straddling more than one):

  • Pure DOM
    This usually involves a call to .getElementsByTagName("*") and traversing through all matched elements, analyzing each element's className attribute along the way. Generally, the fastest method is to use a pre-compiled RegExp to test the value of the className attribute.
  • DOM Tree Walker
    A less-popular means of traversing DOM documents by setting some simple parameters, as specified by the DOM Level 2 spec. For example, you could traverse through all text nodes in a document (something that you can't easily do any other way).
  • XPath
    The most recent technique to be popularized is the use of XPath to find elements by class name. The implementation is generally simple: build a single expression and let the XPath engine traverse the document, finding all the relevant elements.

I've chosen some implementations that were representative of each of these techniques.
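
As a rough sketch of the pre-compiled RegExp test mentioned under Pure DOM: the helper name makeClassTester is made up for illustration, and it works on plain className strings (rather than real elements) so that it can run outside a browser. It also escapes RegExp metacharacters, which is an addition of mine and not something every implementation below does.

```javascript
// Hypothetical helper: build a reusable tester for one class name.
function makeClassTester(className) {
  // Escape RegExp metacharacters so a name like "a.b" is matched literally.
  var escaped = className.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  // Compile once; the same pattern is reused for every element checked.
  var pattern = new RegExp("(^|\\s)" + escaped + "(\\s|$)");
  return function (value) {
    return pattern.test(value);
  };
}

var isFoo = makeClassTester("foo");
isFoo("foo bar"); // true: "foo" is one of the space-separated names
isFoo("foobar");  // false: "foo" only appears as part of a longer name
```

In a real page you would call the tester with each element's className while looping over getElementsByTagName("*").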

Tree Walker

An implementation using the DOM Level 2 Tree Walker methods. Builds a generic filter function and traverses through all elements.

document.getElementsByClass = function(needle) {
  function acceptNode(node) {
    if (node.hasAttribute("class")) {
      var c = " " + node.className + " ";
      if (c.indexOf(" " + needle + " ") != -1)
        return NodeFilter.FILTER_ACCEPT;
    }
    return NodeFilter.FILTER_SKIP;
  }
  var treeWalker = document.createTreeWalker(document.documentElement,
      NodeFilter.SHOW_ELEMENT, acceptNode, true);
  var outArray = new Array();
  if (treeWalker) {
    var node = treeWalker.nextNode();
    while (node) {
      outArray.push(node);
      node = treeWalker.nextNode();
    }
  }
  return outArray;
}

The Ultimate getElementsByClassName

Uses a pure DOM implementation and tries to make some optimizations for Internet Explorer.

function getElementsByClassName(oElm, strTagName, strClassName){
    var arrElements = (strTagName == "*" && oElm.all)? oElm.all :
        oElm.getElementsByTagName(strTagName);
    var arrReturnElements = new Array();
    strClassName = strClassName.replace(/\-/g, "\\-");
    var oRegExp = new RegExp("(^|\\s)" + strClassName + "(\\s|$)");
    var oElement;
    for(var i=0; i<arrElements.length; i++){
        oElement = arrElements[i];
        if(oRegExp.test(oElement.className)){
            arrReturnElements.push(oElement);
        }
    }
    return arrReturnElements;
}

Dustin Diaz's getElementsByClass

A pure DOM implementation that caches the RegExp and is generally quite simple and easy to use.

function getElementsByClass(searchClass,node,tag) {
        var classElements = new Array();
        if ( node == null )
                node = document;
        if ( tag == null )
                tag = '*';
        var els = node.getElementsByTagName(tag);
        var elsLen = els.length;
        var pattern = new RegExp("(^|\\s)"+searchClass+"(\\s|$)");
        for (var i = 0, j = 0; i < elsLen; i++) {
                if ( pattern.test(els[i].className) ) {
                        classElements[j] = els[i];
                        j++;
                }
        }
        return classElements;
}

Prototype 1.5.0 (XPath)

Mixes XPath and DOM implementations, using XPath wherever possible.

document.getElementsByClassName = function(className, parentElement) {
  if (Prototype.BrowserFeatures.XPath) {
    var q = ".//*[contains(concat(' ', @class, ' '), ' " + className + " ')]";
    return document._getElementsByXPath(q, parentElement);
  } else {
    var children = ($(parentElement) || document.body).getElementsByTagName('*');
    var elements = [], child;
    for (var i = 0, length = children.length; i < length; i++) {
      child = children[i];
      if (Element.hasClassName(child, className))
        elements.push(Element.extend(child));
    }
    return elements;
  }
};

Native, Firefox 3

A native implementation, written in C++. It is part of the current CVS version of Firefox and will be included in Firefox 3.

document.getElementsByClassName

The Speed Results

For the speed tests I copied the Yahoo homepage into a single HTML file and used that as the test bed. It makes good use of class names (both single and multiple) and is a considerably large file with lots of elements to consider.

You can find the test files, for each of the implementations, here:
http://ejohn.org/apps/classname/

Note: "XPath" is just Prototype's implementation.

From these figures we can see that the native implementation of getElementsByClassName, in Firefox 3, is a full 8x faster than the XPath implementation. Additionally, it's a stunning 77x faster than the fastest DOM implementation.

Note: These numbers have been revised from what was originally posted as the lazy-loading nature of document.getElementsByClassName wasn't taken into account. The resulting arrays are completely looped-through now, making sure that all elements are accounted for.
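
To illustrate what "completely looped-through" means, here is a minimal timing sketch. The function name timeLookup and the stand-in lookup are made up for this example and are not the code in the test files; the point is only that every result is touched, so a lazily built list has to be fully evaluated before the clock stops.

```javascript
// Hypothetical harness: time a lookup function, touching every result.
function timeLookup(lookup, runs) {
  var start = Date.now();
  var touched = 0;
  for (var r = 0; r < runs; r++) {
    var results = lookup();
    // Walk the whole collection so a lazily built list is fully evaluated.
    for (var i = 0; i < results.length; i++) {
      if (results[i]) touched++;
    }
  }
  return { ms: Date.now() - start, touched: touched };
}

// Stand-in for a real lookup such as document.getElementsByClassName("foo").
var sample = timeLookup(function () { return ["a", "b", "c"]; }, 1000);
// sample.touched is 3000; sample.ms depends on the machine.
```

Without the inner loop, an implementation that defers its work until the results are read would look faster than it really is.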

Currently, Prototype has the best general-use implementation: use XPath wherever possible, and fall back to a fast DOM traversal otherwise.

Interestingly, only Prototype actually tries to implement the document.getElementsByClassName interface (all others use one-off names). However, Prototype doesn't check to see if the document.getElementsByClassName property already exists, and completely overwrites the incredibly fast native implementation that Firefox 3 provides (oops!).
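
The missing check is small. A minimal sketch, assuming a hypothetical installer function (this is not Prototype's code) and using plain objects in place of document so it can run outside a browser:

```javascript
// Hypothetical installer: only add the fallback when nothing is there yet.
function installGetElementsByClassName(doc, fallback) {
  if (!doc.getElementsByClassName) {
    doc.getElementsByClassName = fallback;
  }
  return doc.getElementsByClassName;
}

// Plain objects stand in for `document` in two different browsers.
var nativeImpl = function () { return "native"; };
var scriptImpl = function () { return "script"; };

var newerDoc = { getElementsByClassName: nativeImpl };
var olderDoc = {};

installGetElementsByClassName(newerDoc, scriptImpl); // keeps nativeImpl
installGetElementsByClassName(olderDoc, scriptImpl); // installs scriptImpl
```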

In all, the results are quite astounding. The native implementation is much faster than anything I could've imagined; it completely decimates all the other pieces of code. I can't wait until this hits the general public - users will absolutely feel a significant increase in speed.

Tags: javascript, speed, firefox3, firefox, whatwg, xpath, dom

XPath and CSS Selectors

Lately, I've been doing a lot of work building a parser for both XPath and CSS 3 - and I was amazed at just how similar they are in some respects, but wholly different in others. For example, CSS is completely tuned to be used with HTML, with the use of #id (to get something by ID) and .class (to get something by its class). On the other hand, XPath has the ability to traverse back up the DOM tree with .. and to test for existence with foo[bar] (foo has a bar element child). The biggest thing to realize is that CSS Selectors are, typically, very short - but woefully underpowered when compared to XPath.

I thought it would be worthwhile to do a side-by-side comparison of the different syntaxes of the two selectors.

Goal                      CSS 3                XPath
All Elements              *                    //*
All P Elements            p                    //p
All Child Elements        p > *                //p/*
Element By ID             #foo                 //*[@id='foo']
Element By Class          .foo                 //*[contains(@class,'foo')] 1
Element With Attribute    *[title]             //*[@title]
First Child of All P      p > *:first-child    //p/*[1]
All P with an A child     Not possible         //p[a]
Next Element              p + *                //p/following-sibling::*[1]

Syntactically, I was surprised how similar the two selectors were, in some cases - especially between the '>' and '/' tokens. While they don't always mean the same thing (depending on what axis you're using in XPath), they're generally assumed to mean the child element of the parent. Also, the ' ' (space) and '//' both mean 'all descendants of the current element'. Finally, the '*' means 'all elements', regardless of their name, in both.
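
To make the token mapping concrete, here is a toy translator. It is only a sketch and not the parser I've been building: the name cssToXPath is made up, it assumes every token (including '>') is separated by whitespace, and it only understands tag names, '*', and #id.

```javascript
// Hypothetical toy translator for a tiny subset of CSS selectors.
function cssToXPath(selector) {
  var tokens = selector.replace(/^\s+|\s+$/g, "").split(/\s+/);
  var xpath = "";
  var axis = "//"; // the first step searches the whole document
  for (var i = 0; i < tokens.length; i++) {
    var token = tokens[i];
    if (token === ">") {
      axis = "/"; // '>' maps to the child step '/'
      continue;
    }
    var step;
    if (token.charAt(0) === "#") {
      step = "*[@id='" + token.slice(1) + "']"; // #foo maps to an id test
    } else {
      step = token; // a tag name or '*'
    }
    xpath += axis + step;
    axis = "//"; // a plain space maps to the descendant step '//'
  }
  return xpath;
}

cssToXPath("p > *"); // "//p/*"
cssToXPath("div p"); // "//div//p"
cssToXPath("#foo");  // "//*[@id='foo']"
```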

Even though I already knew all of this ahead of time - it's certainly nice being able to rediscover the similarities when it comes down to actually having to program an implementation of them.



1 This isn't right due to the fact that it would match 'foobar' and 'foo bar', when only the second pattern is correct. The actual syntax would be far more complex and would probably require multiple expressions to get the job done.
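
One commonly used workaround pads the class attribute with spaces before testing it, which is the shape of the query in the Prototype 1.5.0 code. The sketch below builds that expression and uses two plain JavaScript functions as stand-ins for the two XPath predicates, so the difference can be seen without an XPath engine. The function names are made up, and the sketch assumes class names are separated by single spaces and contain no apostrophes.

```javascript
// Builds the space-padded expression (same shape as the Prototype 1.5.0 query).
function classToXPath(className) {
  return "//*[contains(concat(' ', @class, ' '), ' " + className + " ')]";
}

// JavaScript stand-in for contains(@class,'foo').
function naiveContains(classAttr, className) {
  return classAttr.indexOf(className) !== -1;
}

// JavaScript stand-in for the space-padded test.
function paddedContains(classAttr, className) {
  return (" " + classAttr + " ").indexOf(" " + className + " ") !== -1;
}

naiveContains("foobar", "foo");   // true: the false positive described above
paddedContains("foobar", "foo");  // false
paddedContains("foo bar", "foo"); // true
```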

Tags: xml, xpath, css, dom, html
