Selectors that People Actually Use


This post has been a long time coming. It’s a combination of my distrust for JavaScript CSS selector performance analysis and my disdain for the CSS 3 Selector specification.

To start, I want to give a little bit of history regarding jQuery’s selector engine. I first started working on its implementation in mid-2005. It was mostly done as a personal challenge to myself – implementing a specification for kicks. You can see some of my early thoughts in a post that I wrote on Selectors in JavaScript. I completed my implementation on the same day as the first other JavaScript CSS selector engine: cssQuery. I then held off and merged it with some of my other efforts, which eventually resulted in jQuery. When I first implemented the engine, I went for full CSS 3 compliance (mashing in XPath-capable queries, as well).

Fast-forward 7 months and jQuery is starting to get a capable community. In preparation for jQuery 1.0 I decided to analyze the features of the engine to see what people were actually using. I ran a poll (which, unfortunately, has been lost) in which I asked the users which selectors they used. As it turned out, there were a great number that no one had any use for whatsoever (and, in fact, this remains true to this day). At this point I removed them from the engine, breaking compliance with the CSS 3 Selector spec. Here are some of the selectors that were removed:

  • E:root – Rarely used in HTML. You already know what the root node is – it’s named ‘html’.
  • E:empty – This might be useful if it could include empty whitespace text nodes, but it doesn’t. This will only match elements like <img/> and <hr/>, for whatever use that is.
  • E:lang(fr) – This could be achieved in so many other ways – but in the end, how many multi-language-on-the-same-page sites are there?
  • E:nth-of-type(n) – I’m not sure what the motivation was for creating all the -of-type methods. I’m sure it sounded great on paper, but in the world of HTML it’s not very useful.
  • E:nth-last-child(n) – Another “great on paper” method. Don’t think I’ve ever seen it used.
  • E:nth-last-of-type(n)
  • E:first-of-type
  • E:last-of-type
  • E:only-of-type
  • E:only-child – When does this occur? And why would you need to select it?
  • E ~ F – Only selects following siblings, in one direction. Why a ~? Why only one direction?
  • E + F – Only the next element – rarely useful.
  • E[foo~="bar"] – Only matches values in a space-separated list. This is only useful for classes (which is taken care of with .class) and the rel attribute. Why not just use *=?
  • E[hreflang|="en"] – Another selector that is really only useful for a single attribute – and not a popular one, at that.

What’s fascinating is that no one has ever, ever, requested that these features be added back in. They have virtually zero real-world use and applicability. In fact, with the exception of “E + F” all of these selectors were added, exclusively, in the CSS 3 specification. I’m not completely sure what the thought process was in selecting them, but it’s pretty obvious that it wasn’t grounded in application, but in theory (which isn’t really the spec writers’ fault, considering that there were very few CSS 2-compatible implementations at the time).

Only later, after performance test suites started to arrive, did people start to care about the existence of – and the performance of – these selectors (which is why selectors like +, ~, and [foo~=bar] now exist in jQuery).

To make up for the shoddy offering of current CSS selectors, JavaScript libraries have had to write whole supersets of selector functionality. For example, jQuery includes both new selectors (such as “:hidden” and “:has()”) and new selector methods (like “.parent()” and “.prev()”) – all of which provide the user with phenomenally more functionality and clarity than what is in CSS 3.

Now, I’m sure I’ll probably get lots of feedback saying “but ‘E + F’ can be useful, look at this example” or “of course ~= is useful, you can use it on rel attributes” – that’s not the point. The fact is that they are woefully unused, to the point that they are a burden upon the implementors of the specification. What’s the point of implementing the above features – or more importantly, optimizing the above features for speed – if no one is using them?

Which leads me to my next bone to pick:

Performance isn’t Compliance

Everyone and their brother seems to use the SlickSpeed selector speed test suite. That’s fine; as far as implementation goes it’s a pretty good take on the matter. It runs quickly, spits out pretty results – users love it. However, it’s doing two things – and that’s one thing too many: it’s testing for both performance AND compliance of the selector engines. For example, if a user were to run the tests and see poor performance for, oh say, :nth-child(2n+1), they would be shocked, nay, appalled at the overall performance of that selector engine. But here’s the rub: that’s from a selector that is virtually unused. (:nth-child is occasionally useful, in and of itself, but the An+B syntax is virtually worthless.) SlickSpeed does not care about this point, though – since it’s also testing for compliance, in addition to performance, all tests are treated equally and “without bias.”
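For readers who have not run into the An+B syntax mentioned above: a child at 1-based position p matches when p = A*n + B for some whole number n of 0 or more. The sketch below is only an illustration of that rule. The function names matchesAnB and matchingPositions are made up for this example and are not taken from jQuery or SlickSpeed.

```javascript
// Hypothetical helper: true when a 1-based child position matches An+B,
// i.e. when position === a * n + b for some integer n >= 0.
function matchesAnB(a, b, position) {
  if (a === 0) return position === b;
  const steps = (position - b) / a;
  return Number.isInteger(steps) && steps >= 0;
}

// Hypothetical helper: list every matching position among childCount children.
function matchingPositions(a, b, childCount) {
  const positions = [];
  for (let position = 1; position <= childCount; position++) {
    if (matchesAnB(a, b, position)) positions.push(position);
  }
  return positions;
}

// :nth-child(2n+1) over six children selects the odd positions.
console.log(matchingPositions(2, 1, 6));
```

So :nth-child(2n+1) is just a long-winded way of asking for the odd children, which is part of why the general form sees so little use.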

However, that’s precisely what isn’t needed: Selectors require bias. I’ve often argued that the speed of an ID selector is far more important than the speed of an attribute selector (for example) because of how commonly it’s used. However, up until this point I’ve never had data to back up this claim. I have resolved that.

I present to you the CSS selectors most commonly passed to jQuery on 59 of the most popular jQuery-using sites (the list of sites was borrowed from the featured sites list). (Fun fact: The use of $(DOMElement) was more popular than all other selectors combined.) Here’s a small selection of the selectors found:

Selector           % Used    # of Uses
#id                51.290%   1431
.class             13.082%   365
tag                6.416%    179
tag.class          3.978%    111
#id tag            2.151%    60
tag#id             1.935%    54
#id:visible        1.577%    44
#id .class         1.434%    40
.class .class      1.183%    33
*                  0.968%    27
#id tag.class      0.932%    26
#id:hidden         0.789%    22
tag[name=value]    0.645%    18
.class tag         0.573%    16
[name=value]       0.538%    15
tag tag            0.502%    14
#id #id            0.430%    12
#id tag tag        0.358%    10

View the rest of the selectors with a full explanation…

I spidered the JavaScript of all the sites, parsed through them, and found the appropriate selectors to compile the list. Here are some things that I learned from the data:
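The post does not say how the parsing was done, so the following is only a rough sketch of one way such a tally could be built: pull string literals out of $(...) and jQuery(...) calls with a regular expression, reduce each selector to its general shape (#id, .class, tag and so on), and count the shapes. The function names extractSelectors, selectorShape and tallyShapes are invented for this example, and a regular expression will miss selectors that are built up dynamically.

```javascript
// Hypothetical helper: find string literals passed as the first argument
// to $() or jQuery(). HTML snippets such as "<div/>" are skipped.
function extractSelectors(source) {
  const pattern = /(?:\$|jQuery)\(\s*(["'])(.*?)\1/g;
  const found = [];
  for (const match of source.matchAll(pattern)) {
    const text = match[2].trim();
    if (text && !text.startsWith("<")) found.push(text);
  }
  return found;
}

// Hypothetical helper: reduce a concrete selector to its general shape,
// e.g. "#nav li a" becomes "#id tag tag". Pseudo-selectors are kept as-is.
function selectorShape(selector) {
  const token = /#[\w-]+|\.[\w-]+|\[[^\]]*\]|:[\w-]+(?:\([^)]*\))?|[a-zA-Z][\w-]*/g;
  return selector
    .trim()
    .replace(token, (part) => {
      if (part[0] === "#") return "#id";
      if (part[0] === ".") return ".class";
      if (part[0] === "[") return part.includes("=") ? "[name=value]" : "[name]";
      if (part[0] === ":") return part;
      return "tag";
    })
    .replace(/\s+/g, " ");
}

// Hypothetical helper: count shapes across several script sources and
// return [shape, count] pairs, most common first.
function tallyShapes(sources) {
  const counts = new Map();
  for (const source of sources) {
    for (const selector of extractSelectors(source)) {
      const shape = selectorShape(selector);
      counts.set(shape, (counts.get(shape) || 0) + 1);
    }
  }
  return [...counts.entries()].sort((a, b) => b[1] - a[1]);
}

// Made-up script text, for illustration only.
const sample = `
  $("#header").hide();
  $('#footer').show();
  jQuery("div.item").addClass("on");
  $("#nav li a").click(fn);
  $("<div/>").appendTo(document.body);
`;

console.log(tallyShapes([sample]));
```

Dividing each count by the total number of selectors found would give percentages in the same style as the table above.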

There doesn’t seem to be a correlation between performance and selector use. For example, “.class” is far more popular than “tag.class” even though the second one is much more performant. What’s especially important about this is that the degree of performance hit isn’t that much of an issue. For example, the difference between 4ms and 30ms is virtually imperceptible. Instead there is an overwhelming trend towards simpler selectors. Obviously, user education could help, but it’s unclear as to how much that will change things in the end.

A couple of jQuery’s custom selectors are immensely popular: :visible, :hidden, and :selected. However, it is unclear how useful they would be outside of a JavaScript-based CSS selector engine (there’s no real point in styling a hidden element).

A bunch of jQuery’s convenience selectors (:checkbox, :radio, :input, etc.) would be quite useful within a CSS selector spec – and it’s good to see them in wide use here.

There are a bunch of unexpected queries in use: “*”, “.class .class”, “[name=value]”, and “#id #id”. These types of queries are grossly under-represented in current performance test suites.

There’s one thing that needs to be taken from all of this data, though: speed test suites need to test reality rather than specification.

I’m perfectly ok with having two completely separate suites, one focusing on speed and one focusing on compliance, however mixing them does no one any favors: Users get a confused perception and suite authors (and browser vendors!) waste time dealing with optimizing things that don’t matter.

My proposal: a standardized performance suite (basing it on SlickSpeed is fine), populated with selectors comparable to the ones shown above and weighted by their relevance. Thus, the speed of the #id selector should account for 51.29% of the total final score. This means that being 3x faster at this test would be roughly 102x more important than becoming 3x faster at “tag tag”. This is absolutely not represented in any current test suite and needs to be rectified. Of course, I’ll be happy to seed my selector list with results from popular sites that use other selector libraries.
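To make the weighting concrete, here is a minimal sketch of one possible scoring formula: a usage-weighted mean of per-selector timings, where lower is better. The formula, the weightedScore function and the timings are all assumptions made for illustration. Only the usage percentages come from the table above, and just four rows are used to keep it short.

```javascript
// Usage shares (percent) copied from four rows of the table above.
const usageShare = {
  "#id": 51.290,
  ".class": 13.082,
  "tag": 6.416,
  "tag tag": 0.502,
};

// Hypothetical scoring function: usage-weighted mean time in milliseconds.
function weightedScore(timingsMs, shares) {
  let weightedTotal = 0;
  let shareTotal = 0;
  for (const [selector, share] of Object.entries(shares)) {
    if (!(selector in timingsMs)) {
      throw new Error("missing timing for selector: " + selector);
    }
    weightedTotal += share * timingsMs[selector];
    shareTotal += share;
  }
  return weightedTotal / shareTotal;
}

// Made-up timings, for illustration only.
const baseline = { "#id": 9, ".class": 9, "tag": 9, "tag tag": 9 };
const fasterId = { ...baseline, "#id": 3 };         // 3x faster at #id
const fasterTagTag = { ...baseline, "tag tag": 3 }; // 3x faster at "tag tag"

const baselineScore = weightedScore(baseline, usageShare);
const gainFromId = baselineScore - weightedScore(fasterId, usageShare);
const gainFromTagTag = baselineScore - weightedScore(fasterTagTag, usageShare);

// Ratio of the two improvements; equals 51.290 / 0.502, roughly 102.
console.log((gainFromId / gainFromTagTag).toFixed(1));
```

Under this formula the same 3x speed-up moves the final score about 102 times further when it lands on #id than when it lands on “tag tag”, which is where the 102x figure comes from.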

I’ll be fine constructing this suite, as well – I just want to make sure that there’s enough interest. I think this proposal has a lot of merit and should be strongly considered – the result will be a selector performance suite which will benefit everyone.

Posted: February 12th, 2008

