Blog
June 19th, 2009
I just posted a run down of some of the new DOM Traversal APIs in Firefox 3.5. The first half of the post is mostly a recap of my old Element Traversal API post.
The second half of the post is all about the new NodeIterator API that was just implemented. For those that are familiar with some of the DOM TreeWalker APIs this will look quite familiar.
It's my opinion, though, that this API is, at best, bloated, and at worst incredibly misguided and impractical for day-to-day use.
Observe the method signature of createNodeIterator:
var nodeIterator = document.createNodeIterator(
root, // root node for the traversal
whatToShow, // a set of constants to filter against
filter, // an object with a function for advanced filtering
entityReferenceExpansion // if entity reference children should be expanded
);
This is excessive for what should be, at most, a simple way to traverse DOM nodes.
To start, you must create a NodeIterator using the createNodeIterator method. This is fine except this method only exists on the Document node - which is especially strange since the first argument is the node which should be used as the root of the traversal. The first argument shouldn't exist and you should be able to call the method on any DOM element, document, or fragment.
Second, in order to specify which types of nodes you wish to see you need to provide a number (which is the result of combining various constants) that the results will be filtered against. This is pretty insane so let me break this down. The NodeFilter object contains a number of properties representing the different types of nodes that exist. Each property has a number associated with it (which makes sense, this way the method can uniquely identify which type of node to look for). But then the crazy comes in: In order to select multiple, different, types of nodes you must OR together the properties to create a resulting number that'll be passed in.
For example if you wanted to find all elements, comments, and text nodes you would do:
NodeFilter.SHOW_ELEMENT | NodeFilter.SHOW_COMMENT | NodeFilter.SHOW_TEXT
I'm not sure if you can get a much more counter-intuitive JavaScript API than that (you can certainly expect little, to no, common developer adoption, that's for sure).
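To make the bit-twiddling concrete: each SHOW_* flag is the bit at position (nodeType - 1), so the mask can be derived from ordinary node types. Here's a minimal sketch, assuming a hypothetical helper named whatToShowFor (it is not part of any DOM API); the numeric constants are hard-coded so that it runs outside of a browser.

```javascript
// Node type numbers as defined by the DOM (hard-coded so this runs anywhere).
var ELEMENT_NODE = 1, TEXT_NODE = 3, COMMENT_NODE = 8;

// Hypothetical helper: build a whatToShow mask from a list of node types.
// Each NodeFilter.SHOW_* flag is the bit at position (nodeType - 1).
function whatToShowFor() {
  var mask = 0;
  for ( var i = 0; i < arguments.length; i++ ) {
    mask |= 1 << (arguments[i] - 1);
  }
  return mask;
}

// Same value as NodeFilter.SHOW_ELEMENT | NodeFilter.SHOW_COMMENT | NodeFilter.SHOW_TEXT
var mask = whatToShowFor( ELEMENT_NODE, COMMENT_NODE, TEXT_NODE ); // 133 (0x85)
```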
Next, the filter argument accepts an object that has a method (called acceptNode) which is capable of further filtering the node results before being returned from the iterator. This means that the function will be called on every applicable node (as specified by the previous whatToShow argument).
Two points to consider:
- The filter argument must be an object with a property named 'acceptNode' that has a function as a value. It can't just be a function for filtering, it must be enclosed in a wrapper object. Update: Actually, this isn't true - at least with Mozilla's implementation you can pass in just a function. Thanks for the tip, Neil!
- The argument is required (even though you can pass in null, making it equivalent to accepting all nodes).
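If you'd rather write a plain true/false function, the wrapper object is easy enough to generate. A rough sketch, assuming a hypothetical toNodeFilter helper (my name, not something from the specification); the FILTER_* numbers mirror the NodeFilter constants and are hard-coded so that the snippet runs anywhere.

```javascript
// Return codes defined on NodeFilter (hard-coded so this runs anywhere).
var FILTER_ACCEPT = 1, FILTER_SKIP = 3;

// Hypothetical helper: wrap a plain true/false predicate in the
// { acceptNode: fn } object shape that createNodeIterator expects.
function toNodeFilter( predicate ) {
  return {
    acceptNode: function( node ) {
      return predicate( node ) ? FILTER_ACCEPT : FILTER_SKIP;
    }
  };
}

// Usage: a filter that only accepts anchors.
var anchorsOnly = toNodeFilter(function( node ) {
  return node.nodeName.toUpperCase() === "A";
});
```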
The last argument, entityReferenceExpansion, comes in to play when dealing with XML entities that also contain sub-nodes (such as elements). For example, with XML entities, it's perfectly valid to have a declaration like <!ENTITY aname "<elem>test</elem>"> and then later in your document have &aname; (which is expanded to represent the element). While this may be useful for XML documents it is way out of the scope of most web content (thus the argument will likely always be false).
So, in summary, createNodeIterator has four arguments:
- The first of which can be removed (by making the method available on elements, fragments, and documents).
- The second of which is obtuse and should be optional (especially in the case where all nodes are to be matched).
- The third which requires a superfluous object wrapping and should be optional.
- The fourth of which should be optional.
None of this actually takes into account the actual iteration process. If you look at the specification you can see that all the examples are in Java - and when seeing this a lot of the API decisions start to make more sense (not that it really applies to the world of web-based development, though). In JavaScript one doesn't really use iterators, more typically an array is used instead. (In fact a number of helpers have been added in ECMAScript 5 which make the iteration and filtering process that much simpler.)
I'd like to propose the following, new, API that would exist in place of the NodeIterator API (dramatically simplifying most common interactions, especially on the web).
// Get all nodes in the document
document.getNodes();
// Get all comment nodes in the document
document.getNodes( Node.COMMENT_NODE );
// Get all element, comment, and text nodes in the document
document.getNodes( Node.ELEMENT_NODE, Node.COMMENT_NODE, Node.TEXT_NODE );
I'd also like to propose the following helper methods:
// Get all comment nodes in the document
document.getCommentNodes();
// Get all text nodes in a document
document.getTextNodes();
Beyond finding elements, finding comments and text nodes are the two most popular queries types that I see requested.
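None of this exists in any browser, but the behavior I have in mind is simple enough to mock up. Here's a rough sketch, written as a standalone getNodes( root, ... ) function rather than a method, and assuming that only descendants of the root are returned (in document order). It relies on nothing more than childNodes and nodeType.

```javascript
// Hypothetical stand-in for the proposed document.getNodes():
// walk the tree in document order and collect matching nodes into an array.
// With no node types given, every node below root is returned.
function getNodes( root /*, nodeType, ... */ ) {
  var types = Array.prototype.slice.call( arguments, 1 );
  var results = [];

  (function walk( node ) {
    var children = node.childNodes || [];
    for ( var i = 0; i < children.length; i++ ) {
      var child = children[i];
      if ( types.length === 0 || types.indexOf( child.nodeType ) !== -1 ) {
        results.push( child );
      }
      walk( child );
    }
  })( root );

  return results;
}
```

In a page this would be called as getNodes( document, Node.COMMENT_NODE ), and the helper methods are then one-liners on top of it.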
Consider the code that would be required to recreate the above using NodeIterator:
// Get all nodes in the document
document.createNodeIterator(document, NodeFilter.SHOW_ALL, null, false);
// Get all comment nodes in the document
document.createNodeIterator(document, NodeFilter.SHOW_COMMENT, null, false);
// Get all element, comment, and text nodes in the document
document.createNodeIterator(document,
NodeFilter.SHOW_ELEMENT | NodeFilter.SHOW_COMMENT | NodeFilter.SHOW_TEXT,
null, false
);
This proposed API would return an array of DOM nodes as a result (instead of a NodeIterator object). You can compare the difference in results between the two APIs:
NodeIterator API
var nodeIterator = document.createNodeIterator(
  document,
  NodeFilter.SHOW_COMMENT,
  null,
  false
);
var node;
while ( (node = nodeIterator.nextNode()) ) {
node.parentNode.removeChild( node );
}
Proposed API
document.getCommentNodes().forEach(function(node){
node.parentNode.removeChild( node );
});
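Until something like that exists you can get most of the way there by draining the iterator into an array yourself. A minimal sketch, assuming a hypothetical toArray helper; all it needs is an object with a nextNode() method.

```javascript
// Hypothetical helper: drain anything with a nextNode() method
// (such as a NodeIterator) into a plain array.
function toArray( nodeIterator ) {
  var nodes = [], node;
  while ( (node = nodeIterator.nextNode()) ) {
    nodes.push( node );
  }
  return nodes;
}

// In a browser this could be used as:
// toArray( document.createNodeIterator(document, NodeFilter.SHOW_COMMENT, null, false) )
//   .forEach(function(node){ node.parentNode.removeChild( node ); });
```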
Another example, if we were to find all elements with a node name of 'A'.
NodeIterator API
var nodeIterator = document.createNodeIterator(
  document,
  NodeFilter.SHOW_ELEMENT,
  {
    acceptNode: function(node){
      return node.nodeName.toUpperCase() === "A";
    }
  },
  false
);
var node;
while ( (node = nodeIterator.nextNode()) ) {
node.className = "found";
}
Proposed API
document.getNodes( Node.ELEMENT_NODE ).forEach(function(node){
if ( node.nodeName.toUpperCase() === "A" )
node.className = "found";
});
Almost always, when finding some of the crazy intricacies of the DOM or CSS, you'll find a legacy of XML documents and Java applications - neither of which have a strong application to the web as we know it or to the web as it's progressing. It's time to divorce ourselves from these decrepit APIs and build ones that are better-suited to web developers.
Update: An even better alternative (rather than using constants representing node types) would be something like the following:
document.getNodes( Element, Comment, Text );
Just refer back to the base objects representing each of the types that you want.
Tags: dom, javascript, w3c
54 Comments on 'Unimpressed by NodeIterator'
February 12th, 2009
Last year I did some work on implementing a Selectors API Test Suite, which I've just updated to run in IE 8.
I've uploaded a copy of the suite here:
http://ejohn.org/apps/selectortest/
You can get the source here:
http://github.com/jeresig/selectortest/tree/master
For right now I'm getting the following result:
- WebKit Nightly 99.3% (16 failing - doesn't support complex :not() expressions)
- Firefox Nightly 99.3% (16 failing - doesn't handle 'undefined' being passed in, correctly)
- IE 8 RC 1 45.9% (1171 failing - Major problem areas are lack of whitespace trimming, incorrect exceptions being thrown, and lack of full CSS 3 selector support)
- Opera 10a1 99.0% (22 failing - Empty string checking in attributes fails and some disconnected checkbox checks fail)
I've also written about the improvements that querySelectorAll is bringing to web developers, along with some of the hardships associated with it.
Tags: dom, javascript, selectors, w3c
25 Comments on 'Selectors API Test Suite in IE 8'
November 24th, 2008
While looking for improvements to injecting HTML fragments into a document (which I mentioned, in passing, when I looked at using Document Fragments) I decided to spend some more time with Internet Explorer's insertAdjacentHTML method.
This method has been in Internet Explorer since version 4.0 - and is also in the current release of Opera - and allows you to inject fragments of well-formed HTML into a variety of locations in a document.
The locations work as such (I list the equivalent terminology):
| .insertAdjacentHTML("beforeBegin", ...) | before |
| .insertAdjacentHTML("afterBegin", ...) | prepend |
| .insertAdjacentHTML("beforeEnd", ...) | append |
| .insertAdjacentHTML("afterEnd", ...) | after |
The method is only available on DOM elements (which makes sense) and is easy to use:
var ul = document.getElementById("list");
ul.insertAdjacentHTML("beforeEnd", "<li>A new li on the list.</li>");
ul.insertAdjacentHTML("beforeEnd", "<li>Another li!</li>");
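For comparison, the four positions map onto ordinary DOM calls. Here's a rough sketch, assuming a hypothetical insertAdjacentNode helper that takes an already-built node rather than an HTML string (so it sidesteps the HTML parsing, which is the hard part).

```javascript
// Hypothetical helper: insert `node` relative to `target` using the
// same position names as insertAdjacentHTML (case-insensitive).
function insertAdjacentNode( target, position, node ) {
  switch ( position.toLowerCase() ) {
    case "beforebegin": // before
      return target.parentNode.insertBefore( node, target );
    case "afterbegin": // prepend
      return target.insertBefore( node, target.firstChild );
    case "beforeend": // append
      return target.appendChild( node );
    case "afterend": // after
      return target.parentNode.insertBefore( node, target.nextSibling );
    default:
      throw new Error( "Unknown position: " + position );
  }
}
```

Passing null as the second argument to insertBefore appends the node, which is why the "afterEnd" case still works when the target is the last child.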
At first glance the method appeared to work well and seemed to be relatively fast. Two questions remained, though: How fast is it in comparison to using the Document Fragment technique I outlined before and does it work for all the strange use-cases that exist?
- I created a test case to compare the three types of injection: The type we've been using in jQuery prior to the upcoming 1.3 release, the new Document Fragment technique we'll be using in jQuery 1.3, and a case using insertAdjacentHTML (where applicable). While both the Document Fragment and insertAdjacentHTML cases were significantly faster than the old techniques used in jQuery the Document Fragment technique ended up being marginally faster in IE 6 (50ms vs. 80ms for insertAdjacentHTML).
- There's a huge problem with insertAdjacentHTML: It doesn't work on all HTML elements in IE 6 (specifically it doesn't work on table, tbody, thead, or tr elements). Having gaps in the functionality is very undesirable (attempting to use insertAdjacentHTML on those elements causes an exception to pop up in IE 6).
- It doesn't work on XML documents. Of course neither does innerHTML (at least not until browsers start to implement HTML 5 more completely). We're stuck doing the traditional techniques used in libraries like jQuery.
So why spend all this time talking about a method that is relatively half-baked in the main browser that implements it? Because it's going to be part of the HTML 5 specification. This means that we're going to see a larger number of browsers start to implement this method (and hopefully it'll encourage existing vendors to implement it more completely and efficiently).
Having browsers implement this method will dramatically reduce the amount of code needed to write a respectable JavaScript library. I'm looking forward to the day in which this method is more-widely available (along with querySelectorAll) so that we can really buckle down and do some serious code simplification.
Tags: dom, w3c, html5, ie
14 Comments on 'DOM insertAdjacentHTML'
November 19th, 2008
The web is changing. Historically it's been painfully easy to request resources from remote locations (such as stylesheets, scripts, images, and loading pages in iframes) - but this has brought along a whole world of security issues that browsers are continuing to try and resolve.
This openness has come to define what web development is all about: Dead simple sharing of resources and ability to get started. It's very likely that the lack of restrictions placed on these historical page elements will continue to plague browser developers for many years to come.
That doesn't mean that browsers have to make the same "mistakes" going forward.
This is where the new W3C Access Control specification comes into play.
All the new cross-domain-capable technologies that are coming to browsers will be requiring the use of Access Controls from the get-go, including:
The Access Control specification has been one of the most-rapidly-changing specifications that I've seen. I wrote a demo early this year and have had to update it at least twice since then in order to match the updated APIs - and it appears as if they may have even changed again.
Right now the specification requires that any resource that you wish to make accessible in a cross-domain manner must include an extra header specifying which site(s) are allowed to access it.
If you wish to allow any site to access your resource you would use:
Access-Control-Allow-Origin: *
and if you only wanted one other domain to access it you would use:
Access-Control-Allow-Origin: http://ejohn.org
This is important: It now means that site owners must make a conscious decision to enable cross-domain access of their resources (in contrast to images, stylesheets, and scripts which are always made available cross-domain with no way to disable it).
(There are a number of other headers specified by the Access Control specification, for more fine-grained access.)
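On the server this comes down to a small decision per request. Here's a minimal sketch, assuming a hypothetical accessControlHeaders helper and the header syntax shown above; how the headers actually get attached to the response depends on your server.

```javascript
// Hypothetical helper: decide which Access-Control-Allow-Origin header
// (if any) to send for a request coming from `origin`.
// `allowed` is either "*" or an array of origins that may read the resource.
function accessControlHeaders( origin, allowed ) {
  if ( allowed === "*" ) {
    return { "Access-Control-Allow-Origin": "*" };
  }
  if ( origin && allowed.indexOf( origin ) !== -1 ) {
    // Echo back the single matching origin.
    return { "Access-Control-Allow-Origin": origin };
  }
  // No header: the resource stays off-limits to cross-domain requests.
  return {};
}
```

The important part is the last branch: doing nothing means the resource is not shared, which is exactly the opt-in behavior described above.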
There is going to be a lot of confusion and anger regarding this large, fundamental, change to the style of these upcoming APIs: They aren't like the web that we know and love!
The best write-up that I've seen, to date, was by Jonas Sicking of Mozilla on the Ogg Mailing lists. This one section is particularly poignant:
Why not use the same policy as for <img>?
Yes, we could definitely do the same for <video> as we have for <img>. But it will come with the same downsides. It will mean that we will have to be much more cautious with how we develop the API for
There are already discussions about API features that we could not allow if we allowed cross-site video without Access-Control (or similar) protection. We would not be allowed to have callbacks for captions where the captions are handed to page javascript to be displayed in the page. This would allow an internet site to get captions from board presentation videos hosted on intranet sites, something that is obviously not acceptable.
We could say that the captions callback would work, but only if the video was loaded from same site, or if had the Access-Control-Allow-Origin:* header set. However this will likely result in random bugs like captions sometimes failing since the developer had perfect hearing and so didn't do a lot of testing with captions. In general accessibility is hard enough to get people to do correct that I'm reluctant to add features that work great as long as you don't take accessibility into account, but where you have to take extra steps to get accessibility to work.
Similar arguments goes for accessing the size of the video file (for example through progress events). We can not allow that to work for cross-site loads unless the site has opted in. This is because we likely won't know that it's actually a video that is being downloaded until after the first progress events have been fired. This means that you could use <video> to measure file sizes for arbitrary files that are otherwise protected by firewalls and/or logins.
If we always restrict usage of <video> to the cases when we know that the video is private data we will be much more free to develop APIs and functionality since we won't have to worry about protecting the data inside it, or deal with error conditions when someone tries to use sensitive APIs from a cross-site loaded video that didn't have Access-Control-Allow-Origin:*.
I recommend that you take the time to read his whole piece as it's worth it to gain a full understanding of the problems at play here (especially related to the <video/> tag).
One thing is clear: Security is being addressed center-stage in the new web APIs. This is going to be good as it'll prevent horrible security bugs going forward while, at the same time, change the landscape of web development in a very fundamental way. The web had its fun but now reality is starting to set in - it's time to get to work.
I'm reminded of the recent release of a crazy hack: transmitting data via URL encoded strings in stylesheets, named CSSHttpRequest. It's an insane technique (in the best possible sense of the word) and well outside the realm of most users. Even though the syntax and technique are different, the security/information-leak implications of this are every bit as real as those presented by JSONP.
Tags: browsers, w3c
3 Comments on 'The March of Access Control'
November 10th, 2008
Like many developers who had seen the work-in-progress CSS3 Layout specification I was immediately horrified. As one commenter on Reddit said: "Argh. ASCII-art drawing for columns?" which summarizes my initial feeling pretty well.
Now I felt that way until seeing this example from the CSS3 Layout spec document:
"a . b . c" /2em
". . . . ." /1em
"d . e . f"
". . . . ." /1em
"g . h . i" /2em
5em 1em * 1em 10em
I could immediately determine what the template was trying to do and how the document was going to look. Even if it is kind of crazy at first glance I'm dying for something like this to be implemented. To create an equivalent document using the CSS that we have now - or even Tables - would be absolutely futile.
Even the syntax isn't that bad when you look at it. When examining the example I can see that there are three significant rows of content (two of which are 2em high and one of which will expand to fill the full height) and two spacer rows (each is 1em high). Thinking about how to implement something like this using normal CSS now makes my mind explode in frustration - especially in a cross-browser manner.
So while this templating layout is still, very much, in the pipe dream category (no one will even touch it until IE implements it) I think it has a lot of merit and should be strongly examined - especially beyond the initial shock of the new syntax.
Honestly, this is just a goldmine waiting for some enterprising developer to come along and use the syntax to build a solution that'll work in all current browsers (maybe a server-side, or JavaScript, tool that'll process the template and inject the right style rules using a grid CSS framework).
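The first step of such a tool, reading the template, is not much code. Here's a rough sketch, assuming a hypothetical parseTemplate function and assuming that slots are separated by whitespace as they are in the example above (the draft specification may well define the grammar differently). Rows without an explicit height are reported as null.

```javascript
// Hypothetical parser for the template syntax in the example above.
// Each quoted line is a row of slots ("." is a spacer), optionally
// followed by "/height"; an unquoted last line lists the column widths.
function parseTemplate( lines ) {
  var template = { rows: [], columns: [] };

  lines.forEach(function( line ) {
    var match = /^\s*"([^"]*)"\s*(?:\/\s*(\S+))?\s*$/.exec( line );
    if ( match ) {
      template.rows.push({
        slots: match[1].trim().split( /\s+/ ),
        height: match[2] || null
      });
    } else if ( line.trim() ) {
      template.columns = line.trim().split( /\s+/ );
    }
  });

  return template;
}

var layout = parseTemplate([
  '"a . b . c" /2em',
  '". . . . ." /1em',
  '"d . e . f"',
  '". . . . ." /1em',
  '"g . h . i" /2em',
  '5em 1em * 1em 10em'
]);
```

From there a tool would still have to turn the rows and columns into positioned boxes, which is where the real work is.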
Tags: css3, css, w3c
27 Comments on 'CSS3 Template Layout'
November 10th, 2008
A little while ago a nightly of Firefox 3.1 included support for the new Element Traversal API proposed by the W3C.
The purpose of this proposal is to make it easier for developers to traverse through DOM elements without having to worry about intermediary text nodes, comment nodes, etc. This has long been a bane of web developers, in particular, with cases like document.documentElement.firstChild yielding different results depending on the whitespace structure of a document.
The Element Traversal API introduces a number of new DOM node properties which can make this traversing much simpler.
Here's a full break-down of the existing DOM node properties and their new counterparts:
| Purpose | All DOM Nodes | Just DOM Elements |
| First | .firstChild | .firstElementChild |
| Last | .lastChild | .lastElementChild |
| Previous | .previousSibling | .previousElementSibling |
| Next | .nextSibling | .nextElementSibling |
| Length | .childNodes.length | .childElementCount |
These properties provide a fairly simple addition to the DOM specification (and, honestly, they're something that should've been in the specification to begin with).
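For browsers that don't have these properties yet, the same results can be had with a loop and a nodeType check, which is more or less what libraries do today. Here's a minimal sketch with three hypothetical fallback functions (named after the properties they stand in for):

```javascript
var ELEMENT_NODE = 1;

// Hypothetical fallbacks for browsers without the Element Traversal
// properties: skip over text and comment nodes by checking nodeType.
function firstElementChild( node ) {
  var child = node.firstChild;
  while ( child && child.nodeType !== ELEMENT_NODE ) {
    child = child.nextSibling;
  }
  return child || null;
}

function nextElementSibling( node ) {
  var sibling = node.nextSibling;
  while ( sibling && sibling.nodeType !== ELEMENT_NODE ) {
    sibling = sibling.nextSibling;
  }
  return sibling || null;
}

function childElementCount( node ) {
  var count = 0;
  for ( var el = firstElementChild( node ); el; el = nextElementSibling( el ) ) {
    count++;
  }
  return count;
}
```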
There is one property that is conspicuously absent, though: .childElements (as a counterpart to .childNodes). This property (which contained a live NodeSet of the child elements of the DOM element) was in previous iterations of the specification but it seems to have gone on the cutting room floor at some point in the interim.
But all is not lost. Right now Internet Explorer, Opera, and Safari all support a .children property which provides a super-set of the functionality that was supposed to have been made possible by .childElements. When support for the Element Traversal API finally landed in Firefox 3.1, support for .children was included. This now means that every major browser will support this property (far in advance of all browsers supporting the rest of the true Element Traversal specification).
I think that the Element Traversal spec is missing a huge opportunity here to specify something that has become a de facto standard amongst browsers. Maybe it'll make the second version of the Element Traversal spec, heh.
There are two big points that need to be explored here:
- Now that the .children property is virtually everywhere how can we start to use it to simplify our code?
- Can we use .children, or parts of the Element Traversal API, to help speed up existing code?
To answer these questions I mocked up a quick little plugin for jQuery that replaces the internals of the existing .prev(), .next(), .prevAll(), .nextAll(), .siblings(), and .children() methods with .children and the Element Traversal API methods.
The resulting code is absolutely simpler - previously there were numerous checks to see if a Node was, or was not, a DOM element - which resulted in lots of extra kludgy methods to handle those cases. But was the code faster?
I plugged the code into Dromaeo to see if there was any speed up in Firefox 3.1. The result? There is no discernible speed improvement to using the new Element Traversal properties (.firstElementChild, etc.). This isn't, necessarily, a bad thing - we just got the same performance that we see now but with a better API.
However there is a large improvement in speed when using .children (for the .siblings() and .children() jQuery methods). With this addition .siblings() is 84% faster and .children() is 35% faster. Considering that the .children property is now available in all browsers it's making a lot of sense for people to get on board and start using it in their code bases for a definite hot path to extra performance. (Although, this is definitely not a new revelation - with frameworks like Dojo having used .children in their selector code for quite some time now.)
If nothing else the argument for having a simple branch in your code to handle using .children is absolutely becoming more compelling.
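That branch can be tiny. Here's a minimal sketch, assuming a hypothetical childElements helper (this is not the code from the plugin mentioned above):

```javascript
// Hypothetical helper: use the native .children collection when the
// browser provides it, otherwise filter .childNodes down to elements.
function childElements( el ) {
  var source = el.children || el.childNodes;
  var results = [];
  for ( var i = 0; i < source.length; i++ ) {
    if ( source[i].nodeType === 1 ) {
      results.push( source[i] );
    }
  }
  return results;
}
```

The nodeType check is kept in both branches as a cheap safeguard, and the result is a plain array rather than a live collection.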
Tags: w3c, javascript, dom
9 Comments on 'Element Traversal API'
August 9th, 2008
Decided to start something new with Dion Almaer (of Ajaxian.com fame) and Alex Russell (of Dojo fame) - a podcast! We talk about the "Open Web" (the topics break down into standards, development, JavaScript, tools - all sorts of things). We're working to get it up on iTunes (I'll be sure to update when that's the case) but in the meantime you can subscribe to the RSS feed.
Dion has written up a nice overview of the first episode, which I've included below:
Welcome to the inaugural episode of a new podcast to cover news, happenings, and our opinions on the Open Web (download the Open Web Podcast episode one directly or subscribe to it). When I say “our” I am talking about the founding podcasters: Alex Russell, John Resig, and myself. It is a pleasure to be able to share air time with two of the real leaders of the Open Web, and specifically the Ajax space thanks to Dojo and jQuery.
What is the state of the Open Web?
That is how we started out the podcast, and we got to see very different opinions. John discusses the decentralization and new openness that we see across the Web. Alex was a little more wary, and talks about how he wants the Open Web to progress faster. He noted that a lot of the good work has been a little away from the client, and instead in the area of identity, transport, and formats.
We then move on to HTML 5, where we discuss items in Mark Pilgrim’s This Week in HTML 5 piece including Web Workers (think: Gears Workers), and the clarification of alt tag usage in the img tag to have you using alt="{diagram}" and the like.
We have a detailed chat about Web Workers, and where we see them being useful. John talks about issues around not being able to talk to the DOM, Alex talks about mashups, and I talk about some tests showing how they can help performance in a few areas. Matthew Russell did a demo using the Dojo 2d code at OSCON, and showed how he doubled the performance by pushing out computation into a Worker. John also talked about a special case for passing DOM fragments or the like to a Worker with special serialization. Of course, security is a concern for all of this.
John brought up the new data- embedding tactic that showed up in the HTML 5 spec. A conversation ensued around how you should separate your data from presentation. Is the DOM there to store data? Isn’t it a good place to keep it? Is “data-” just too long?
It is exciting to think that the W3C Selectors API will soon be implemented in Firefox 3.1, Safari 3, IE 8, and probably Opera 10. That seemed to happen pretty quickly. John and Alex talk about how this is going to mean a lot of chopping code from their frameworks, the increase in performance, and the subtle differences between the spec and how they were doing things.
The discussion leads to a new feature, named scoped CSS, that allows you to say “this CSS only works over here.” This could be huge, especially if you have an application such as a CMS, where people upload their own content that can mess with your application structure itself.
Next, we delve into the world of Firebug. John talks about how Firebug development is being bootstrapped by Mozilla and other contributors, and he discusses the upcoming versions and what you can expect. Stability and performance are top of the list. Don’t forget the Firebug Lite improvements too, which mean that you get more than just console to play with in non-Firefox browsers. I just posted the notes on that meeting, kindly taken by Steve Souders.
We talked about the Open Web Foundation, and Alex discussed what he would like to see come of it. He is optimistic, and thinks that the real test will be if we see the incubation of projects that really push the Web on the client side, as well as the identity side.
Finally, there is news in the Dojo community and Alex spills the beans. After over 4 years of service, Alex is stepping down as the project lead of Dojo, and handing over the reins to Peter Higgins who has shown great chops as both a committer and an external leader. We wish Pete the best of luck! Alex isn’t sneaking off into the sunset though, as he talks about in his post on the subject, he will still be an active member of the Dojo community for a long time to come.
Finally, thanks again to John and Alex for taking the time to start this up with me. Please let us know what you think, and what you would like us to talk about.
Tags: html5, whatwg, w3c, podcast
10 Comments on 'Open Web Podcast #1'
July 10th, 2008
This week I've been busy working on implementing a test suite for the Selectors API specification. I picked up a new microphone recently so I decided to do a quick walkthrough of the work that I've been doing and how I've been going about it. You can view the video below:

Implementing a Selectors API Test Suite
You can run the test suite for yourself here (it's still very much in flux - there are various parts that may still be wrong):
http://ejohn.org/apps/selectortest/
Here's a quick break down of a test run that was done earlier:
- Special Firefox 3.1 Build (73.8% - More details)
- Safari 3.1 (49.3% - No Fragment or Namespace support)
- WebKit Nightly (51% - No Fragment or Namespace support)
- Opera Gogi - "ACID3 Build" (76.7% - No Fragment support)
- IE 8 (Can't run - the file is proper XHTML so it tries to download it.)
- Firefox 3, Opera 9.5, IE 7 (0%)
The work that is being done to implement the specification in Firefox can be seen on its associated Bugzilla bug. I'm shooting very hard to make sure that everything is in place so that this makes it in to the upcoming Firefox 3.1 release (the first alpha of which is due out in a couple weeks). The benefits that this will have for both JavaScript libraries and their users will be tremendous.
Tags: css, firefox, mozilla, w3c
16 Comments on 'Implementing a Selectors API Test Suite'