Blog


XHTML, document.write, and Adsense

After some recent discussion concerning the use of document.write() in XHTML documents served with the content-type "application/xhtml+xml", I decided to revisit the problem. An issue with the solutions proposed by Sam and Ajaxian is that they aren't really solutions - just a lot of hand waving (not that that's bad, it's just that the problem is a lot harder than what they propose).

So I sat down and decided to write a semi-complete document.write() replacement for Firefox 1.5+, Opera 9, and Safari 2+ - all handling straight XHTML documents served with an "application/xhtml+xml" content-type.

Note: I completely ignore Internet Explorer. Since IE doesn't even know how to render XHTML pages (served with the correct mimetype), I'm assuming that you're doing some form of browser sniffing in your code (on the server). If that's the case, then you may be serving a different version of the page, one that does not include the (at this point) unnecessary document.write() hack. If you want to serve only one version of the code, then I suggest that you use conditional comments, or do some client-side browser sniffing to serve the hack to those that need it. (This is mostly because I have yet to find a way to reliably detect a broken document.write() implementation.)

I had a couple of goals for my solution:

  1. It should be as faithful to the normal document.write() as possible. (This means arbitrary injection of XHTML into the DOM)
  2. It should inject the XHTML into the document at the current DOM position.
  3. It should correct for basic weird things that people do (like using write to add invalid XHTML to a document - and writing out closing tags). Stuff like this:
    document.write("<iframe src='test.html'>");
    // ... some code ...
    document.write("</iframe>");
  4. It should make Google Adsense work, with no code modification.

I'll start by saying that solving this problem in Mozilla "isn't that bad", nor is it in Opera. Safari is a royal PITA, which I'll talk about more later.

The vast majority of the cross-browser issues that occur relate to how innerHTML works in XHTML documents. In order to make document.write() work as you would expect it to, you need to write out straight (X)HTML. This topic has been discussed extensively by some of the great JavaScript and standards developers in the industry.

A Solution

So I've developed a basic solution to the document.write()/XHTML problem. The full code can be found below, along with a demo of it in action here:
http://ejohn.org/apps/write.xhtml

document.write = function(str){
    var moz = !window.opera && !/Apple/.test(navigator.vendor);
       
    // Watch for writing out closing tags, we just
    // ignore these (as we auto-generate our own)
    if ( str.match(/^<\//) ) return;

    // Make sure & are formatted properly, but Opera
    // messes this up and just ignores it
    if ( !window.opera )
        str = str.replace(/&(?![#a-z0-9]+;)/g, "&amp;");

    // Watch for when no closing tag is provided
    // (Only does one element, quite weak)
    str = str.replace(/<([a-z]+)(.*[^\/])>$/, "<$1$2></$1>");
       
    // Mozilla assumes that everything in XHTML innerHTML
    // is actually XHTML - Opera and Safari assume that it's XML
    if ( !moz )
        str = str.replace(/(<[a-z]+)/g, "$1 xmlns='http://www.w3.org/1999/xhtml'");
       
    // The HTML needs to be within a XHTML element
    var div = document.createElementNS("http://www.w3.org/1999/xhtml","div");
    div.innerHTML = str;
       
    // Find the last element in the document
    var pos;
       
    // Opera and Safari treat getElementsByTagName("*") accurately
    // always including the last element on the page
    if ( !moz ) {
        pos = document.getElementsByTagName("*");
        pos = pos[pos.length - 1];
               
        // Mozilla does not, we have to traverse manually
    } else {
        pos = document;
        while ( pos.lastChild && pos.lastChild.nodeType == 1 )
            pos = pos.lastChild;
    }
       
    // Add all the nodes in that position
    var nodes = div.childNodes;
    while ( nodes.length )
        pos.parentNode.appendChild( nodes[0] );
};
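
As an aside on the ampersand step in the code above: the character class in that lookahead is lowercase-only, so a mixed-case reference such as &Eacute; or &#xA0; would be treated as a bare ampersand and escaped a second time. A minimal sketch of a case-insensitive variant follows. The function name escapeBareAmpersands is made up for illustration and is not part of the script above.

```javascript
// Hypothetical helper (not part of the original script): escape bare
// ampersands while leaving existing entity references alone.
function escapeBareAmpersands(str) {
    // Same negative lookahead as in the script above, plus the "i" flag
    // so that references like &Eacute; and &#xA0; are also skipped.
    return str.replace(/&(?![#a-z0-9]+;)/gi, "&amp;");
}

// Sample inputs, chosen only for illustration
var samples = [
    "a.html?x=1&y=2",
    "Fish &amp; chips",
    "&#160;&#xA0;&Eacute;"
];
for (var i = 0; i < samples.length; i++) {
    console.log(samples[i] + " -> " + escapeBareAmpersands(samples[i]));
}
```

Only the first sample contains a bare ampersand; the other two should pass through unchanged.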

It's important to note what this solution does - and does not - work for.

  • The code will work perfectly for well-formed XHTML markup. This code only does basic "crappy HTML" checks. For example, if you do: document.write("<img src='foo.jpg'>") it'll correct it to become well-formed (by appending a closing </img> tag). However, doing document.write("<img src='foo.jpg'> <img src='bar.jpg'>"); will break - as only the last element in the document.write() is "fixed". (And even then, the fixing isn't very smart - it just adds a closing tag, which may not always be correct.) Much of this can be fixed with some smarter regular expressions. I took a stab at it, but cross-browser support for variable negative lookaheads seems to be shaky, at best.
  • When using innerHTML in an XML document in Opera and Safari, it assumes that all elements are just XML elements. For this reason the code forcefully puts all elements in the XHTML namespace. Again, this is pretty crude and may break some of your markup, but it's worked well for me so far.
  • The only extra purification that's performed is the conversion of ampersands (&) to their entity code (&amp;) - where appropriate. If you have other symbols (like < or >), then I can't make any guarantees.
  • It's also interesting to note that two completely different methods of traversing the document had to be used. Mozilla-based browsers start acting really strange when you do getElementsByTagName("*") inline in an XHTML document. It will always work fine for the first document.write(), but all subsequent calls will revert back to the position of the last inline <script/>.
  • In the end, this is still not as good as document.write() since with .write() you can write out stuff like table rows, options, partial HTML, script elements, all without blinking an eye. The code to handle all of this is quite significant (having written the code to do it for jQuery, you can take my word for it). I don't plan on re-writing all of that special-case code, so please only use this solution for simple fixes.
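
To illustrate the first limitation in the list above, here is a rough sketch of a pre-processing step that self-closes every empty element in a fragment instead of only the last one. The helper name closeVoidElements and the list of element names are assumptions made for this example; the list is not exhaustive, and the pattern will still trip over attribute values that contain a ">".

```javascript
// Hypothetical helper: self-close every empty ("void") element in a
// fragment. The element list below is an assumption, not a complete set.
var VOID_ELEMENTS = "img|br|hr|input|meta|link";

// Matches e.g. <img src='a'>, <br> or <br/>, capturing name and attributes
var VOID_TAG = new RegExp("<(" + VOID_ELEMENTS + ")\\b([^>]*?)\\s*/?>", "g");

function closeVoidElements(str) {
    // Rewrite each match as <name attributes />
    return str.replace(VOID_TAG, "<$1$2 />");
}

console.log(closeVoidElements("<img src='foo.jpg'> <img src='bar.jpg'>"));
```

Because the result ends in "/>", the closing-tag regex in the main function would leave it alone, so a step like this could plausibly run just before it. It does nothing for unclosed non-empty elements, which still need the special-case handling mentioned above.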

Ok, so now that that's out of the way - let's see how well this works in the different browsers.

                        Firefox 1.5+   Opera 9   Safari 2   Webkit (Safari 3)
Simple Text Insertion   Pass           Pass      Fail       Pass
Simple HTML Insertion   Pass           Pass      Fail       Pass
Google Adsense          Pass           Pass      Fail       Sort-of Fail

So here's the dirt on Safari. I spent many hours banging my head against the keyboard and finally admitted defeat in Webkit for Adsense and anything in Safari 2.0. Here are the issues:

  • Safari 2.0 completely rejects any attempts to use innerHTML in an XHTML document. It throws exceptions and simply will not let you do it. For this reason, Safari (as it is currently available) is a lost cause.
  • Webkit Nightlies (Safari 3.0), on the other hand, fixed the innerHTML problems - allowing it to work nearly flawlessly. You can see on the demo page (in a Webkit Nightly) that the Google Adsense IFrame is inserted into the page - and a URL is even requested - however the Adsense script seems to be fundamentally flawed. Looking at the URL generated for Webkit vs. the URL generated for Firefox or Opera, it is apparent that the Adsense script simply isn't working correctly. So while, technically, Adsense does not (currently) work in the Webkit Nightly with this hack, it seems like it's not through any fault of mine.

In all, this hack was an interesting experience - considering that every browser seems to behave in some sort of nonsensical fashion (in one way or another). I'm glad that there's, at least, a solution now for two of the major browsers (and possibly the next version of Safari too, after some more tinkering). I was, perhaps, most pleasantly surprised by Firefox's innerHTML/XHTML implementation. You feed it valid XHTML, it inserts it into the document. Any other value throws an exception. Very simple and logical.

As a side note: I'm going to try and feed some of this code back into jQuery, so that stuff like $(...).append("") will work as you might expect it to in the major browsers.

It's pretty obvious that writing XHTML documents with the preferred mimetype is still a ways off from real-world usage, however I'm more hopeful now than I was before - which is good, to say the least.

Tags: google, javascript, adsense, xhtml

Reboot!

So, I decided to give my site a quick redesign for the May 1st CSS Reboot. It should be noted that I’m not a web designer, but just a geeky programmer with a thing for standards-based design. I hope you enjoy my entry and my work.

When planning out the redesign of my site, I wanted to emphasize my different interests, without having a typical ‘boring’ blog and/or link list.

To do this, I grouped together all of my blog posts and links, sorted by date, then figured out when I had ‘spurts’ of interest in something. For example, I might be interested in Ruby for a couple days – but then may not revisit it for a couple weeks (which I attempt to emphasize on my site).

I also saw this as a great opportunity to show the power of the Javascript-enhanced Canvas element to draw my main pie chart. So no, that isn’t a gif or flash, it’s Javascript.

I also wanted to really emphasize the projects that I’m working on. I did this by adding a piece of constant navigation (in the right-hand sidebar) which always points to them.

So, I hope you like the new design – I’m fairly pleased with it. Some of the fonts in the blog posts could use some work, but other than that – I think it’s fine.

Happy Rebooting!

Tags: css, design, web, xhtml, html, reboot, cssreboot

Wanted: Javascript/Design Guru

I'm looking for a Javascript/Design Guru to help get some applications off the ground.

If you can look at Prototype (http://prototype.conio.net/) and say "That's cool, but I can do better!", or you can scoff at the designs on Stylegala (http://stylegala.com/) and CSS Vault (http://cssvault.com/) - then this job is for you!

Knowledge of the following is a must:
- Advanced Object Oriented Javascript
- Ajax (Asynchronous Javascript and XML)
- XHTML
- Advanced, Modern, Design Capabilities
- Advanced CSS

Big Plus:
- Adobe Illustrator
- Knowledge of Microformats
- Knowledge of Social Networks

If you are primarily Javascript OR Design-Oriented - let me know, as I may still have some work for you.

This job will be on a part-time/contract basis - most likely telecommuting. Pay will be determined by your skill level and previous experience.

Apply for this Job Here OR email me.

Tags: illustrator, html, xml, xhtml, design, css, jobs, ajax, javascript, adobe

Scope of Microformats

I've been doing a lot of work with Microformats, recently, but have hit a stumbling block: scoping. According to the reltag specification, scoping is possible:

rel="tag" is specifically designed for "tagging" content, typically web pages (or portions thereof, like blog posts)
Source: reltag

For example, here's a chunk of code borrowed from ideaShrub:

<li class="shrub">
  <h2><a href="">...</a></h2>
  <div class="date">...</div>
  <p class="desc">...</p>
  <p>
    <img src=""/>
    <strong>Tags:</strong>
    <a href="" rel="tag">ideashrub</a>
    <a href="" rel="tag">documentation</a>
  </p>
  ...
</li>

For this block, the appropriate scope for the two tags is within the 'li' element - but how can I specify that? For all some application knows, the 'scope' of those tags is within the 'p' element - or maybe the tags are related to the page as a whole. Why isn't this specified anywhere? How should scoping be handled - am I missing something?

On the other hand, if you look at the hCard microformat, they seem to be a little bit clearer by saying that a card is wrapped in:

<div class="vcard">...</div>

Which makes sense. Maybe there needs to be some sort of generic 'object' or 'item' microformat - you could use reltag, xfn, and even hcard all together to describe the object at hand - it just needs a proper scoping wrapper to make it possible. Should I be looking at RDF for this sort of issue, or am I just overlooking something?
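
To make the wrapper idea a little more concrete, here's a rough sketch of how a consuming application might resolve the scope of a tag if such a wrapper existed. Everything here is hypothetical: the "item" class name and the findTagScope function are invented for illustration and don't come from any microformat specification. Plain objects stand in for DOM nodes so the example runs on its own.

```javascript
// Hypothetical: "item" is an invented wrapper class, not part of any
// microformat specification.
function hasClass(node, name) {
    // Pad with spaces so "item" does not match "subitem"
    return (" " + (node.className || "") + " ").indexOf(" " + name + " ") !== -1;
}

// Walk up from a rel="tag" link to the nearest ancestor carrying the
// wrapper class; null means no wrapper was found.
function findTagScope(link, scopeClass) {
    for (var node = link.parentNode; node; node = node.parentNode) {
        if (hasClass(node, scopeClass)) return node;
    }
    return null;
}

// Plain objects standing in for the ideaShrub markup above, with the
// invented "item" class added to the li
var li = { className: "shrub item", parentNode: null };
var p = { className: "", parentNode: li };
var tag = { className: "", rel: "tag", parentNode: p };

console.log(findTagScope(tag, "item") === li);
```

With a convention like this, the 'p' element would be skipped and the 'li' picked up as the scope; with no wrapper in the ancestor chain, an application could fall back to treating the tag as applying to the whole page.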

Tags: reltag, microformat, microformats, xhtml
