Blog


The State of JSON

I wanted to pull together some of the recent events that have occurred, related to native JSON support within a web browser, that should be of importance to many web developers. This should serve as a sort-of follow-up to my previous post: Native JSON Support is Required.

Early API Standardization Attempts - Last year, a number of attempts were made by the ECMAScript language committee to standardize an API for JSON encoding and decoding, within the language. A few API proposals were examined and discussed, most based upon Crockford's proposal, but no general consensus was reached. Some general issues with the proposed API were brought forward (similar to those mentioned in my previous post).

JSON2.js - Late last year Crockford quietly released a new version of his JSON API that replaced his existing API. The important difference was that it used a single base object (JSON) instead of extending all native object prototypes (booo!). Its revised API worked as such:

JSON.stringify({name: "John", location: "Boston"});
// => '{"name":"John","location":"Boston"}'
JSON.parse('{"name":"John","location":"Boston"}');
// => {name: "John", location: "Boston"}
 

This version of JSON.js is highly recommended. If you're still using the old version, please, please upgrade (this one will undoubtedly cause fewer issues than the previous one).

Dispersion - Even with a newly-proposed API, criticism concerning the inclusion of JSON in the ECMAScript language came to a questionable conclusion. Some implementors weren't interested in including it, others were, and yet others couldn't decide on the final API or method names. It was then decided that implementing JSON support in an ECMAScript implementation would be left up to the implementors themselves (unless, of course, some other conclusion is reached in the future).

Mozilla Implements Native JSON - Mozilla was the first to implement native JSON support within its browser. Note, however, that this is not a web-page-accessible API but an API that's usable from within the browser itself (and by extensions). This was the first step needed to implement the API for further use.

Here is an example of it in use (works within an extension, for example):

var nativeJSON = Components.classes["@mozilla.org/dom/json;1"]
    .createInstance(Components.interfaces.nsIJSON);
nativeJSON.encode({name: "John", location: "Boston"});
// => '{"name":"John","location":"Boston"}'
nativeJSON.decode('{"name":"John","location":"Boston"}');
// => {name: "John", location: "Boston"}
 

Web-accessible Native JSON - The final, and most important, step is being worked on right now - a way to access native JSON encoding and decoding from web pages. How it'll be exposed is still up for debate (as having its name conflict with an existing object would be a really bad thing). Regardless, there should be something within the browser by the time the Firefox 3 betas wrap up.

What's important about this is that it's really not a case of "Oh well, guess we'll have to wait for other browsers to implement this." Since this is a native implementation (and, thus, very-very fast) existing JSON encoding/decoding libraries can just check for the existence of this particular set of functions and use them directly - gaining a direct, and immediate, performance boost for Firefox users. The same principle applies to features like getElementsByClassName, since they're available in normal JavaScript code, but are insanely fast when implemented directly by a browser.
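As a rough sketch of that detection step: the web-facing name of the native API isn't settled, so this assumes it ends up exposed as a `JSON` object with `parse` and `stringify` methods (the same shape as JSON2.js). The names `pickJSON` and `scriptJSON` are made up for this example, and the eval-based fallback is an unvalidated stand-in for a real script library.

```javascript
// Return the native-style JSON object if `globalObj` has one, else `fallback`.
// `pickJSON` is a hypothetical helper name, not part of any library.
function pickJSON(globalObj, fallback) {
  var candidate = globalObj && globalObj.JSON;
  var hasNative = !!candidate &&
    typeof candidate.parse === "function" &&
    typeof candidate.stringify === "function";
  return hasNative ? candidate : fallback;
}

// Stand-in for a script-based library; a real one would validate the
// text before handing it to eval().
var scriptJSON = {
  parse: function (text) {
    return eval("(" + text + ")");
  },
  stringify: function () {
    throw new Error("not implemented in this sketch");
  }
};

// An environment with a native-style object, and one without.
var withNative = { JSON: { parse: function () {}, stringify: function () {} } };
var chosenNative = pickJSON(withNative, scriptJSON);
var chosenFallback = pickJSON({}, scriptJSON);
```

In a page, the first argument would be `window`; a library would run this check once at load time and route every encode/decode call through whichever object came back.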

Tags: json, javascript

Re-Securing JSON

Back in March/April of this year there was a lot of hubbub concerning the discovery of a JSON data leak, of sorts. What it boils down to is "JavaScript is incredibly flexible, even to the degree of letting you redefine basic objects, like Array or Object itself."

For example, here's an exploit that works in Firefox 2, Opera 9, and Safari 3. It goes about redefining the global Array object then making it such that whenever a property value is set (even when the array is constructed!) the value is alerted out. In theory, a malicious script could use this technique to swipe data transmitted in JSON (via JSONP or even via an XHR+eval) and send it back to another server.

// From Joe Walker
function Array() {
  var obj = this;
  var ind = 0;
  var getNext = function(x) {
    obj[ind++] setter = getNext;
    if (x) alert("Data stolen from array: " + x.toString());
  };
  this[ind++] setter = getNext;
}
var a = ["private stuff"];
// alert("Data stolen from array: private stuff");
 

Around the time of that commotion, a bug was filed in the Mozilla bug tracker that began to explore ways of fixing this issue. It was eventually decided that this was a specification issue and that global objects should not be able to be redefined, due to the inherent problems that they can cause. You can read more about this change, which will be a part of ECMAScript 4/JavaScript 2, in Section 1.4 of the ECMAScript 4 Incompatibilities paper [PDF].

To set about testing this new change, and bringing it into practice sooner rather than later, the Mozilla team implemented and committed a fix to be a part of Firefox 3 (and thus, JavaScript 1.8). Well, that change landed last week and, after a couple of minor fires were put out, it made it into the final release of Firefox 3, Beta 2.

If you want to see the change in action, go and download a Firefox nightly and put something like this in the console:

function Array(){
  alert("hello, I found something of yours!");
}
// ERROR: redeclaration of const Array
 

You'll note that you now get the above error. This will also be the same for the following global objects:

Thus, if you attempt to redeclare any of those global objects (like I did above) you'll get the same error. Note that extending properties or prototypes of those objects has remained unchanged (it still works just fine) and this is a change that really shouldn't affect anyone (save for the malicious types!).
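To illustrate the part that keeps working, here is a small sketch of extending a built-in rather than redeclaring it. The names `last` and `fromArguments` are invented for this example and are not part of any browser API.

```javascript
// Adding a method to a built-in prototype is unaffected by the change.
// `last` is a made-up method name.
Array.prototype.last = function () {
  return this[this.length - 1];
};

// Adding a property to the constructor itself also still works.
// `fromArguments` is likewise hypothetical.
Array.fromArguments = function (args) {
  return Array.prototype.slice.call(args);
};

var cities = ["Boston", "Rochester"];
var lastCity = cities.last(); // "Rochester"
```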

As always, should you spot something tricky, please feel free to file a follow-up bug to the original one (or if you need help localizing it and reproducing it, let me know).

Tags: xss, json, javascript, firefox, mozilla, security

Native JSON Support is Required

There's a JavaScript feature that I feel needs much more attention: Native JSON support in browsers.

It's something that should be a JavaScript language feature and yet no browser, or standards body, has defined, specified, or implemented what exactly it is, or should be. On top of this, it is one of the most (implicitly) requested language additions.

A recent ticket was opened for the Gecko engine, requesting a form of native JavaScript (de-)serialization whose API is based upon an implementation by Douglas Crockford. In effect, you'd be getting two additions to the JavaScript language: A .toJSONString() method on all objects (which will serialize that object to a string representation of itself) and a .parseJSON() method on all strings (which is responsible for deserializing a string representation of a JSON object).

I'd like to try and make the case for why I feel that native JSON support should be standardized and made part of the JavaScript language.

1) The recommended implementation is considered harmful.

There are, currently, two popular means of transferring and deserializing JSON data. Each has its own advantages and disadvantages. The two methods are:

  1. An implementation of JSON deserialization, by Douglas Crockford (which uses a JSON string, transferred using an XMLHttpRequest).
  2. And JSONP, which adds JavaScript markup around simple JSON data and is transferred (and deserialized) by injecting a <script> element into a document.
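The difference between the two response formats can be sketched without a network. This is a simplified illustration: the callback name `handleData` is made up, `eval()` stands in for what json.js does after its checks, and `new Function` stands in for the browser executing an injected <script> element.

```javascript
// Method 1: the XMLHttpRequest response body is plain JSON text,
// which the page has to deserialize itself.
var jsonBody = '{"name":"John","location":"Boston"}';
var parsed = eval("(" + jsonBody + ")");

// Method 2: the JSONP response body is a script that calls a function
// the page has already defined. `handleData` is a hypothetical name.
var jsonpBody = 'handleData({"name":"John","location":"Boston"});';

var received = null;
function handleData(data) {
  received = data;
}

// A <script> element would run jsonpBody in the page; here we run the
// same text directly to show that no separate parsing step is needed.
new Function("handleData", jsonpBody)(handleData);
```

In both cases the page ends up with an equivalent object; what differs is who executes the text, and how much control the page has over it before that happens.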

Currently, Crockford's json.js is the recommended means of deserializing JSON data. This has been discussed extensively elsewhere. json.js is better at keeping malicious code from being executed on the client (JSONP has no such protection), and thus is recommended for most use cases.

However, json.js has taken two serious blows lately, which have moved it from being "recommended" to "harmful":

It's possible to covertly extract data from json.js-deserialized JSON data.

Joe Walker recently exposed a vulnerability which allows malicious users to covertly extract information from JSON strings that are deserialized using JavaScript's eval() statement. json.js currently makes use of eval(), making it vulnerable to this particular attack. This vulnerability has been discussed elsewhere too.

In order to fix this, json.js would need to use an alternative means of parsing and serializing the JSON-formatted string - a means that would be considerably slower than the extremely fast eval().

It breaks the browser's native for..in method of iterating over object properties.

At this point, it's pretty safe to say that extending Object.prototype is considered harmful - and many users agree. Extending an Object's prototype is generally considered reasonable for personal situations, but a publicly available, and highly recommended, script like json.js needs to behave in a user-conscious manner.
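Here is a minimal sketch of the breakage. The `toJSONString` body below is a placeholder, not Crockford's implementation; only the fact that it lives on Object.prototype matters.

```javascript
// Stand-in for what the old json.js does: add a method to every object.
Object.prototype.toJSONString = function () {
  return "{}"; // placeholder body
};

var record = { name: "John", location: "Boston" };

// A plain for..in loop now sees the inherited method as well.
var keys = [];
for (var key in record) {
  keys.push(key);
}
// keys => ["name", "location", "toJSONString"]

// The usual workaround is a hasOwnProperty check on every iteration.
var ownKeys = [];
for (var key in record) {
  if (Object.prototype.hasOwnProperty.call(record, key)) {
    ownKeys.push(key);
  }
}
// ownKeys => ["name", "location"]

// Clean up so the rest of the page is unaffected.
delete Object.prototype.toJSONString;
```

The workaround is fine in code you control, but every other script on the page that iterates objects without the check is affected too.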

Some attempts were recently made at cleaning up how json.js behaved, but thus far, no considerable effort has been made to provide an alternative means of deserializing JSON strings, that doesn't break a JavaScript engine's native behavior.

Summary: By adding support for JSON within a browser both of these issues will be completely circumvented (malicious users won't be able to extract data from a JSON structure, nor will the parseJSON and toJSONString methods be able to break for..in loops).

2) The recommended method of deserialization doesn't scale.

Let's start by looking at the two most popular methods of transferring JSON data (JSONP and an XMLHttpRequest transferring plain JSON), along with using an XMLHttpRequest to transfer some XML data (just for fun).

I've set up a series of tests that we can use to analyze the speed and efficiency of traditional JSON (using json.js), JSONP, and XML. I made 3 sets of files each containing a set of data records. As a base, I used some XML data from W3Schools. In the end, I came up with a total of 24 test files, each with a different number of records, in each specified format.

All of these get referenced from my mini test suites (one for each data type): JSON, JSONP, XML

These suites default to requesting the 50-record file 100 times, and taking an average reading. To get a reading on a different recordset, visit the file with the number of records in the URL, e.g.: json.html?400. Please don't run this on my server, it'll make it cry.

Instead, all of the test data and files can be downloaded here.

Note: I've only run these tests in Firefox, so caveat emptor.

Data Records by Time (in seconds)

Right off the bat, we can see two things:

  1. Transferring and de-serializing JSON data scales better than doing the same for equivalent XML data.
  2. JSONP scales better than the (more secure) XHR-requested, json.js-deserialized, JSON method.

Looking at the numbers for the recommended means of transferring JSON data, we don't get a full picture. Where are the scalability issues coming from? Maybe XMLHttpRequests don't scale so well? (Which would also help to explain the numbers for the XML transfers.)

To resolve this, let's break down the numbers for JSON into time spent processing and time spent transferring the data.
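The processing half of that breakdown can be measured roughly like this. This is a sketch, not the actual test suite: `averageParseTime` and `makeRecords` are hypothetical helpers, the generated records are made-up data rather than the original test files, and a bare `eval()` stands in for the deserializer.

```javascript
// Average the time (in milliseconds) that `deserialize(text)` takes
// over `runs` calls.
function averageParseTime(text, deserialize, runs) {
  var start = new Date().getTime();
  for (var i = 0; i < runs; i++) {
    deserialize(text);
  }
  return (new Date().getTime() - start) / runs;
}

// Build a JSON array of `count` simple records as a string.
function makeRecords(count) {
  var records = [];
  for (var i = 0; i < count; i++) {
    records.push('{"id":' + i + ',"title":"Record ' + i + '"}');
  }
  return "[" + records.join(",") + "]";
}

// 50 records, 100 runs, mirroring the suite's defaults.
var payload = makeRecords(50);
var avgMs = averageParseTime(payload, function (text) {
  return eval("(" + text + ")");
}, 100);
```

Transfer time would then be the total request time minus this figure; repeating the run for each record count shows how the processing cost grows.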

Data Records by Time (in seconds)

Here is where the major numbers come out. Processing time grows slightly slower than linearly with the number of records (a little under O(n)). This might be fine for most cases; however, we can clearly see (from the previous chart) that JSONP is fully capable of faster parsing times.

Additionally, the processing time that json.js takes is completely blocking - no other operation is able to take place while the deserialization is running. When running the test suites you'll find that when the high (200-1600) record sets are processed your browser will stall (and if you're on a Mac, you'll get the spinner of death). By passing this complete operation off to the browser you'll avoid all of these complications.

Summary: By adding native JSON support within a browser you would have the blazing speed and scalability of JSONP, without adding any significant overhead.

Side Discussion:

In addition to studying the scalability of transferring JSON data, I've also looked at the overhead costs of pushing JSON data into an HTML structure (especially when compared to XML-formatted data).

Currently, some browsers have a native means of processing and converting XML data structures on the fly, using XSLT. XSLT is an incredibly powerful templating language and is more than capable of transforming any XML structure into an appropriate XHTML data structure.

However, for JSON data, no killer-templating system exists. I've used JSONT extensively but, in reality, it doesn't hold a candle to XSLT. (Both in terms of features and speed.)

I have two test suites: one for JSONP + JSONT and another for XML + XSLT. You can download the full suite here.

Data Records by Time (in seconds)

You can see that, even with the extra speed advantages of JSONP, all of that lead is blown away by the incredibly slow nature of JSONT. Unfortunately, the situation isn't as clear-cut here as it was when comparing JSONP and json.js, considering that JSONT is hardly a worthy replacement for XSLT.

Summary: In order for JSON to be, at all, competitive with XML/XSLT, there has to be a native means of transforming JSON data structures into strings (or DOM fragments), that can then be injected into an HTML document.
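To make "transforming JSON data structures into strings" concrete, here is a deliberately tiny sketch of the idea. It is not JSONT's syntax or implementation; the `{key}` placeholder format and the names `renderTemplate` and `escapeHTML` are assumptions made up for this example.

```javascript
// Escape characters that are significant in HTML.
function escapeHTML(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Replace each {key} in `template` with the escaped value of data[key].
// Unknown keys are replaced with an empty string.
function renderTemplate(template, data) {
  return template.replace(/\{(\w+)\}/g, function (match, key) {
    return Object.prototype.hasOwnProperty.call(data, key)
      ? escapeHTML(String(data[key]))
      : "";
  });
}

var item = { title: "Tom & Jerry", artist: "Various" };
var html = renderTemplate("<li>{title} by {artist}</li>", item);
// html => "<li>Tom &amp; Jerry by Various</li>"
```

Doing this in script, once per record, is exactly the kind of per-item work that a native transform would take off the page's hands.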

3) Upcoming standards imply its existence.

There are two upcoming standards that require some form of JavaScript object serialization (either explicitly, or implicitly).

The first, JSONRequest (unsurprisingly, also from Douglas Crockford), is a means of securely transferring JSON data cross-domain. However, in order to implement this feature, some form of native JSON deserialization will need to be implemented within the browser.

I know that Mozilla is considering implementing this feature, as is Microsoft, in Internet Explorer. Hopefully, this will mean that the two biggest browsers will have an implementation out quite soon.

The second is the new DOM Storage standard introduced by the WHATWG. While this feature does not require a form of JavaScript object serialization, it does only allow data to be stored as strings (and in order to make the most efficient use of that storage area, a form of object serialization will be needed).
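A short sketch of why string-only storage pushes you toward serialization. `StringStore` is a made-up stand-in for a storage area (it only mimics the string coercion), and the example assumes a serializer with the `JSON.stringify`/`JSON.parse` shape is available.

```javascript
// Stand-in for a storage area that, like DOM Storage, only keeps strings.
function StringStore() {
  this.items = {};
}
StringStore.prototype.setItem = function (key, value) {
  this.items[key] = String(value);
};
StringStore.prototype.getItem = function (key) {
  return Object.prototype.hasOwnProperty.call(this.items, key)
    ? this.items[key]
    : null;
};

var store = new StringStore();
var user = { name: "John", location: "Boston" };

// Storing the object directly loses the data: it becomes "[object Object]".
store.setItem("raw", user);

// Serializing first keeps it, and it can be restored later.
store.setItem("user", JSON.stringify(user));
var restored = JSON.parse(store.getItem("user"));
// restored.location => "Boston"
```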

However, the fact that two upcoming pseudo-standards need some form of object serialization requires us to re-examine the proposed API. It is imperative that it be rock-solid before being implemented in multiple platforms.

Here is the (rough) current API proposed by Crockford's json.js:

Object.prototype.toJSONString

var foo = { test: true, sample: "string" };
foo.toJSONString();
>> '{"test":true,"sample":"string"}'

Array.prototype.toJSONString

var foo = [ true, "string", 5 ];
foo.toJSONString();
>> '[true,"string",5]'

Boolean.prototype.toJSONString
String.prototype.toJSONString
Number.prototype.toJSONString

(5).toJSONString();
>> '5'
"test".toJSONString();
>> '"test"'
true.toJSONString();
>> 'true'

However, there's still a lot of ambiguity in this particular API - especially when it comes to aspects of the language that aren't perfectly serializable.

Function and RegExp

Note that the parseJSON method (rightfully) balks at extracting a Function or a RegExp:

"function(){}".parseJSON();
ERROR: parseJSON: undefined
"/foo/".parseJSON();
ERROR: parseJSON: undefined

While it happily serializes either using toJSONString() (with varying results):

function test(){}
test.toJSONString()
>> "{"prototype":{}}"
/foo/.toJSONString()
>> "{}"

null

Also, note that while parsing a serialized 'null' gives you the correct value back:

"null".parseJSON()
>> null

it is unable to convert a null into its serialized form:

var foo = null;
foo.toJSONString()
ERROR: foo has no properties
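For comparison, a function-style API has no receiver object, so null is not a special case. A small sketch, assuming a `JSON.stringify` with the shape used by JSON2.js is available:

```javascript
var foo = null;

// Function style: the value is an argument, so null serializes fine.
var viaFunction = JSON.stringify(foo); // 'null'

// Method style: there is no object to call the method on.
var viaMethod;
try {
  viaMethod = foo.toJSONString();
} catch (e) {
  viaMethod = "ERROR: " + e.name; // "ERROR: TypeError"
}
```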

A very important point needs to be made here: Some form of definition and specification should be made regarding this language addition. And soon. Even if it's nothing other than defining that the above behavior is correct - that should be specified somewhere for all browser developers to follow.

Summary: The current, recommended, implementation of JSON parsing and serialization is harmful and slow. Additionally, upcoming standards imply that a native JSON (de-)serializer already exists. Therefore, browsers should be seriously looking at defining a standard for native JSON support, and upon completion implement it quickly and broadly.

To get the ball rolling, I recommend that you vote up the Mozilla ticket on the implementation, to try and get some critical eyes looking at this feature, making sure that it's completely examined and thought through, and included in a browser as soon as possible.


Update: I located the specification that defines support for toJSONString/parseJSON in ECMAScript 4. This is great news. Now that the definition for this feature is happily moving along, it's just time to get some implementations going!

Tags: browsers, mozilla, xml, firefox, jsonp, javascript, json

JSON and RSS

I've been bit by the JSON bug. For a long time now, I've simply shrugged it off as 'Why not just use XML, it's parsable by most languages anyway.' However, once I started playing around with the del.icio.us JSON interface, then the Google Homepage API, and finally with the new Yahoo! JSON API - I realized that they were really on to something. The major benefits are immediately apparent:

  • It's incredibly lightweight - there's almost no extra markup, which keeps the data transfers nice and small.
  • There's very little overhead needed to parse it, since it's pure JavaScript to begin with (and a number of other languages can either handle it as-is, or have a module to parse it).
  • And, probably most importantly, you can use it in a cross-domain environment. This works because you can execute remote JavaScript (aka a JSON object), no matter what domain you're coming from. You can now completely skip the proxies that were previously necessary (for XML).

So, this brings me to my first project using JSON - a fast RSS to JSON Convertor. You simply plug in the URL of the RSS (or Atom) feed that you wish to convert - and you'll have a nice, pliable JSON object to work with. I cache all retrieved files every hour, to save on bandwidth, so please be aware of that. It also supports the addition of callbacks, making it easy to use in your program, right out of the box. If you're interested in seeing a demo, along with some sample code (and the code of the convertor), feel free to visit the project page.

The nice thing about having an RSS to JSON Convertor is that you can now convert any RSS source and play with it instantly - for example, your Google Search History, the Latest TV Listings, or even your POP Email Account. The possibilities are endless. I can't wait to play with this some more.

Tags: json, rss, javascript, programming, xml

Google Homepage API

Yesterday, I sat down and played around with the new Google Homepage API, which is interesting, in and of itself. I found the development to be most like developing a widget (for Dashboard or Konfabulator).

A couple observations:

  • By default, your module is contained within a fixed height IFrame, but it's possible to actually embed your widget straight into the Google Homepage itself.
  • My first worry was over the possibility of XSS attacks, but all modules run on a different domain, gmodules.com. (I'm not sure what happens if you embed it in the page; my guess is that they're far more restrictive if you want your module to run free like that.)
  • They have a server-side proxy that's on the same domain as the modules - which means that you can do cross-domain XMLHttpRequests - a very smart move (at least from a developer's perspective, not sure about security, though).

My first test module is rather simple: it's just the current list of links from del.icio.us popular, auto-updating every hour. To run it for yourself, go to your Google Homepage, click the 'Add Content' link and enter the following URL into the 'Create A Section' textfield:
http://ejohn.org/apps/igdel/
If you're worried about running foreign modules on your homepage, you can feel free to look at the source code - it's completely harmless.

The majority of the code, for the frontend of the module, was borrowed from two places:

The final bit, that made this module work, I'll discuss tomorrow - it's a dynamic RSS to JSON convertor, that's incredibly cool. (If you're feeling adventurous, you can look at the module source code and find it hidden in there.)

Tags: javascript, json, programming, api, homepage, google, rss
