Native JSON Support is Required


There’s a JavaScript feature that I feel needs much more attention: Native JSON support in browsers.

It’s something that should be a JavaScript language feature and yet no browser, or standards body, has defined, specified, or implemented what exactly it is, or should be. On top of this, it is one of the most (implicitly) requested language additions.

A recent ticket was opened for the Gecko engine, requesting a form of native JavaScript (de-)serialization whose API is based upon an implementation by Douglas Crockford. In effect, you’d be getting two additions to the JavaScript language: A .toJSONString() method on all objects (which will serialize that object to a string representation of itself) and a .parseJSON() method on all strings (which is responsible for deserializing a string representation of a JSON object).

I’d like to try and make the case for why I feel that native JSON support should be standardized and made part of the JavaScript language.

1) The recommended implementation is considered harmful.

There are, currently, two popular means of transferring and deserializing JSON data. Each has its own advantages and disadvantages. The two methods are:

  1. An implementation of JSON deserialization, by Douglas Crockford (which uses a JSON string, transferred using an XMLHttpRequest).
  2. And JSONP, which adds JavaScript markup around simple JSON data and is transferred (and deserialized) by injecting a <script> element into a document.
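
As a rough illustration of what “JavaScript markup around simple JSON data” means for JSONP, here is a minimal sketch. The helper wrapAsJSONP and the callback handleRecords are made-up names, and new Function stands in for the injected <script> element so that the example runs outside a browser.

```javascript
// Server side (hypothetical helper): wrap plain JSON text in a callback call.
function wrapAsJSONP(callbackName, jsonText) {
  return callbackName + '(' + jsonText + ');';
}

// Client side: the page defines the callback before the script arrives.
var received = null;
function handleRecords(data) {
  received = data;
}

var plainJSON = '{"records":[{"id":1},{"id":2}]}';
var jsonpBody = wrapAsJSONP('handleRecords', plainJSON);

// An injected <script> element would run this text as ordinary JavaScript.
// new Function stands in for that step here; nothing checks the text first.
new Function('handleRecords', jsonpBody)(handleRecords);
```

The response is executed rather than parsed, which is why JSONP deserializes quickly and also why it offers no protection against malicious code.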

Currently, Crockford’s json.js is the recommended means of deserializing JSON data. This has been discussed extensively elsewhere. json.js is better at keeping malicious code from being executed on the client (JSONP has no such protection), and thus is recommended for most use cases.

However, json.js has taken two serious blows lately, which have moved it from being “recommended” to “harmful”:

It’s possible to covertly extract data from json.js-deserialized JSON data.

Joe Walker recently exposed a vulnerability which allows malicious users to covertly extract information from JSON strings that are deserialized using JavaScript’s eval() function. json.js currently makes use of eval(), making it vulnerable to this particular attack. This vulnerability has been discussed elsewhere too.

In order to fix this, json.js would need to use an alternative means of parsing and serializing the JSON-formatted string – a means that would be considerably slower than the extremely fast eval() function.
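
To make the eval() dependency concrete, here is a minimal sketch of the general “check the text, then eval() it” approach. This is not json.js’s actual code: guardedEvalParse and its character check are my own assumptions, and the check is deliberately rough (it accepts some text that is not valid JSON).

```javascript
// Hypothetical helper, not json.js's actual implementation.
function guardedEvalParse(text) {
  // Drop string literals so their contents cannot confuse the check below.
  var withoutStrings = text.replace(/"(\\.|[^"\\])*"/g, '');

  // What remains may only contain JSON punctuation, digits, whitespace,
  // and the letters used by true, false, null, and exponents.
  if (!/^[\s,:{}\[\]0-9.\-+Eaeflnrstu]*$/.test(withoutStrings)) {
    throw new SyntaxError('guardedEvalParse: text does not look like JSON');
  }

  // Parentheses make eval() treat "{...}" as an expression, not a block.
  return eval('(' + text + ')');
}

var data = guardedEvalParse('{"test":true,"sample":"string"}');

var rejected = false;
try {
  guardedEvalParse('alert("x")'); // the "(" fails the character check
} catch (e) {
  rejected = true;
}
```

Even with a check in front, the final step still hands the text to the JavaScript interpreter, so anything that can change how the interpreter builds objects and arrays remains in play.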

It breaks the browser’s native for..in method of iterating over object properties.

At this point, it’s pretty safe to say that extending Object.prototype is considered harmful – and many users agree. Extending an Object’s prototype is generally considered reasonable for personal situations, but a publicly available, and highly recommended, script like json.js needs to behave in a user-conscious manner.

Some attempts were recently made at cleaning up how json.js behaved, but thus far, no considerable effort has been made to provide an alternative means of deserializing JSON strings that doesn’t break a JavaScript engine’s native behavior.
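
The for..in problem is easy to reproduce. The sketch below uses a stand-in for the library (the method body is a placeholder, not json.js’s code) and shows both the breakage and the defensive check that every caller is then forced to write.

```javascript
// Stand-in for a library that extends Object.prototype, as json.js does.
Object.prototype.toJSONString = function () {
  return '{}'; // placeholder body; irrelevant to the demonstration
};

var record = { test: true, sample: 'string' };

// A plain for..in loop now also visits the inherited method.
var seen = [];
for (var key in record) {
  seen.push(key);
}

// The workaround: filter out everything that is not an own property.
var own = [];
for (var name in record) {
  if (Object.prototype.hasOwnProperty.call(record, name)) {
    own.push(name);
  }
}

// Remove the extension again so the sketch leaves the environment clean.
delete Object.prototype.toJSONString;
```

After this runs, seen contains "toJSONString" alongside the two real properties, while own contains only "test" and "sample". A method provided natively by the engine would not be enumerable and so would not show up in either loop.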

Summary: By adding support for JSON within a browser both of these issues will be completely circumvented (malicious users won’t be able to extract data from a JSON structure, nor will the parseJSON and toJSONString methods be able to break for..in loops).

2) The recommended method of deserialization doesn’t scale.

Let’s start by looking at the two most popular methods of transferring JSON data (JSONP and an XMLHttpRequest transferring plain JSON), along with using an XMLHttpRequest to transfer some XML data (just for fun).

I’ve set up a series of tests that we can use to analyze the speed and efficiency of traditional JSON (using json.js), JSONP, and XML. I made 3 sets of files each containing a set of data records. As a base, I used some XML data from W3Schools. In the end, I came up with a total of 24 test files, each with a different number of records, in each specified format.

All of these get referenced from my mini test suites (one for each data type): JSON, JSONP, XML

These suites default to requesting the 50-record file 100 times, and taking an average reading. To get a reading on a different recordset, visit the file with the number of records in the URL, e.g.: json.html?400. Please don’t run this on my server, it’ll make it cry.
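
The measuring logic amounts to something like the following minimal sketch. It is not the code from the test suites: averageTime, recordCountFromQuery, and fakeRequest are hypothetical names, and the task here is synchronous, whereas the real suites have to wait on an XMLHttpRequest or a <script> load.

```javascript
// Read the record count from a query string such as "?400"; default to 50.
function recordCountFromQuery(search) {
  var count = parseInt(search.replace(/^\?/, ''), 10);
  return isNaN(count) ? 50 : count;
}

// Run `task` `runs` times and return the mean duration in milliseconds.
function averageTime(task, runs) {
  var total = 0;
  for (var i = 0; i < runs; i++) {
    var start = Date.now();
    task();
    total += Date.now() - start;
  }
  return total / runs;
}

// Stand-in for "request and deserialize one file".
var calls = 0;
function fakeRequest() {
  calls++;
}

var mean = averageTime(fakeRequest, 100);
```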

Instead, all of the test data and files can be downloaded here.

Note: I’ve only run these tests in Firefox, so caveat emptor.

Data Records by Time (in seconds)

Right off the bat, we can see two things:

  1. Transferring and de-serializing JSON data scales better than doing the same for equivalent XML data.
  2. JSONP scales better than the (more secure) XHR-requested, json.js-deserialized, JSON method.

Looking at the numbers for the recommended means of transferring JSON data, we don’t get a full picture. Where are the scalability issues coming from? Maybe XMLHttpRequests don’t scale so well? (Which would also help to explain the numbers for the XML transfers.)

To resolve this, let’s break down the numbers for JSON into time spent processing and time spent transferring the data.
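
One way to isolate the processing half is to time deserialization on text that is already in memory, so that no transfer is involved at all. The sketch below assumes made-up records and record counts, and uses a bare eval() as a stand-in for the json.js parsing step.

```javascript
// Build a JSON string holding `count` made-up records.
function buildRecords(count) {
  var parts = [];
  for (var i = 0; i < count; i++) {
    parts.push('{"id":' + i + ',"title":"Record ' + i + '"}');
  }
  return '[' + parts.join(',') + ']';
}

// Time only the deserialization step; the text is already local,
// so transfer time is excluded entirely.
function timeProcessing(text) {
  var start = Date.now();
  var data = eval('(' + text + ')');
  return { records: data.length, ms: Date.now() - start };
}

// Assumed record counts, chosen for illustration only.
var results = [50, 100, 200, 400, 800, 1600].map(function (count) {
  return timeProcessing(buildRecords(count));
});
```

Subtracting these processing times from the total request times leaves the transfer cost.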

Data Records by Time (in seconds)

Here is where we see the major numbers come out. We can see that the processing time increases at a pace slightly less than O(n) time. This might be fine for most cases; however, we can clearly see (from the previous chart) that JSONP is fully capable of faster parsing times.

Additionally, the processing time that json.js takes is completely blocking – no other operation is able to take place while the deserialization is taking place. When running the test suites you’ll find that when the larger (200-1600) record sets are processed your browser will stall (and if you’re on a Mac, you’ll get the spinner of death). By passing this complete operation off to the browser you’ll avoid all of these complications.

Summary: By adding native JSON support within a browser you would have the blazing speed and scalability of JSONP, without adding any significant overhead.

Side Discussion:

In addition to studying the scalability of transferring JSON data, I’ve also looked at the overhead costs of pushing JSON data into an HTML structure (especially when compared to XML-formatted data).

Currently, some browsers have a native means of processing and converting XML data structures on the fly, using XSLT. XSLT is an incredibly powerful templating language and is more than capable of transforming any XML structure into an appropriate XHTML data structure.

However, for JSON data, no killer-templating system exists. I’ve used JSONT extensively but, in reality, it doesn’t hold a candle to XSLT. (Both in terms of features and speed.)

I have two test suites: one for JSONP + JSONT and another for XML + XSLT. You can download the full suite here.

Data Records by Time (in seconds)

You can see that, even with the extra speed advantages of JSONP, all of that lead is blown away by the incredibly slow nature of JSONT. Unfortunately, the situation isn’t as clear-cut here as it was comparing JSONP and json.js, considering that JSONT is hardly a worthy replacement for XSLT.

Summary: In order for JSON to be, at all, competitive with XML/XSLT, there has to be a native means of transforming JSON data structures into strings (or DOM fragments), that can then be injected into an HTML document.
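
To show the kind of transformation I mean, here is a minimal sketch that turns JSON records into an HTML string. It is neither JSONT nor a proposed API: the {name} placeholder syntax, the function names, and the sample records are all made up for illustration.

```javascript
// Escape text so that record values cannot inject markup.
function escapeHTML(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Hypothetical template syntax: {name} is replaced by record[name].
function renderRecord(template, record) {
  return template.replace(/\{(\w+)\}/g, function (match, name) {
    return Object.prototype.hasOwnProperty.call(record, name)
      ? escapeHTML(record[name])
      : '';
  });
}

// Render every record with the same template and join the results.
function renderAll(template, records) {
  var html = [];
  for (var i = 0; i < records.length; i++) {
    html.push(renderRecord(template, records[i]));
  }
  return html.join('');
}

var rows = renderAll('<li>{title} ({year})</li>', [
  { title: 'First <record>', year: 2007 },
  { title: 'Second', year: 2006 }
]);
```

All of this work happens in script, one replace() call at a time, which is exactly the overhead that a native transform would remove.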

3) Upcoming standards imply its existence.

There are two upcoming standards that require some form of JavaScript object serialization (either explicitly, or implicitly).

The first, JSONRequest, (unsurprisingly, also from Douglas Crockford) is a means of securely transferring JSON data cross-domain. However, in order to implement this feature, some form of native JSON deserialization will need to be implemented within the browser.

I know that Mozilla is considering implementing this feature, as is Microsoft, in Internet Explorer. Hopefully, this will mean that the two biggest browsers will have an implementation out quite soon.

The second is the new DOM Storage standard introduced by the WHATWG. While this feature does not require a form of JavaScript object serialization, it does only allow data to be stored as strings (and in order to make the most efficient use of that storage area, a form of object serialization will be needed).
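
Here is a minimal sketch of why string-only storage pushes you toward serialization. The storage object is a stand-in for a DOM Storage area so that the example runs outside a browser, and JSON.stringify and JSON.parse are an assumption: they stand in for whatever native serializer the engine provides.

```javascript
// Stand-in for a DOM Storage area: like the real thing, it keeps strings only.
var storage = {
  items: {},
  setItem: function (key, value) {
    this.items[key] = String(value);
  },
  getItem: function (key) {
    return Object.prototype.hasOwnProperty.call(this.items, key)
      ? this.items[key]
      : null;
  }
};

var settings = { test: true, sample: 'string' };

// Without serialization the structure is lost on the way in.
storage.setItem('raw', settings);
var lost = storage.getItem('raw'); // "[object Object]"

// With serialization the object survives the round trip.
// Assumption: the engine provides JSON.stringify and JSON.parse.
storage.setItem('settings', JSON.stringify(settings));
var restored = JSON.parse(storage.getItem('settings'));
```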

However, the fact that two upcoming pseudo-standards need some form of object serialization requires us to re-examine the proposed API. It is imperative that it be rock-solid before being implemented in multiple platforms.

Here is the (rough) current API proposed by Crockford’s json.js:

Object.prototype.toJSONString

  var foo = { test: true, sample: "string" };
  foo.toJSONString();
  >> '{"test":true,"sample":"string"}'

Array.prototype.toJSONString

  var foo = [ true, "string", 5 ];
  foo.toJSONString();
  >> '[true,"string",5]'

Boolean.prototype.toJSONString
String.prototype.toJSONString
Number.prototype.toJSONString

  (5).toJSONString();
  >> '5'
  "test".toJSONString();
  >> '"test"'
  true.toJSONString();
  >> 'true'

However, there’s still a lot of ambiguity in this particular API – especially when it comes to aspects of the language that aren’t perfectly serializable.

Function and RegExp

Note that the parseJSON method (rightfully) balks at extracting a Function or a RegExp:

  "function(){}".parseJSON();
  ERROR: parseJSON: undefined

  "/foo/".parseJSON();
  ERROR: parseJSON: undefined

While it happily serializes either using toJSONString() (with varying results):

  function test(){}
  test.toJSONString()
  >> '{"prototype":{}}'

  /foo/.toJSONString()
  >> '{}'

null

Also, note that while parsing a serialized ‘null’ gives you the correct value back:

  "null".parseJSON()
  >> null

it is unable to convert a null into its serialized form:

  var foo = null;
  foo.toJSONString()
  ERROR: foo has no properties

A very important point needs to be made here: Some form of definition and specification should be made regarding this language addition. And soon. Even if it’s nothing other than defining that the above behavior is correct – that should be specified somewhere for all browser developers to follow.
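
To show what pinning these cases down could look like, here is a minimal sketch of one possible set of answers. It is not json.js and not any proposal: serialize and quote are hypothetical names, and the choice to throw on Function and RegExp values is purely an assumption for illustration. Because it is a plain function rather than a method, it can accept null.

```javascript
// Quote a string, escaping backslashes, quotes, and control characters.
function quote(text) {
  return '"' + text.replace(/[\\"\u0000-\u001f]/g, function (ch) {
    if (ch === '"' || ch === '\\') {
      return '\\' + ch;
    }
    return '\\u' + ('0000' + ch.charCodeAt(0).toString(16)).slice(-4);
  }) + '"';
}

// Hypothetical free function; the rejected cases are an assumption.
function serialize(value) {
  if (value === null) {
    return 'null'; // works even though null has no methods
  }
  switch (typeof value) {
    case 'boolean':
      return String(value);
    case 'number':
      if (!isFinite(value)) {
        throw new TypeError('serialize: number has no JSON form');
      }
      return String(value);
    case 'string':
      return quote(value);
    case 'object':
      break;
    default: // functions and undefined
      throw new TypeError('serialize: unsupported type ' + typeof value);
  }
  if (value instanceof RegExp) {
    throw new TypeError('serialize: RegExp has no JSON form');
  }
  var parts = [];
  if (Object.prototype.toString.call(value) === '[object Array]') {
    for (var i = 0; i < value.length; i++) {
      parts.push(serialize(value[i]));
    }
    return '[' + parts.join(',') + ']';
  }
  for (var key in value) {
    if (Object.prototype.hasOwnProperty.call(value, key)) {
      parts.push(quote(key) + ':' + serialize(value[key]));
    }
  }
  return '{' + parts.join(',') + '}';
}
```

Whether Function and RegExp values should throw, be skipped, or serialize to something else is exactly the kind of decision that needs to be written down once, rather than left to each implementation.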

Summary: The current, recommended, implementation of JSON parsing and serialization is harmful and slow. Additionally, upcoming standards imply that a native JSON (de-)serializer already exists. Therefore, browsers should be seriously looking at defining a standard for native JSON support and, upon completion, implement it quickly and broadly.

To get the ball rolling, I recommend that you vote up the Mozilla ticket on the implementation, to try and get some critical eyes looking at this feature, making sure that it’s completely examined and thought through; and included in a browser as soon as possible.


Update: I located the specification that defines support for toJSONString/parseJSON in ECMAScript 4. This is great news. Now that the definition for this feature is happily moving along, it’s just time to get some implementations going!

Posted: March 6th, 2007

