Ruby VM in JavaScript

Welcome Waxy.org and Slashdot readers. I blog about JavaScript like it's my job, so feel free to subscribe for a ton more posts like this.

Note: I'm not the creator of HotRuby, as mentioned elsewhere - it is the work of a highly-skilled Japanese developer.

HotRuby is a project which aims to port the Ruby Virtual Machine over to ECMAScript (allowing it to run, directly, in a browser using JavaScript or indirectly using ActionScript in Flash).

Currently the code works by using Ruby 1.9's YARV (Yet Another Ruby VM) to compile a Ruby script down into opcodes, which are then serialized and passed to the browser for execution. It's a little bit indirect but it is still capable of creating a compelling result.

If you were to run one of the examples in your browser the actual chain of execution would be something like this:

  1. Script finds <script type="text/ruby"></script> tags and extracts the inline Ruby code from them.
  2. The Ruby code is sent to the server via an XMLHttpRequest.
  3. The server-side CGI script (in Ruby, using Ruby 1.9) compiles the incoming Ruby into its associated opcodes and serializes it into a JSON data structure.
  4. The browser consumes the opcodes, translates them into JavaScript, and executes them.

To observe this full process we can take a look at the code in the provided benchmark and watch its full path through the server and final execution:

startTime = Time.new.to_f
 
sum = ""
50000.times{|e| sum += e.to_s}
 
endTime = Time.new.to_f
puts (endTime - startTime).to_s + ' sec'

For example, observe this portion of the CGI script:

#!/usr/local/bin/ruby
# Requires Ruby 1.9.0
# The license of this source is "Ruby License"

require 'json'
require 'cgi'

cgi = CGI.new

puts "Content-type: text/plain\n\n"
puts VM::InstructionSequence.compile(cgi['src'], "src", 1, {}).to_a.to_json

and a sample of the opcode data returned by the server:

["YARVInstructionSequence\/SimpleDataFormat",1,1,1,{"arg_size":0,"local_size":4,"stack_max":3},"","src","top",["startTime","sum","endTime"],0,[["break",null,"label_21","label_29","label_29",0]],[2,["putnil"],["getconstant","Time"],["send","new",0,null,0,null],["send","to_f",0,null,0,null],["setlocal",4],4,["putstring",""],["setlocal",3],"label_21",5,["putobject",50000],["send","times",0,["YARVInstructionSequence\/SimpleDataFormat",1,1,1,{"arg_size":1,"local_size":1,"stack_max":2},"block in ","src","block",["e"],[1,[],0,0,-1,-1,3],[["redo",null,"label_0","label_22","label_0",0],["next",null,"label_0","label_22","label_22",0]],["label_0",5,["getdynamic",3,1],["getdynamic",1,0],["send","to_s",0,null,0,null],["send","+",1,null,0,null],["dup"],["setdynamic",3,1],"label_22",["leave"]]],0,null],"label_29",["pop"],7,["putnil"],["getconstant","Time"],["send","new",0,null,0,null],["send","to_f",0,null,0,null],["setlocal",2],9,["putnil"],8,["getlocal",2],["getlocal",4],["send","-",1,null,0,null],["send","to_s",0,null,0,null],["putstring"," sec"],9,["send","+",1,null,0,null],["send","puts",1,null,8,null],8,["leave"]]]

and you can find the full client-side virtual machine here: HotRuby.js.
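
To give a rough feel for what "consuming the opcodes" involves, here is a minimal sketch of a stack machine in JavaScript. It is not HotRuby's actual code: the `run` function, the `methods` table, and the hand-written `program` (modeled loosely on the dump above, for something like `x = 40; puts (x + 2).to_s + ' total'`) are all assumptions made for illustration, and blocks, labels, and control flow are ignored entirely.

```javascript
// Hand-written, hypothetical instruction list in the style of the dump above.
// Bare numbers are line markers; arrays are [opcode, ...operands].
const program = [
  1, ["putobject", 40], ["setlocal", 2],
  2, ["putnil"], ["getlocal", 2], ["putobject", 2],
  ["send", "+", 1, null, 0, null],
  ["send", "to_s", 0, null, 0, null],
  ["putstring", " total"],
  ["send", "+", 1, null, 0, null],
  ["send", "puts", 1, null, 8, null],
  ["leave"],
];

function run(instructions) {
  const stack = [];
  const locals = {};
  const output = [];
  // Assumed method table: only the few Ruby methods this demo needs.
  const methods = {
    "+": (self, other) => self + other,
    "-": (self, other) => self - other,
    "to_s": (self) => String(self),
    "puts": (self, text) => { output.push(String(text)); return null; },
  };
  for (const ins of instructions) {
    if (!Array.isArray(ins)) continue; // skip line numbers and labels
    const [op, a, b] = ins;
    switch (op) {
      case "putnil": stack.push(null); break;
      case "putobject":
      case "putstring": stack.push(a); break;
      case "setlocal": locals[a] = stack.pop(); break;
      case "getlocal": stack.push(locals[a]); break;
      case "dup": stack.push(stack[stack.length - 1]); break;
      case "pop": stack.pop(); break;
      case "send": {
        // a = method name, b = argument count; the receiver sits below the args.
        const args = stack.splice(stack.length - b, b);
        const receiver = stack.pop();
        stack.push(methods[a](receiver, ...args));
        break;
      }
      case "leave": return output;
      default: throw new Error("unsupported opcode: " + op);
    }
  }
  return output;
}

console.log(run(program).join("\n"));
```

The real VM has to handle far more (blocks, constants like Time, exception tables), but the basic push/pop/send loop is the core idea.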

Perhaps most fascinating about this, though, are the speeds that can be achieved with this script. Granted, the above benchmark is rather contrived, but the end performance results are still striking:

Firefox 3.0b5 2.47s
Firefox 2 6.71s
Ruby 1.8.2 12.25s

We can see a roughly 2.7x speed improvement from Firefox 2 to Firefox 3 (6.71s vs. 2.47s) and a roughly 5x performance improvement over regular Ruby 1.8.2, running on the command-line.
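
For anyone who wants to check those ratios, here is a quick sketch of the arithmetic. The timings are copied from the table above; the `timings` object and the `speedup` helper are just names made up for this example.

```javascript
// Timings (in seconds) copied from the benchmark results above.
const timings = { "Firefox 3.0b5": 2.47, "Firefox 2": 6.71, "Ruby 1.8.2": 12.25 };

// How many times faster the `fast` runtime is than the `slow` one.
function speedup(slow, fast) {
  return timings[slow] / timings[fast];
}

const ff2VsFf3 = speedup("Firefox 2", "Firefox 3.0b5");
const rubyVsFf3 = speedup("Ruby 1.8.2", "Firefox 3.0b5");

console.log("Firefox 2 -> Firefox 3.0b5: " + ff2VsFf3.toFixed(2) + "x");
console.log("Ruby 1.8.2 -> Firefox 3.0b5: " + rubyVsFf3.toFixed(2) + "x");
```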

It's a fascinating time to be working with JavaScript. The performance improvements that are being provided to us by the browser afford us a realm of possibility that wasn't, previously, viable. The fact that we're even discussing running a virtual machine, implemented in JavaScript, is quite impressive. I'm curious to see what applications end up being built with this implementation - and within what context they end up using it.

Tags: javascript, ecmascript, ruby, vm

ES4 Implementation Update

The development of ECMAScript 4 is moving into an important phase: the implementors are making good on their word and are starting to implement the ECMAScript 4 proposals. Many of the features have been well thought out by this point and the implementors are working hard to integrate the necessary changes into their engines.

A couple of important pieces are coming along, the most critical of which is the ECMAScript 4 Reference Implementation, which has just had its second milestone release. You can find a copy of the implementation on the ECMAScript download page.

An important list of features is starting to become available in the reference implementation:

Implemented, may have bugs:

  • classes and interfaces
  • namespaces
  • pragmas
  • let, const, let-const
  • iterators
  • enumerability control
  • type expressions / definitions / annotations
  • runtime type checks ("standard mode")
  • nullability
  • destructuring assignment
  • slice syntax
  • hashcode
  • catchalls
  • map & vector
  • date & time improvements
  • meta objects
  • static generics
  • string trim
  • typeof
  • globals
  • expression closures
  • name objects
  • type operators (is / to / cast / wrap)

Implemented and partly working, but still in flux / work to do:

  • inheritance checking
  • strict mode
  • type parameters
  • structural types
  • numbers & decimal
  • getters & setters (structural part is incomplete)
  • packages

Now a full feature list is also being worked on by all of the implementors (as I mentioned previously). This list is going to serve as the starting point for many implementors especially when they decide which features to implement.

Adobe has also taken a step and has outlined (note the column with green/red/etc.) where they stand on all of the ECMAScript 4 proposals. They also took the time to outline their position on the proposals that they're (currently) declining to implement.

This is a really important step in the development of the language. The implementors are staking their ground and are working hard to make sure that a solid language comes out at the end - especially one that is universally implemented. Both Google and Apple have also been participating in the ECMAScript 4 mailing list, asking a lot of good questions, as they look towards creating their own ES4 implementations (in Rhino and WebKit, respectively).

Pretty much everyone can agree that a lack of dialog between implementors would surely cause problems - but that does not appear to be the case, here. Because of this openness and solid dialog ECMAScript 4 looks to have a strong future.

Tags: javascript, ecmascript

Tracing JavaScript

Chris Double, Mozilla contributor and excellent programmer, posted a great overview of what the new JavaScript/ActionScript runtime is going to be like in upcoming versions of Mozilla Firefox and Adobe Flash Player. Specifically, how the new tracing portions of the compiler work. I got permission from him to re-post it here - enjoy!


I attended the Tamarin Tech summit at Adobe on Friday. My main interest for attending was to learn more about the tamarin-tracing project. The goal of Tamarin is to produce a high performance ECMAScript 4 implementation.

'Tamarin Tracing' is an implementation that uses a 'tracing jit'. This type of 'just in time compiler' traces code executing during hotspots and compiles it so when those hotspots are entered again the compiled code is run instead. It traces each statement executed, including within other function calls, and this entire execution path is compiled. This is different from compiling individual functions. You can gain more information for the optimizer to operate on, and remove some of the overhead of the calls. Anytime the compiled code makes a call to code that has not been jitted, the interpreter is called to continue.
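
The idea can be sketched with a toy example in JavaScript. This is in no way how Tamarin Tracing is implemented: the op format, the `interpret`, `compileTrace`, and `runLoop` functions, and the hot-loop threshold are all invented for illustration. The loop body is interpreted until it is considered hot, one iteration is then recorded as a flat trace (with the function call flattened into it), and the trace is compiled into a single straight-line function that runs the remaining iterations.

```javascript
// Toy "bytecode": each op is [name, dest, left, right]; "call" runs another op list.
const functions = {
  addSquare: [["mul", "t", "i", "i"], ["add", "sum", "sum", "t"]],
};
const loopBody = [["call", "addSquare"], ["add", "i", "i", "one"]];

// Interpreter: executes ops one at a time and, if asked, records the
// primitive ops it actually ran, including those inside called functions.
function interpret(ops, env, trace) {
  for (const op of ops) {
    const [name, a, b, c] = op;
    if (name === "call") {
      interpret(functions[a], env, trace); // the call is flattened into the trace
      continue;
    }
    if (name === "add") env[a] = env[b] + env[c];
    else if (name === "mul") env[a] = env[b] * env[c];
    else throw new Error("unknown op " + name);
    if (trace) trace.push(op);
  }
}

// "Compile" a recorded trace into one straight-line JavaScript function.
function compileTrace(trace) {
  const symbol = { add: "+", mul: "*" };
  const lines = trace.map(
    ([name, a, b, c]) => "env." + a + " = env." + b + " " + symbol[name] + " env." + c + ";"
  );
  return new Function("env", lines.join("\n"));
}

function runLoop(iterations, hotThreshold) {
  const env = { i: 0, sum: 0, t: 0, one: 1 };
  let compiled = null;
  let compiledRuns = 0;
  for (let n = 0; n < iterations; n++) {
    if (compiled) {
      compiled(env); // hot path: run the compiled trace
      compiledRuns++;
    } else if (n === hotThreshold) {
      const trace = [];
      interpret(loopBody, env, trace); // record one full iteration
      compiled = compileTrace(trace);
    } else {
      interpret(loopBody, env, null); // cold path: plain interpretation
    }
  }
  return { sum: env.sum, compiledRuns: compiledRuns };
}

console.log(runLoop(10, 3));
```

This toy has no branches, so the trace never needs a guard. A real tracing jit has to guard every assumption made while recording and fall back to the interpreter when a guard fails.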

Apparently the JIT for Lua is also being written using a tracing jit method and a post by Mike Pall describes the approach they are taking in some detail and lists references. A followup post provides more information and mentions Tamarin Tracing.

'Tamarin Tracing' is open source and can be obtained from the mercurial repository:

$ hg clone http://hg.mozilla.org/tamarin-tracing/

To build the source you create a directory to hold the build files, change to it, and run the configure script:

$ mkdir mybuild
$ cd mybuild
$ ../tamarin-tracing/configure --enable-shell
$ make

The 'enable-shell' option is required to produce the 'avmshell' binary that executes the bytecode. At the end of the build you'll see the avmshell binary in the shell subdirectory:

$ shell/avmshell
avmplus shell 1.0 build cyclone
...

'avmshell' operates on files containing bytecode, not JavaScript. To use it you'll need a front end that compiles JavaScript to the 'abc' bytecode format it uses, which is the ActionScript bytecode. One such compiler ships with the Flex SDK, a free download from Adobe; any other tool that generates the correct bytecode will also work.

Included with Tamarin Tracing is the source for 'esc'. This is a work-in-progress implementation of an ECMAScript 4 compiler written in ECMAScript. It generates the 'abc' bytecode but is (I think) not quite ready for prime time. In this post I'm using the 'asc' compiler from the Flex 2 SDK on Linux. This compiler is written in Java and is in the 'lib/asc.jar' file in the SDK.

A quick test that the avmshell program works:

$ echo 'print("hello world!");' > hello.as
$ java -jar asc.jar hello.as
hello.abc, 86 bytes written
$ shell/avmshell hello.abc
hello world!

'avmshell' has a number of debugging options that are only available when configuring the build with '--enable-debugger'. This allows you to get some information about the trace jit. Here's the build process with a debug enabled build and the available options:

$ mkdir mybuild
$ cd mybuild
$ ../tamarin-tracing/configure --enable-shell --enable-debugger
$ make
$ shell/avmshell
avmplus shell 1.0 build cyclone
...

To demonstrate some of the output I'll use a simple fibonacci benchmark. This is the contents of fib.as:

function fib(n) {
 if(n <= 1)
  return 1;
 else
  return fib(n-1) + fib(n-2);
}

print('fib 30 = ' + fib(30));

A comparison of times with and without the tracing jit enabled:

$ shell/avmshell -lifespan -interp fib.abc
fib 30 = 1346269
Run time was 26249 msec = 26.25 sec
$ shell/avmshell -lifespan fib.abc
fib 30 = 1346269
Run time was 1967 msec = 1.97 sec
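
As a sanity check that does not need the Tamarin toolchain, the same benchmark can be sketched in plain JavaScript. This is only a port of fib.as for illustration; the two timing constants are copied from the avmshell output above, and the speedup figure is simple arithmetic on them rather than a new measurement.

```javascript
// Same recursive definition as fib.as: fib(0) and fib(1) both return 1.
function fib(n) {
  if (n <= 1) return 1;
  return fib(n - 1) + fib(n - 2);
}

const fibResult = fib(30);
console.log("fib 30 = " + fibResult);

// Run times reported above, in milliseconds.
const interpMs = 26249; // with -interp (tracing jit disabled)
const jitMs = 1967;     // with the tracing jit enabled
const jitSpeedup = interpMs / jitMs;
console.log("tracing jit speedup: about " + jitSpeedup.toFixed(1) + "x");
```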

There's a lot of other interesting stuff in the Tamarin Tracing source that I hope to dive into. For example:

  • The interpreter is written in Forth. There are .fs files in the 'core' subdirectory that contain the Forth source code. Each 'abc' bytecode is implemented in lower level instructions which are implemented in Forth. The tracing jit operates on these lower level instructions. The system can be extended with Forth code to call native C functions. The compiler from Forth to C++ is written in Python and is in 'utils/fc.py'.
  • The jit has two backends: one for Intel x86 32 bit, and the other for ARM. See the 'nanojit' subdirectory.
  • The complete interpreter source can be rebuilt from the Forth using 'core/builtin.py'. This requires 'asc.jar' to be placed in the 'utils' subdirectory of Tamarin Tracing.

At the summit there was an in-depth session on the internals of the Forth code and how to extend it. I'll write more about that later when/if I get a chance to dig into it.

Tags: ecmascript, actionscript, javascript

Acid 3 Tackles ECMAScript

Update: The Acid3 test is now final, I've updated the blog post to reflect this.

The new news on the block is that the upcoming Acid 3 test is in the oven, starting to get baked. Traditionally, the Acid test has served as a way to get browser vendors in line by testing them on really-annoying edge cases. This can, sometimes, get people tied up in knots but it actually serves as a devious way of getting people to meet a large part of a spec.

For example, in order for a browser to have some weird padding/margin test case solved - in CSS - they must also have a working box model. So while an Acid test may not, explicitly, test for a working box model, it will be done implicitly (by testing edge cases that result from it).

With that in mind, it's time to take a look at Acid 3 which primarily focuses on technology that I find to be interesting: ECMAScript and the DOM. Let's dig in and see what exactly is being tested - specifically, relating to ECMAScript.

  • Array Elisions - Making sure that stuff like [,,] has a length of 2 and [0,,1] has a length of 3.
  • Array Methods - Doing an unshift with multiple arguments .unshift(0, 1, 2), joining with an undefined argument .join(undefined).
  • Number Conversion - Banging against .toFixed(), .toExponential(), and .toPrecision() - especially with decimals and negative numbers.
  • String Operations - Negative indices in substr .substr(-7, 3), character access by index "foo"[1] (part of the ECMAScript 4 spec).
  • Date - Making sure that certain method calls result in NaN results (like d.setMilliseconds(), with no arguments) and also enforcing +1900 year offsets.
  • Unicode in Identifiers - Unicode escapes can't be used to sneak punctuation into an identifier, for example: eval("test.i\\u002b= 1;"); (\u002b is "+", so that should throw an exception).
  • Regular Expressions - /[]/ matches an empty set, /[])]/ should throw an exception, backreferences to non-existent captures, and negative lookaheads /(?!test)(test)/.exec("test test").
  • Enumeration - Make sure that object properties are enumerated in the correct order, make sure that you're able to enumerate properties of certain names (toString, hasOwnProperty, etc.).
  • Function Constructors - The user should be able to set custom constructors on the .constructor property, .constructor should not be enumerable, and .prototype.constructor should be deletable.
  • Function Expressions - (function test(){ ... })(); You should be able to call the function by name, within the function itself, you can't directly overwrite the function name (only with a function-scoped variable), and 'test' isn't leaked into the parent scope.
  • Exception Scope - Variables within the catch(){} should interact with the catch arguments primarily, followed by variables in an outer scope.
  • Assignment Expressions - s = a.length = "123"; - 's' must receive the string "123" (the value of the assignment expression), not the number 123 that a.length ends up holding.
  • Encoding - encodeURI() and encodeURIComponent() must gracefully handle null bytes.
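
A handful of these edge cases are easy to poke at yourself. The snippet below is just a sketch, not part of Acid3: the `acidChecks` function and its property names are made up for this example, and it only covers a few of the items listed above. The comments state what a conforming engine should produce.

```javascript
function acidChecks() {
  const results = {};

  // Array elisions: [,,] should have length 2, [0,,1] length 3.
  results.elisionLengths = [[,,].length, [0,,1].length];

  // Array methods: unshift with several arguments, join with undefined.
  const list = [3];
  results.unshiftReturn = list.unshift(0, 1, 2); // new length
  results.joinUndefined = [1, 2].join(undefined); // falls back to ","

  // String operations: negative start in substr, character access by index.
  results.negativeSubstr = "JavaScript".substr(-7, 3);
  results.charByIndex = "foo"[1];

  // Date: setMilliseconds() with no arguments should give NaN.
  results.setMillisecondsNoArg = new Date(0).setMilliseconds();

  // Regular expressions: the negative lookahead prevents any match here.
  results.lookahead = /(?!test)(test)/.exec("test test");

  // Assignment expressions: the expression's value is the string "123".
  const a = [];
  results.assigned = (a.length = "123");
  results.arrayLength = a.length;

  // Function expressions: the name is visible inside, not leaked outside.
  results.nameInside = (function acidProbe() { return typeof acidProbe; })();
  results.nameOutside = typeof acidProbe;

  // Unicode in identifiers: an escaped "+" inside an identifier must throw.
  try {
    eval("test.i\\u002b= 1;");
    results.escapedPlusThrows = false;
  } catch (e) {
    results.escapedPlusThrows = e instanceof SyntaxError;
  }

  // Encoding: null bytes should be encoded, not dropped.
  results.encodedNull = encodeURIComponent("\0");

  return results;
}

console.log(acidChecks());
```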

All-in-all it's a comprehensive smattering of weird ECMAScript edge cases - you're bound to find at least one that fails in your favorite browser-of-choice. I'm sure we'll see many more test cases coming in, in the upcoming days and weeks.

I'm looking forward to seeing the final results - and the competitive heat that's been applied to CSS-spec implementors being applied to ECMAScript implementors.

For kicks, here are the current results in a bunch of major browsers (including the correct reference rendering).

These are the preliminary results of the UNCOMPLETED Acid 3 test in UNCOMPLETED versions of major browsers - take with a grain of salt. Go here to view the final version of this test.

[Screenshots: Reference Rendering; Firefox 2; Firefox 3b2; Safari 3; WebKit Nightly; Opera 9.5b1; Internet Explorer 7 (thanks chunghe!)]

Tags: testing, ecmascript, bugs

State of ECMAScript 4 (Dec '07)

I've just completed my first survey of the current ECMAScript 4 implementations. I went through and attempted to compile as many bugs and features as possible, as stated by the ECMAScript 4 specification, and double-checked them against all the actively-maintained implementations. You can view a nice overview below.

I think it's fascinating to note that there are three implementations that already have over 25% of all the new features in the language implemented.

View: The raw data

About the implementations:

ECMAScript 4 Reference Implementation (ES4 RI)

This is the reference implementation provided by the ECMA technical group, as a reference for those creating their own implementations.

Tamarin

Tamarin is the joint effort of Mozilla and Adobe to adapt the Open Source Adobe Virtual Machine to match ECMAScript 4 - and run in Firefox 3.next() (via ActionMonkey) and Flash 10.

Update from Tom:

Tamarin VM by itself doesn't directly support ECMAScript source code. Rather, the subproject esc (written in ECMAScript 4) compiles ECMAScript 4 to abc bytecode that is run by the Tamarin VM.

Spidermonkey

Spidermonkey is the JavaScript engine currently in Firefox (and other Mozilla-based projects). It's being actively updated with new features to match the ES4 specification. This project will, most likely, be superseded by ActionMonkey.

Rhino

Rhino is a Java implementation of JavaScript which is currently being updated to meet the ECMAScript 4 specification.

Futhark (Opera)

Futhark is the JavaScript engine that is a part of Opera 9.5 (Kestrel) and will be a part of Opera 10 (Peregrine). It's being actively updated to match the ECMAScript 4 specification.

Mbedthis

Mbedthis has used JavaScript as a web scripting language in its AppWeb embedded web server product for several years. More recently, they have been updating the language for use in mobile devices and have developed a C and Java VM for hosting JavaScript widget style applications to run in standard feature phones. They are tracking ES4 and are upgrading their implementation as the spec is finalized. They are planning to release a test version late Q1 2008 that will implement most of the planned features in ES4. This will be dual license: open source and commercial.

Tags: ecmascript, javascript, ecmascript4, javascript2

The World of ECMAScript

So I did a little bit of digging and I've pulled together something fun: I call it "The World of ECMAScript".


(Released under the GPL v2 [SVG])

It's a full map detailing everything that exists within the world of ECMAScript (with JavaScript, ActionScript, and JScript being its most-famous implementations). Right now I'm only showing things that can be built on top of (languages, engines, browsers, servers, etc.) - not end user applications (there would probably be too many to list).

This chart started out as a simple diagram showing the relationship between ActionScript, Tamarin, ActionMonkey, and SpiderMonkey. From there I started tacking on additional relationships and it just sort of started to grow out of control. I'm fascinated by the size and breadth of everything that exists in the ECMAScript ecosystem (and this isn't even everything, I'm sure I'm missing a ton).

Here's some links for more information:

Languages:

Engines:

Applications:

Hooks/Convertors:

Companies:

Implementation Languages:

Let me know if there's anything that you feel that I've missed. I'll use my discretion when adding, simply because I don't want this to include every half-baked ECMAScript implementation under the sun (and I still have to modify it by hand).

Update 3am Nov. 15: Removed WebKit (was redundant), added Silverlight, added IronPython and IronRuby, connected PDF to SpiderMonkey, and fixed spelling of Konqueror. Presto is wrong for Opera, but not sure what their JS Engine is named. Compressed PNGs, added an SVG download.

Update 5pm Nov. 15: Turned JavaScript into a language/cloud. Added ParenScript, YHC/JavaScript, Haxe, and Scheme2JS. Added CouchDB. Silverlight now links to JScript. Opera's two engines (futhark and linear_b) are listed. Added Flex. Changed QSA to QT Toolkit.

Tags: programming, ecmascript, javascript, browsers

Playing with ECMAScript 4

Say you want to start playing around with the new ECMAScript 4 syntax, getting a feel for the code and the features that'll be included (which is what I've been doing lately). Here's a short screencast that shows you some ways that you can go about doing that.

Click video to begin (8:40 Minutes long, 11MB):


Download: Right-click this link and select Save As… in order to download a copy of your own. (11MB)

Step 1: Grab some materials.

The best resources, currently, for looking at the syntax and brief examples of how the code should look are:

  • ECMAScript 4 White Paper - This is a broad overview of all the features that are in the language along with some trivial examples of how they should work.
  • Tamarin and ECMAScript 4 Presentation - This is the presentation that I've given a couple times now showing examples of many of the aspects of the language. Generally, the examples shown are more complete than those shown in the white paper.

Step 2: Get the reference implementation.

Now, you'll need to snag a copy of the ECMAScript 4 Reference Implementation. This is a standalone runtime that you can use from your console. It is not connected to the browser in any way (thus, you won't have access to any of the typical browser things - like 'window', 'document', 'XMLHttpRequest', or 'setTimeout'/'setInterval'). I'm currently working on an implementation of these details and will announce when they're ready for general use.

Step 3: Start coding!

Start with some simple ECMAScript 3 commands, to test the waters:

>> var test = 'test';
>> test
test
>> test = 'new value';
new value
>> test
new value
>> function stuff(){ print(test); }
>> stuff();
new value

Now move on to some basic use of Type Annotations (included in ECMAScript 4):

>> var age : int = 23;
>> age
23
>> age = 45;
45
>> age = 'John';
**ERROR** TypeError: incompatible types w/o conversion
>> var name : string = 'John';
>> name = 23;
23
>> age = 3.0;
3

and here's a basic class example, borrowed from this presentation that I gave.

class Programmer {
  var name;
  var city = 'Boston, MA';
  const interest = 'computers';
  function work() {}
}

Save the above to a file like 'Programmer.es' and load into the console with:

intrinsic::load('Programmer.es');

then you can play around with that class some more:

>> Programmer
[class Class]
>> var p = new Programmer();
>> p.city
Boston, MA
>> p.name = 'John';
John
>> p.interest = 'science';
science
>> p.interest
computers

If you have any questions, or are having trouble getting started, please feel free to ask.

Caveats!

There's still a lot of functionality that hasn't been implemented yet, in the reference implementation. In my current tests I've noticed the following missing features:

  • Map/Vector
  • Multiple constructors
  • Private constructors
  • Program units
  • Initializers
  • Protected Class properties
  • nullability
  • "like"/"wrap"

Again, if you have any problems, please don't hesitate to ask and we can step through it together.

Tags: javascript, mozilla, ecmascript

ECMAScript 4 Speaking Tour

During the past two weeks I've given three presentations on Tamarin and ECMAScript 4. I've gotten a ton of great feedback, criticism, and commentary - all of which has been very helpful.

» Tamarin and ECMAScript 4

Here's a quick re-cap of how the talks went:

Ajax Experience East (The Future of JavaScript)

There were some very smart questions asked by the audience here - and some pressing concerns. Although, the theme of "being concerned" was a large one throughout the conference (this was right about the time of the white paper release and ensuing blog kerfuffle). That being said, of those that attended my talks, I was able to help alleviate most of their initial doubts (such as towards backwards compatibility, the new type system, or the complexity of the language). It was pretty easy to spot conference attendees who did not attend my talk as those questions were raised again during the ensuing panel discussions. I generally found that those who were at the talk were able to get up to speed pretty quickly; understanding most of the changes and being excited about when they could start to test them.

Adobe Max Japan (Tamarin and ECMAScript 4)

This was a really unique speaking situation for me - talking to a large room of ActionScript developers about the future of their language. I only have a cursory knowledge of ActionScript so I was able to gloss over some of the details of ES4 (since they already have type annotations, classes, and packages). That being said, I got some really fantastic questions. Considering that these developers have already been using a large subset of ECMAScript 4 for close to 1.5 years it was great to hear the sort of concerns that they had.

Overwhelmingly, of the developers and Adobe employees that I talked to, everyone seems to love the new changes that were introduced in ActionScript 3 - and they're all looking forward to the ECMAScript 4-based updates. It's interesting to see where the ActionScript community has gone, as it's a good indicator of where the JavaScript community will lead once JavaScript 2 is out the door.

Mozilla Japan/Shibuya.JS (The Future of JavaScript)

This talk was the most fun, out of the three. I was primarily presenting to members of Shibuya.JS (the only JavaScript user group in the world) and they were very excited and asked lots of good, hard, questions. Probably their biggest concern was over the "expressiveness" of the language and if that would be maintained into the next version.

Members of Shibuya.JS streamed the talk that I gave to them over ustream - you can find recorded copies below:

» The Future of JavaScript (Video)

» The Future of JavaScript - Lightning Talks and Q&A (Video)

After the talks a number of us went out for dinner and it was great fun. We talked JavaScript, jQuery, and ECMAScript for many hours - frequently just writing code on paper to talk to each other (JavaScript being the universal language).

I hope I can make it back to Tokyo soon (Gen has a longer recap of my trip up) and be able to visit Shibuya.JS as well.

Seeing them in action has given me a serious itch to start up a Boston.JS group.

Tags: javascript, conferences, travel, tokyo, ecmascript, mozilla
