Node.js Stream Playground


This summer I had the opportunity to attend NodeConf and it was a fantastic experience. I really appreciated how every session was a hands-on coding session: I felt like I walked away knowing how to put a bunch of advice directly into practice.

One of my favorite sessions was the one run by James Halliday and Max Ogden exploring Node streams. Specifically they sat us down and had us run through the amazing Stream Adventure program. It’s an interactive series of exercises that are designed to help you understand streams better. Seeing interesting ways of using streams in practice was quite eye-opening for me. (I also recommend checking out the awesome Stream Handbook if streams are totally foreign to you.)

I had made limited use of streams before (namely with the request and fs modules) but was still early in my understanding. Over the past few months I’ve spent more time researching Node modules that make good use of streams and have tried to apply them in my various side projects. In theory I absolutely love streams: they are the chainable piping amazingness that exists in UNIX and that I aspired to achieve with jQuery. In practice, many Node stream modules are poorly written, poorly documented, out of date, or just obscure. So while using Node streams should theoretically be really easy, in practice it becomes quite challenging.

In my explorations I was starting to build up a pile of streaming Node modules that I *knew* worked correctly (or, at least, as I expected them to). Namely: They supported streams in Node 0.10+ and they supported being .pipe()‘d to and from (where applicable). I realized that in a perfect world Node streams truly are like Lego bricks – it becomes quite easy to snap a few of them together to build a chained flow for your data to pass through.

With that in mind I built an exploration that I hope will help others to understand the simplicity and power of Node streams: The Node.js Stream Playground.

More information:

The Node.js Stream Playground was created to help Node.js developers better understand how streams work by showing a number of use cases that are easily plug-and-play-able. Additionally, detailed logging is provided at every step to help users better understand what events the streams are emitting and exactly what their contents are.

How to Use the Playground

I hope you’ll get the most out of the playground by exploring concepts for yourself – looking at what happens when you pipe different streams to each other, looking at the data being logged, and attempting to understand (for yourself) what exactly is happening and why. Note that all of the code being generated is valid Node code and can be copy-and-pasted and run on your own computer (assuming you have the appropriate NPM modules installed).

I should note that this isn’t the be-all-and-end-all of Node stream education: there is still a lot to learn with regard to error handling, back pressure, and all the intricate stream concepts that exist. I hope to write more on these concepts some day.

With that being said, here are some good examples of actions that you can perform:

Copying a File

  1. Select Read File and input/people.json.
  2. Select Write File and output/people.json.

The resulting code:

var fs = require("fs");

// Read File
fs.createReadStream("input/people.json")
    // Write File
    .pipe(fs.createWriteStream("output/people.json"));

This will copy the JSON file from one location to another. This is actually the preferred way of copying files in Node – there is no built-in utility method for doing it.

Downloading a File

This will download the JSON file from the specified URL and save it to the local file.

  1. Select HTTP Get Request and http://nodestreams.com/input/people.json.
  2. Select Write File and output/people.json.

The resulting code:

var request = require("request");
var fs = require("fs");

// HTTP GET Request
request("http://nodestreams.com/input/people.json")
    // Write File
    .pipe(fs.createWriteStream("output/people.json"));

Un-Gzipping a File

  1. Select Read File and input/people.csv.gz.
  2. Select Un-Gzip.
  3. Select Write File and output/people.csv.

The resulting code:

var fs = require("fs");
var zlib = require("zlib");

// Read File
fs.createReadStream("input/people.csv.gz")
    // Un-Gzip
    .pipe(zlib.createGunzip())
    // Write File
    .pipe(fs.createWriteStream("output/people.csv"));

Note that the initial data that is coming out of the Read File is just an array of numbers – this is to be expected. In reality it’s a Node Buffer holding all of the raw binary data. It’s not until after we Un-Gzip the file that we start to have data that’s in a more usable form.

Converting a TSV to HTML

In this case we’re going to manually parse the TSV (without headers) just to show you some of the string operations provided by event-stream.

  1. Select Read File and input/people.tsv.
  2. Select Split Strings.
  3. Select Split Strings into Array.
  4. Select Convert Array w/ Sprintf (with the default table row HTML).
  5. Select Join Strings (this will insert an endline between all the individual table rows).
  6. Select Concat Strings (this will combine all the individual table rows and endlines into a single string).
  7. Select Wrap Strings (with the default table HTML).
  8. Select Write File and output/people.html.

The resulting code:

var fs = require("fs");
var es = require("event-stream");
var vsprintf = require("sprintf").vsprintf;

// Read File
fs.createReadStream("input/people.tsv")
    // Split Strings
    .pipe(es.split("\n"))
    // Split Strings into Array
    .pipe(es.mapSync(function(data) {
        return data.split("\t");
    }))
    // Convert Array w/ Sprintf
    .pipe(es.mapSync(function(data) {
        return vsprintf("<tr><td><a href='%2$s'>%1$s</a></td><td>%3$s</td></tr>", data);
    }))
    // Join Strings
    .pipe(es.join("\n"))
    // Concat Strings
    .pipe(es.wait())
    // Wrap Strings
    .pipe(es.mapSync(function(data) {
        return "<table><tr><th>Name</th><th>City</th></tr>\n" + data + "\n</table>";
    }))
    // Write File
    .pipe(fs.createWriteStream("output/people.html"));

This should give you some HTML that looks something like this:

<table><tr><th>Name</th><th>City</th></tr>
<tr><td><a href='http://vinceallen.com/'>Vince Allen</a></td><td>Brooklyn, NY</td></tr>
<tr><td><a href='http://twitter.com/jandet'>Janessa Det</a></td><td>New York, NY</td></tr>
<tr><td><a href='http://patnakajima.com/'>Pat Nakajima</a></td><td>San Francisco, CA</td></tr>
<tr><td><a href='http://sarajchipps.com/'>Sara Chipps</a></td><td>New York, NY</td></tr>
<tr><td><a href='http://ejohn.org/'>John Resig</a></td><td>Brooklyn, NY</td></tr>
</table>

Download Encoded, Gzipped, CSV and Convert to HTML

This is the big one:

  1. Select HTTP Get Request and http://nodestreams.com/input/people_euc-jp.csv.gz.
  2. Select Un-Gzip (note that some of the characters are encoded incorrectly, we need to fix this!).
  3. Select Change Encoding (even though it comes out as a buffer, this is correct, as we’ll see in a moment).
  4. Select Parse CSV as Object.
  5. Select Convert Object w/ Handlebars (with the default table row HTML).
  6. Select Join Strings (this will insert an endline between all the individual table rows).
  7. Select Concat Strings (this will combine all the individual table rows and endlines into a single string).
  8. Select Wrap Strings (with the default table HTML).
  9. Select HTTP PUT Request and http://nodestreams.com/output/people.html.

The resulting code:

var request = require("request");
var zlib = require("zlib");
var Iconv = require("iconv").Iconv;
var csv = require("csv-streamify");
var Handlebars = require("handlebars");
var es = require("event-stream");

var tmpl = Handlebars.compile("<tr><td><a href='{{URL}}'>{{Name}}</a></td><td>{{City}}</td></tr>");

// HTTP GET Request
request("http://nodestreams.com/input/people_euc-jp.csv.gz")
    // Un-Gzip
    .pipe(zlib.createGunzip())
    // Change Encoding
    .pipe(new Iconv("EUC-JP", "UTF-8"))
    // Parse CSV as Object
    .pipe(csv({objectMode: true, columns: true}))
    // Convert Object w/ Handlebars
    .pipe(es.mapSync(tmpl))
    // Join Strings
    .pipe(es.join("\n"))
    // Concat Strings
    .pipe(es.wait())
    // Wrap Strings
    .pipe(es.mapSync(function(data) {
        return "<table><tr><th>Name</th><th>City</th></tr>\n" + data + "\n</table>";
    }))
    // HTTP PUT Request
    .pipe(request.put("http://nodestreams.com/output/people.html"));

You should see the same output as in the last run. A lot is happening here, but even with all of these steps Node streams still make it relatively easy to complete. It’s at this point that you can truly start to see the power and expressiveness of streams.

Adding in New Streams

If you’re interested in extending the playground and adding in new pluggable stream “blocks” you can simply edit blocks.js and add in the stream functions. A common stream block would look something like this:

"Change Encoding": function(from /* EUC-JP */, to /* UTF-8 */) {
    var Iconv = require("iconv").Iconv;
    return new Iconv(from, to);
},

The property name is the full title/description of the stream. The arguments to the function are variables that you wish the user to populate. The comments immediately following the argument names are the default values (you can provide multiple values by separating them with a |).

The streams are split into three types: “Readable”, “Writable”, and “Transform” (streams that read content, modify it, and pass it through). Typically “Readable” streams are the first ones you can choose in the playground, “Writable” streams end the chain, and everything else is just “.pipe()”-able in between.

If you’ve added a new stream please send a pull request and I’ll be happy to add it!

Running Your Own Server

After downloading the code (either from GitHub or NPM), be sure to run npm install to install all the dependencies. You can then use just node app.js to run a server – or, if you wish to run something more robust, you can install naught and then run npm start.

WARNING I have no idea how robust the application’s security is. This application is generating and executing code on the user’s behalf (although it is not allowing arbitrary code to be executed). Feel free to run it on a local server or, if you feel confident in the code, run it on your own server. At the moment I’m running it on a standalone server with nothing else on it, just in case.

Feedback Welcome!

I’d love to hear about how people are using streams and if this tool has been helpful for your understanding how streams work. Let me know if there are particular stream modules that you really like and – if possible – try and add them to the stream playground so that others can experience them as well!

Posted: November 15th, 2013

