What is Node.js & why do I care?

At its simplest, Node.js is server-side JavaScript. JavaScript is a popular programming language, but it almost always runs inside a web browser. So JavaScript can, for instance, manipulate the contents of this page by being included inside <script> tags, but it doesn’t get to play around with the files on our computer or tell our server what HTML to send like the PHP that runs this WordPress blog.

Node is interesting for more than just being on the server side. It provides a new way of writing web servers while using an old UNIX philosophy. Hopefully, by the end of this post, you’ll see its potential and how it differentiates itself from other programming environments and web frameworks.

Hello, World

To start, let’s do some basic Node programming. Head over to nodejs.org and click Install.1 Once you’ve run the installer, a node executable will be available for you on the command line. Any script you pass to node will be interpreted and the results displayed. Let’s do the classic “hello world” example. Create a new file in a text editor, name it hello.js, and put the following on its only line:

console.log('Hello, world!');

If you’ve written JavaScript before, you may recognize this already. console.log is a common debugging method which prints strings to your browser’s JavaScript console. In Node, console.log will output to your terminal. To see that, open up a terminal (on Mac, you can use Terminal.app while on Windows both cmd.exe and PowerShell will work) and navigate to the folder where you put hello.js. Your terminal will likely open in your user’s home folder; you can change directories by typing cd followed by a space and the subdirectory you want to go inside. For instance, if I started at “C:\users\me” I could run cd Documents to enter “C:\users\me\Documents”. Below, we open a terminal, cd into the Documents folder, and run our script to see its results.

$ cd Documents
$ node hello.js
Hello, world!

That’s great and all, but it leaves a lot to be desired. Let’s do something a little more sophisticated; let’s write a web server which responds “Hello!” to any request sent to it. Open a new file up, name it server.js, and write this inside:

var http = require('http');
http.createServer(handleRequest).listen(8888);
function handleRequest (request, response) {
  response.end( 'Hello!' );
}

In our terminal, we can run node server.js and…nothing happens. Our prompt seems to hang, not outputting anything but also not letting us type another command. What gives? Well, Node is running a web server and it’s waiting for requests. Open up your web browser and navigate to “localhost:8888”; the exclamation “Hello!” should appear. In five lines of code, we just wrote an HTTP server. Sure, it’s the world’s dumbest server that only says “Hello!” over and over no matter what we request from it, but it’s still an achievement. If you’re the sort of person who gets giddy at how easy this was, then Node.js is for you.

Let’s walk through server.js line by line. First, we import the core HTTP library that comes with Node. The “require” function loads external modules into your script, much like the function of the same name in Ruby or import in Python. The HTTP library gives us a handy “createServer” method, which creates a server that passes each incoming HTTP request along to a callback function. On our second line, we call createServer, pass it the function we want to handle incoming requests, and set it to listen for requests sent to port 8888. (JavaScript hoists function declarations, so it’s fine that handleRequest is used one line before it’s defined.) The choice of 8888 is arbitrary; we could choose any unused port from 1024 up to 65535, while operating systems often restrict the lower ports, which are reserved for specific protocols. Finally, we define our handleRequest callback, which receives a request object and a response object for each HTTP request. Those objects have many useful properties and methods, but we simply call the response object’s end method, which sends the response and optionally accepts some data to put into it.
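To get a feel for those request and response objects, here is a minimal sketch of a slightly chattier handler. The name describeRequest is made up for this example; request.method, request.url, and response.writeHead are part of Node’s core HTTP library.

```javascript
// Hypothetical handler: reports the request's method and URL as plain text.
function describeRequest (request, response) {
  // writeHead sets the status code and headers before any body is sent.
  response.writeHead(200, { 'Content-Type': 'text/plain' });
  // end sends the body and tells Node the response is complete.
  response.end('You sent a ' + request.method + ' request for ' + request.url + '\n');
}

// To try it, swap it in for handleRequest in server.js:
//   http.createServer(describeRequest).listen(8888);
```

With that change, visiting “localhost:8888/catalog” should show the GET method and the /catalog path rather than a fixed greeting.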

The use of callback functions is very common in Node. If you’ve written JavaScript for a web browser you may recognize this style of programming; it’s the same as when you define an event listener which responds to mouse clicks, or assign a function to process the result of an AJAX request. The callback function doesn’t execute synchronously in the order you wrote it in your code; it waits for some “event” to occur, whether that event is a click or an AJAX request returning data.
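A small sketch makes the ordering concrete. Here a timer stands in for the “event”; the record helper and its labels are invented for this example.

```javascript
var order = [];

// Hypothetical helper: remember each step and print it.
function record (label) {
  order.push(label);
  console.log(label);
}

record('first: script starts');

// Register a callback; it runs only once the timer "event" fires.
setTimeout(function () {
  record('third: timer fired');
}, 100);

record('second: script reached its last line');
```

Even though the callback appears in the middle of the file, its message prints last, because Node finishes running the script before it handles the timer.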

In our HTTP server example, we also see a bit of what makes Node different from other server-side languages like PHP, Perl, Python, and Ruby. Those languages typically work with a web server, such as Apache, which passes certain requests over to the language and serves up whatever it returns. Node is itself the server; it gives you low-level access to the inner workings of protocols like HTTP and TCP. You don’t need to run Apache and have requests sent to Node: it handles them on its own.

Who cares?

Some of you are no doubt wondering: what exactly is the big deal? Why am I reading about this? Surely, the world has enough programming languages, and JavaScript is nothing new; even server-side JavaScript isn’t that new.2 There are already plenty of web servers out there. What need does Node.js fill?

To answer that, we must revisit the origins of Node. The best way to understand is to watch Ryan Dahl present on the impetus for creating Node. He says, essentially, that other programming frameworks are doing IO (input/output) wrong. IO comes in many forms: when you’re reading or writing to a file, when you’re querying databases, and when you’re receiving and sending HTTP requests. In all of these situations, your code asks for data…waits…and waits…and then, once it has the data, it manipulates it or performs some calculation, and then sends it somewhere else…and waits…and waits. Basically, because the code is constantly waiting for some IO operation, it spends most of its time sitting around rather than crunching numbers like it wants to. IO operations are commonly the bottlenecks in programs, so we shouldn’t let our code just stop every time it performs one. Node’s answer is to make IO asynchronous: you hand Node a callback, and your program carries on with other work until the data is ready.
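As a rough illustration of the difference, the sketch below reads the same file twice using Node’s core “fs” library: once with the blocking readFileSync and once with the callback-based readFile. The temporary file name and the events array are assumptions made up for this example.

```javascript
var fs = require('fs');
var os = require('os');
var path = require('path');

// Create a small throwaway file so the example is self-contained.
var file = path.join(os.tmpdir(), 'node-io-example.txt');
fs.writeFileSync(file, 'some data');

var events = [];

// Blocking: nothing else can run until the whole file has been read.
var contents = fs.readFileSync(file, 'utf8');
events.push('sync read done');

// Non-blocking: Node starts the read, moves on, and calls us back later.
fs.readFile(file, 'utf8', function (err, data) {
  if (err) throw err;
  events.push('async read done');
  console.log(events.join(' -> '));
  fs.unlinkSync(file); // tidy up the throwaway file
});

events.push('kept working while the async read was pending');
```

The middle step always lands before “async read done”: the script keeps going while the operating system fetches the file, which is exactly the waiting time a blocking read would have wasted.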

Node not only has a beneficial asynchronous programming model but has developed other advantages as well. Because lots of people already know how to write JavaScript, it has caught on much more quickly than languages which are entirely new to developers. It reuses V8, Google Chrome’s JavaScript engine, giving it a big speed boost. Node’s package manager, NPM, is growing at a tremendous rate, far faster than its sibling package managers for Java, Ruby, and Python. NPM itself is a strong point of Node; it’s learned from other package managers and has many excellent features. Finally, other programming languages were developed to be all-purpose tools. Node, while it does share the same all-purpose utility, is really intended for the web. It’s meant to write web servers and handle HTTP intelligently.

Node also follows many UNIX principles. Doug McIlroy succinctly summarized the UNIX philosophy as “Write programs that do one thing and do it well. Write programs to work together. Write programs to handle text streams, because that is a universal interface.” NPM does a great job letting authors write small modules which work well together. This has been tough previously in JavaScript because web browsers have no “require” function; there’s no native way for modules to define and load their dependencies, which resulted in the popularity of large, complicated libraries.3 jQuery is a good example; it’s tremendously popular and it includes hundreds of functions in its API, while most sites that use it really only need a few. Large, complicated programs are more difficult to test, debug, and reason about, which is why UNIX avoided them.

Many Node modules also support streams that allow you to pipe data through a series of programs. This is analogous to how BASH and other shells let you pipe text from one command to another, with each command taking the output of the last as its input. To visualize this, see Stream Playground written by John Resig, creator of jQuery. Streams allow you to plug in different functionality when needed. This pseudocode shows how one might read a CSV from a server’s file system (the core “fs” library stands for “file system”), filter out certain rows, and send it over HTTP:

fs.createReadStream('spreadsheet.csv').pipe(filter).pipe(http);
// Want to compress the response? Just add another pipe.
fs.createReadStream('spreadsheet.csv').pipe(filter).pipe(compressor).pipe(http);

Streams have the advantage of limiting how much memory a program uses because only small portions of data are being operated on at once. Think of the difference between copying a million-line spreadsheet all at once or line-by-line; the second is less likely to crash or run into the limit of how much data the system clipboard can hold.
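To make the “filter” step of that pseudocode less abstract, here is one way it might be written with the Transform class from Node’s core “stream” library. This is only a sketch: createRowFilter and the sample rows are invented for this example, and it assumes plain ASCII text with rows separated by newlines.

```javascript
var stream = require('stream');

// Hypothetical filter: a Transform stream that passes along only the
// rows for which keepRow(row) returns true.
function createRowFilter (keepRow) {
  var leftover = ''; // partial row carried over between chunks
  return new stream.Transform({
    transform: function (chunk, encoding, done) {
      var rows = (leftover + chunk).split('\n');
      leftover = rows.pop(); // the last piece may be an incomplete row
      var kept = rows.filter(keepRow);
      if (kept.length) this.push(kept.join('\n') + '\n');
      done();
    },
    flush: function (done) {
      // Emit a final row that had no trailing newline, if it qualifies.
      if (leftover && keepRow(leftover)) this.push(leftover + '\n');
      done();
    }
  });
}

// Demo: feed rows in by hand and print what survives. Note that the
// second row arrives split across two chunks.
var filter = createRowFilter(function (row) {
  return row.indexOf('Fiction') !== -1;
});
filter.pipe(process.stdout);
filter.write('title,genre\nDune,Fiction\nCos');
filter.end('mos,Science\nEmma,Fiction\n');
// prints:
// Dune,Fiction
// Emma,Fiction
```

Inside a request handler, the same filter could sit between fs.createReadStream('spreadsheet.csv') and the response object, since the response is itself a writable stream. At no point does the whole spreadsheet need to be held in memory.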

Libraryland Examples

Node is still very new and there aren’t a lot of prominent examples of library usage. I’ll try to present a few, but I think it’s more worth knowing about as a major trend in web development.

Most amusingly, Ed Summers of the Library of Congress and Sean Hannan of Johns Hopkins University made a Cataloging Highscores page that presents original cataloging performed in WorldCat in a retro arcade-style display. This app uses the popular socket.io module that establishes a real-time connection between your browser and the server, a strength of Node. Any web service that needs to be continually updated is a prime candidate for Node.js: current news articles, social media streams, auto-complete suggestions as a user types in search terms, and chat reference all come to mind. In fact, SpringShare’s LibChat uses socket.io as well, though I can’t tell if it’s using Node on the server or PHP. A similar example of real-time updating, also by Ed Summers, is Wikistream which streams the dizzying number of edits happening on various Wikipedias through your browser.4

There was a lightning talk on Node at Code4Lib 2010 which mentions writing a connector to the popular Apache Solr search platform. Aaron Coburn’s proposed talk for Code4Lib 2014 mentions that Amherst is using Node to build the web front-end to their Fedora-based digital library.

Tools You Can Use

With the explosive growth of NPM, there are already tons of useful tools written in Node. While many of these are tools for writing web servers, like Express, some are command line programs you can use to accomplish a variety of tasks.

Yeoman is a scaffolding application that makes it easy to produce various web apps by giving you expert templates. You can install separate generators that produce templates for things like a Twitter Bootstrap site, a JavaScript bookmarklet, a mobile site, or a project using the Angular JavaScript MVC framework. Running yo angular to invoke the Angular generator gives you a lot more than just a base HTML file and some JavaScript libraries; it also provides a series of Grunt tasks for testing, running a development server, and building a site optimized for production. Grunt is another incredibly useful Node project, dubbed “the JavaScript task runner.” It lets you pick from hundreds of community plugins to automate tedious tasks like minifying and concatenating your scripts before deploying a website.

Finally, another tool that I like is phantomas which is a Node project that works with PhantomJS to run a suite of performance tests on a site. It provides more detailed reports than any other performance tool I’ve used, telling you things like how many DOM queries ran and median latency of HTTP requests.

Learn More

Nodeschool.io features a growing number of lessons on using Node. Better yet, the lessons are actually written in Node, so you install them with NPM and verify your results on the command line. There are several topics, from basics to using streams to working with databases.

Nettuts+, always a good place for coding tutorials, has an introduction to Node which takes you from installation to coding a real-time server. If you want to learn about writing a real-time chat application with socket.io, they have a tutorial for that, too.

If you want a broad and thorough overview, there are a few introductory books on Node, with The Node Beginner Book offering several free chapters. O’Reilly’s Node for Front-End Developers is also a good starting point.

How to Node is a popular blog with articles on various topics, though some are too in-depth for beginners. I’d head here if you want to learn more on a specific topic, such as streams, or working with particular databases like MongoDB.

Finally, the Node API docs are a good place to go when you get stuck using a particular core module.

Notes

  1. If you use a package manager, such as Homebrew on Mac OS X or APT on Linux, Node is likely available within it. One caveat I have noticed is that the stock Debian/Ubuntu apt-get install nodejs is a few major versions behind; you may want to add Chris Lea’s PPA to get a current version. If you’re subject to the whims of your IT department, you may need to convince them to install Node for you, or talk to your sysadmin to get it on your server. Since it’s a rather new technology, don’t be surprised if you have to explain what it is and why you want to try it out.
  2. Previous projects, including Rhino from Mozilla and Narwhal, have let people use JavaScript outside the browser. Node, however, has caught on far more than either of these projects, for some of the reasons outlined in this post.
  3. RequireJS is one project that’s trying to address this need. The ECMAScript standard that defines JavaScript is also working on native modules but they’re in draft form and it’ll be a long time before all browsers support them.
  4. If you’re curious, the code for both Cataloging Highscores and Wikistream is open source and available on GitHub.

One Comment on “What is Node.js & why do I care?”

  1. There’s that question that comes up every so often in the bibliosphere about whether librarians should learn to code and, if so, what. Node should be on that short list. If you aspire to develop libstuff for the #libweb, Ruby and PHP are interchangeable – but Node is going to be everywhere, because even if you’re not using it to supplant another server-side language or you’re purely working on the front-end, you’ll surely run into Grunt.

    Oh, and people are already using node to program robots – nodebots! That’s pretty cool.