Node.js – How to Write a For Loop With Callbacks

Let’s say you have 10 files that you need to upload to your web server. 10 very large files. You need to write an upload script because it needs to be an automated process that happens every day.

You’ve decided you’re going to use Node.js for the script because, hey, it’s cool.

Let’s also say you have a magical upload function that can do the upload:

upload('myfile.ext', function(err){
  if( err ) {
    console.log("yeah, that upload didn't work: "+err)
  }
})

This upload function uses the callback pattern you’d expect from Node. You give it the name of the file you want to upload and then it goes off and does its thing. After a while, when it is finished, it calls you back by calling your callback function. It passes in one argument, an err object. The Node convention is that the first parameter to a callback is an object that describes any errors that happened. If this object is null, then everything was OK. If not, then the object contains a description of the error. This could be a string, or a more complex object.

I’ll write a post on the innards of that upload function – coming soon!

Right, now that you have your magical upload function, let’s get back to writing a for loop.

Are you a refugee from Javaland? Here’s the way you were thinking of doing it:

var filenames = [...]

try {
  for( var i = 0; i < filenames.length; i++ ) {
    upload( filenames[i], function(err) {
      if( err ) throw err
    })
  }
}
catch( err ) {
  console.log('error: '+err)
}

Here's what you think will happen:
1. upload each file in turn, one after the other
2. if there's an error, halt the entire process, and throw it to the calling code

Here's what you just did:
1. Started shoving all 10 files at your web server all at once
2. If there is an error, good luck catching it outside that for loop – it's gone to the great Event Loop in the sky

Node is asynchronous. The upload function will return before it even starts the upload. It will return to your for loop. And your for loop will move on to the next file. And the next one.
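You can watch this happen with a stand-in for the upload function. This is only a sketch: fakeUpload, the 10ms delay and the file names are all made up for illustration, and the real upload function would do actual network work.

```javascript
// fakeUpload is a hypothetical stand-in for the real upload function.
var started  = []
var finished = []

function fakeUpload(filename, callback) {
  started.push(filename)
  // Pretend the transfer takes 10ms, then report success.
  setTimeout(function() {
    finished.push(filename)
    callback(null)
  }, 10)
}

var filenames = ['a.dat', 'b.dat', 'c.dat']

for( var i = 0; i < filenames.length; i++ ) {
  fakeUpload( filenames[i], function(err) {
    if( err ) console.log('error: '+err)
  })
}

// The loop is done, but not a single upload has finished yet.
console.log('started: '+started.length+', finished: '+finished.length)
```

Every file has been started by the time the loop ends, and none has finished. The callbacks only run later, long after the try block has been left behind.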

Is your website a little unresponsive? How about your net connection? Things might be a little slow when you push all those files up at the same time.

So you can't use for loops any more! What's a coder to do? Bite the bullet and recurse. It's the only way to get back to what you actually want to do.

You have to wait for the callback. When it is called, only then do you move on to the next file. That means you need to call another function inside your callback. And this function needs to start uploading the next file. So you need to create a recursive function that does this.

It turns out there's a nice little recursive pattern that you can use for this particular case:

var filenames = [...]

function uploader(i) {
  if( i < filenames.length ) {
    upload( filenames[i], function(err) {
      if( err ) {
        console.log('error: '+err)
      }
      else {
        uploader(i+1)
      }
    })
  }
}
uploader(0)

Do you see the pattern?

function repeater(i) {
  if( i < length ) {
     asyncwork( function(){
       repeater( i + 1 )
     })
  }
}
repeater(0)

You can translate this back into a traditional for(var i = 0; i < length; i++) loop quite easily:

repeater(0) is var i = 0,
if( i < length ) is i < length, and
repeater( i + 1 ) is i++

When it comes to Node, the traditional way of doing things can mean you lose control of your code. Use recursion to get control back.
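If you find yourself writing this pattern a lot, it can be wrapped up once. Here is a rough sketch, assuming a helper called eachInSeries (a made-up name, not part of Node) that also gives you back point 2 from the wish list: on the first error it halts and hands the error to the calling code.

```javascript
// eachInSeries is a hypothetical helper name, not part of Node.
// worker(item, callback) does the async work for one item.
// done(err) is called once: after the last item, or on the first error.
function eachInSeries(items, worker, done) {
  function repeater(i) {
    if( i < items.length ) {
      worker( items[i], function(err) {
        if( err ) {
          done(err)        // halt: do not move on to the next item
        }
        else {
          repeater(i+1)
        }
      })
    }
    else {
      done(null)           // every item finished without an error
    }
  }
  repeater(0)
}

// Usage with a made-up worker that takes 10ms per file.
var order = []
function slowWorker(filename, callback) {
  setTimeout(function() {
    order.push(filename)
    callback(null)
  }, 10)
}

eachInSeries(['a.dat', 'b.dat', 'c.dat'], slowWorker, function(err) {
  if( err ) {
    console.log('error: '+err)
  }
  else {
    console.log('uploaded in order: '+order.join(', '))
  }
})
```

One thing to watch: if a worker ever calls back synchronously, each step adds a stack frame, so a very long list would want a setTimeout or process.nextTick between steps.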

Posted in Node.js

Node.js – Dealing with submitted HTTP request data when you have to make a database call first

Node’s asynchronous events are fantastic, but they can have a sting in the tail. Here’s a solution to something that you’ll probably run into at some point.

If you have an HTTP endpoint that accepts JSON, XML, or even a streaming upload, you normally read the data in using the data and end events on the request object:

var bodyarr = []
request.on('data', function(chunk){
  bodyarr.push(chunk);
})
request.on('end', function(){
  console.log( bodyarr.join('') )
})

This works in most situations. But when you start building out your app, adding in production features like user authentication, you run into trouble.

Let’s say you’re using connect, and you write a little middleware function to do user authentication. Don’t worry if you are not familiar with connect – it’s not essential to this example. Your authentication middleware function gets called before your data handler, to make sure that the user is allowed to make the request and send you data. If the user is logged in, all is well, and your data handler gets called. If the user is not logged in, you send back a 401 Unauthorized.

Here’s the catch: your authentication function needs to talk to the database to get the user’s details. Or load them from memcache. Or from some other external system. (Don’t tell me you’re still using sessions in this day and age!)

So here’s what happens. Node will happily start accepting inbound data on the HTTP request before you’ve had a chance to bind your handler functions to the data and end events. Your event set-up code only gets called after the authentication middleware has finished its thing. This is just the way that Node’s asynchronous event loop works. In this scenario, by the time Node gets to your data handler, the data is long gone, and you’ll stall waiting for events that never come. If your response handler depends on that end event, it will never get called, and Node will never send an HTTP response. Bad.
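The lost-event effect is easy to reproduce in miniature. This sketch uses a bare EventEmitter as a stand-in for the request object (an assumption for illustration; a real request does a lot more), and emits the chunks by hand:

```javascript
var EventEmitter = require('events').EventEmitter

// A bare EventEmitter stands in for the HTTP request object here.
var request  = new EventEmitter()
var received = []

// The first chunk arrives while authentication is still waiting on the database.
request.emit('data', 'chunk 1')

// Only now does the handler set-up code get its turn.
request.on('data', function(chunk) {
  received.push(chunk)
})

request.emit('data', 'chunk 2')

// Only 'chunk 2' made it; 'chunk 1' was emitted to nobody.
console.log(received)
```

An emitter does not hold on to events for listeners that turn up late, so anything emitted before you attach is simply gone.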

Here’s the rule of thumb: you need to attach your handlers to the HTTP request events before you make any asynchronous calls. Then you cache the data until you’re ready to deal with it.

Luckily for you, I’ve written a little StreamBuffer object to do the dirty work. Here’s how you use it. In that authentication function, or maybe before it, attach the request events:

new StreamBuffer(request)

This adds a special streambuffer property to the request object. Once you reach your handler set up code, just attach your handlers like this:

request.streambuffer.ondata(function(chunk) {
  // your funky stuff with data
})
request.streambuffer.onend(function() {
  // all done!
})

In the meantime, you can make as many asynchronous calls as you like, and your data will be waiting for you when you get to it.

Here’s the code for the StreamBuffer itself. (It’s also available as a Node.js StreamBuffer GitHub gist.)

function StreamBuffer(req) {
  var self = this

  var buffer = []
  var ended  = false
  var ondata = null
  var onend  = null

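  // Replay any chunks that arrived early, then take over live delivery.
  self.ondata = function(f) {
    for(var i = 0; i < buffer.length; i++ ) {
      f(buffer[i])
    }
    buffer = []   // the replayed chunks are no longer needed
    ondata = f
  }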

  self.onend = function(f) {
    onend = f
    if( ended ) {
      onend()
    }
  }

  req.on('data', function(chunk) {
    if( ondata ) {
      ondata(chunk)
    }
    else {
      buffer.push(chunk)
    }
  })

  req.on('end', function() {
    ended = true
    if( onend ) {
      onend()
    }
  })

  req.streambuffer = self
}

This originally came up when I was trying to solve the problem discussed in this question in the Node mailing list.

Posted in Node.js

Debug PhoneGap Mobile Apps Five Times Faster

PhoneGap is a fantastic open source project. It lets you build native mobile apps for iPhone, Android and others using only HTML, CSS and JavaScript. It’s a real pleasure to work with. It makes developing mobile apps a lot faster.

Even so, you might find that your debug cycle is still too slow. After all, you have to deploy your app to your phone for proper testing, and this can chew up precious time. The faster you can wash, rinse and repeat, the faster you can debug, and the faster you can deliver.

One way to speed things up is to use Safari on your desktop. There’s an even faster technique, but we’ll get to that in a minute. Using a WebKit-based desktop browser like Safari means that your development cycle is almost as fast as building a static website. Edit, Save, Reload. Just point Safari at the www/index.html file in your PhoneGap project and away you go.

Well, almost.

Desktop browsers don’t offer exactly the same API, nor do they work in exactly the same way. Some mobile functions, like beeping or vibrating the phone, are not really testable. The biggest issue though is that desktop browsers are too fast. Don’t forget that your runtime target is a mobile version of WebKit, such as Mobile Safari. Another issue is that touch gestures are tricky to handle, and have to be simulated with click events. It is worth it though for the fast development turnaround for certain kinds of functionality.

The obvious next step is to compile up your app in Xcode and deploy to the simulator. Again, this works pretty well, but even the simulator has differences from the actual device, and again, it is just too fast. So what else can you do?

Why not install your native app as a web app? Sounds weird, I know. The whole point of using PhoneGap is so that your apps can be native! But, if you install your app as a web app, guess what? No more installs! You just reload the app directly on your device every time you make a change.

Setting this up requires a little configuration. You need to run a web server to serve up the files in the www folder of the PhoneGap project. nginx is a good choice – here’s a simple configuration snippet:
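A minimal sketch of such a configuration, assuming the PhoneGap project lives at /path/to/myapp (a placeholder path) and should appear under /myapp/ on port 80. Treat it as a starting point rather than the exact snippet from the original post; the server block goes inside the http section of your nginx configuration.

```nginx
server {
  listen 80;

  # Serve the PhoneGap www folder at http://your-ip/myapp/
  # (/path/to/myapp is a placeholder; point it at your own project)
  location /myapp/ {
    alias /path/to/myapp/www/;
    index index.html;
  }
}
```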

You can then point your browser at http://your-ip/myapp/index.html and there’s your app! Do this using Mobile Safari on your device, hit the + button and select “Add to Home Screen” to install as a web app, and away you go.

The big advantage to this approach is that you can test your app pretty much as it will appear and behave. You can even access the Mobile Safari debug log. Just remember to use the special meta tags (such as apple-mobile-web-app-capable) to get rid of the browser chrome.

One further advantage is that the API environment will now be slightly closer to the full PhoneGap mobile API. Of course, you won’t be able to do things that can only be done using PhoneGap, but this gets you quite far along the road.

One final trick. Do the same thing in the desktop iPhone simulator and speed up your testing there as well!

Posted in Uncategorized

Do Something Practical With CSV Files!

Want to be able to export and import tables from your database using a web interface? You've come to the right place!

I've just finished a new tutorial for our CSV Manager product: Uploading and Downloading CSV Files from a Website Database. It's one of those classic CSV use cases — a simple solution to a tricky problem.

Basically, you can outsource complex data editing tasks to Excel. This means you don't have to write such a complex back office application for your customers. And everybody's happy!

Posted in Java

Spark Lines Without the Spark

Sparklines are one of those great ideas that you just know is “right” the moment you see it. Edward Tufte invented them, and let me tell you, he knows his stuff.

Here's an example: [sparkline image]. Want to make some yourself? Check out Joe Gregorio's Sparkline creator.

So what's this rant about? Well, given that sparklines are such a great little idea, such a compact, non-intrusive way to present information, you'd imagine it would be hard to get them wrong. And that's exactly what Der Spiegel has managed to do.

Take a look at this article about the current market meltdown. Look at all those lovely sparklines! Each one right beside the market index referred to. Lovely.

Oh wait. They're all the bloody same! Huh? Why go to the bother of inserting a little graphic beside each market index, in the text, and then not make it a sparkline? Imagine how much more readable and understandable the text would be if these little graphics were real sparklines! Way to go. What a waste. If I were the online editor of Der Spiegel I would really jump on this and sort it out. What a difference it would make.

Posted in Rant

Some Volatile Patterns

I've always regarded Java's volatile variables as voodoo variables. In fact, I've been scared off by very many articles telling you how terribly dangerous they are. In cases like these I tend to retreat to the safety of a few good patterns.

Except, I could never find any good patterns for using volatile. Luckily, Brian Goetz has just written an article solving this problem! Go check out Managing volatility.

The patterns are:

And hey, it's Brian “Mr. Concurrency” Goetz, so this stuff has to be good!

Posted in Java

How to Beat Nasty Interview Programming Tasks

Shane Bell does a write-up of an interview he went through. Apparently the company just dumped a programming exercise on him and left him with a pencil and paper for an hour. Nasty!

While the basic idea of a “real” programming test at interview is great, asking someone to do it with a pencil is just plain daft! This is a perfect example of cargo-culting. They know they should get people to program in an interview, they know they should ask a “tough” question. But then they invalidate the whole thing by testing “pencil-based-programming-acuity”! Whatcha building guys? A Babbage engine? Um, you know, how difficult is it, if you are going to the trouble of all this testing, to set up a locked down machine with no internet access?

Anyway, Shane runs through the exercise and his solution. He does pretty well. He also asks if there's a better solution.

Yes, Virginia, there is a Santa Claus!

And he lives at MIT OpenCourseWare. Specifically, the AI search lectures. Fantastic stuff.

Looking at the problem they gave Shane, finding a path through a maze from top-right to bottom-left, it looks like you could throw an A* search at it and do pretty well. Add some iterative-deepening if you're feeling fancy and want to handle big mazes. Basically, you try to predict the best direction by calculating your current straight-line distance from the goal square, and choosing the next square as the one that gets you closest. If you get stuck in a cul-de-sac, backtrack out of it (Shane does use backtracking).
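To make that concrete, here is a rough A* sketch. Everything in it is an assumption for illustration, not the interview's actual set-up: the maze is an array of strings with '#' for walls, moves are up, down, left and right, and findPath is a made-up name. Note that it ranks squares by the steps taken so far plus the straight-line estimate, which is what makes it A* rather than a purely greedy search.

```javascript
// findPath is a hypothetical helper: A* over a grid of strings,
// where '#' is a wall and anything else is open floor.
// Squares are [row, column] pairs.
function findPath(maze, start, goal) {
  function key(p) { return p[0]+','+p[1] }

  // Straight-line distance to the goal: the "prediction" part of A*.
  function estimate(p) {
    var dr = p[0]-goal[0], dc = p[1]-goal[1]
    return Math.sqrt(dr*dr + dc*dc)
  }

  var open     = [start]   // squares still to be explored
  var cost     = {}        // fewest known steps to reach each square
  var cameFrom = {}        // previous square on that shortest route
  cost[key(start)] = 0

  while( open.length > 0 ) {
    // Pick the open square with the lowest (steps so far + estimate).
    var best = 0
    for( var i = 1; i < open.length; i++ ) {
      if( cost[key(open[i])] + estimate(open[i]) <
          cost[key(open[best])] + estimate(open[best]) ) {
        best = i
      }
    }
    var current = open.splice(best, 1)[0]

    if( key(current) === key(goal) ) {
      // Walk the cameFrom links back to the start to rebuild the path.
      var path = [current]
      while( cameFrom[key(path[0])] ) {
        path.unshift( cameFrom[key(path[0])] )
      }
      return path
    }

    var moves = [[1,0],[-1,0],[0,1],[0,-1]]
    for( var m = 0; m < moves.length; m++ ) {
      var next = [current[0]+moves[m][0], current[1]+moves[m][1]]
      var row  = maze[next[0]]
      // Skip squares that are off the grid or inside a wall.
      if( !row || next[1] < 0 || next[1] >= row.length || row[next[1]] === '#' ) continue

      var steps = cost[key(current)] + 1
      if( cost[key(next)] === undefined || steps < cost[key(next)] ) {
        cost[key(next)] = steps
        cameFrom[key(next)] = current
        open.push(next)
      }
    }
  }
  return null   // no route: the goal is walled off
}

var maze = [
  '....',
  '.##.',
  '....'
]
var path = findPath(maze, [0,3], [2,0])   // top-right to bottom-left
console.log('steps: '+(path.length-1))
```

The open list here is a plain array that gets scanned for the best square each time round. That is fine for an interview-sized maze; for big ones you would swap in a priority queue.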

So how do you beat these nasty interviews? Know your search algorithms! Most of these “puzzles” can be solved with some sort of search. I'll bet you anything the guys who set this question were either a.) clueless, so a good algorithm will really impress them, or b.) not clueless and actually looking for a proper algorithm like A*. Either way you win!

Posted in General

Level 3!

Well you might have thought that I had given up on the touch typing. I've been trying to learn to touch-type for the last two years. It's all going tragically slowly. But, I can tell you that I am in fact touch-typing this very blog post — my first touch-typed blog post ever!

I'm not really there yet; my touch-typing is still slower than my “natural” typing. But I have, finally, cracked the notorious level three on the learn2type.com site. If you've read my previous post about learn2type, you'll remember that level three is this dreadfully unbalanced drill that goes straight from the home row keys into all the punctuation. It's a real killer for your enthusiasm. You have to be pretty dedicated to beat it. It took me over a year. While the learn2type site is pretty OK so far as learning to type goes, it does have some serious flaws. And I have not seen any updates in over a year. Still, I would recommend it overall; the performance graphs in particular are very cool, providing good feedback on your progress.

My meta-strategy for learning to touch-type remains the same as before — try all the online tutoring sites one at a time until I have good speed and accuracy. Stay tuned…

Posted in General

Boxes and Lines, Boxes and Lines…

Charles Miller posted a great comment on his blog that absolutely cracks me up:

Pretty much any computing problem, given a sufficient level of abstraction, can be reduced to a diagram of boxes joined together with lines. At this level your solution will look startlingly simple, and you'll be able to sell it to someone.

So true, so true.

Posted in General

How to Create a Comment Archive Using CSV to Generate HTML

I've been trying to find a workable way to manage my comments for quite some time. By which I mean, the comments that I make on other people's blogs. You need to be able to go back and see if the conversation has progressed. It's also nice just to have a record of what you said and when you said it.

I was using CoComment for a while. This is a service that tracks comments on blogs. It's pretty cool. Trouble is, it only works for the main blogging engines, and you have to install a plugin. I removed all plugins from my Firefox recently because it was acting up, and I'm not keen on reinstalling just at the moment. In any case, the CoComment plugin tended to slow down non-blog sites (looking for comment forms I suppose).

So I've decided on a simpler solution: just have a page on my blog where all my comments are listed in reverse chronological order, with a link back to the relevant blog entry. I can skim through the first few to see if recent conversations have anything new. As for the old conversations, well, I guess I won't know if there are more comments. But that's “good enough” for the time being. The easiest way to build this page is cut-and-paste. Come up with a bit of HTML and copy it for each new entry. Yeah, it has to be done by hand, but hey! The archive of comments is interesting enough to be worth recording.

Here's the comments archive, so you can see what I mean.

Well, you're right, cut-and-paste is such a bad smell. It's better to have your data in a manageable format. So Ricebridge to the rescue! You can put the data into a CSV file and generate the HTML (or rather XHTML) from it. For example, here's a record of some comments:

Date,Blog,Link,Comment
2007-04-10,mariosalexandrou.com, \
  http://www.mariosalexandrou.com/blog/?p=291&c=y, \
  "Hey! I did all that already! Where's my six figures? love it :)  "
2007-04-04,Tyner Blain, \
  http://tynerblain.com/blog/2007/04/03/ba-profit-center, \
  "It’s amazing how naming something almost completely defines it."
...

It's just a CSV file. Easy to update by hand. Whenever you make a new comment, throw in the details (date, blog title, link and comment text) at the top of the CSV file.

So then how do we turn this into HTML? Well, here's the HTML I'm producing from this CSV file:

<div class="commentbox">
  <div class="comment">
    <b>
      <span>2007-04-10</span>
      <a href="http://www.mariosalexandrou.com/blog/?p=291&amp;c=y">mariosalexandrou.com</a>
    </b>
    <p>Hey! I did all that already! Where's my six figures? love it :)  </p>
  </div>
  <div class="comment">
    <b>
      <span>2007-04-04</span>
      <a href="http://tynerblain.com/blog/2007/04/03/ba-profit-center">Tyner Blain</a>
    </b>
    <p>It's amazing how naming something almost completely defines it.</p>
  </div>
</div>

It's a nice little microformat of sorts, I suppose.

To produce this, you need to take the CSV columns and place them into the right positions in the XML format. We're generating XHTML, which is just XML, which is just well-behaved HTML, so this is all cool and froody.

Using XML Manager, you can define a set of XPath expressions to handle this. And here they are:

each row     -> /div/div
'commentbox' -> /div/@class
'comment'    -> @class
Date         -> b/span
Blog         -> b/a
Link         -> b/a/@href
Comment      -> p

This creates a main <div class="commentbox"> containing a set of <div class="comment"> elements, one for each comment. The CSV columns all go into subelements of the comment div.

And here's the code to tie it all together:

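// Standard library imports; the Ricebridge classes need their own imports too.
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Load the CSV, starting at line 2 so the header row is skipped.
CsvManager csvman = new CsvManager();
csvman.getCsvSpec().setStartLine(2);
csvman.getCsvSpec().setIgnoreEmptyLines(true);
List in = csvman.load("data/comment.csv");

// One XPath for the record element, then one per output field.
RecordSpec rs = new RecordSpec("/html/body/div/div",
    new String[] { "/html/body/div/@class", "@class",
      "b/span","b/a","b/a/@href","p"});

// Prefix each CSV row with the two fixed class values.
List out = new ArrayList();
for( Iterator cI = in.iterator(); cI.hasNext(); ) {
  String[] inrow = (String[]) cI.next();
  String[] outrow = new String[] {"commentbox","comment",
    inrow[0],inrow[1],inrow[2],inrow[3]};
  out.add(outrow);
}

// Write the rows out as XHTML.
XmlManager xmlman = new XmlManager(rs);
xmlman.save("data/comment.htm",out);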

You just load up the CSV, and spit it out again as XML… “there's nothing to it, really…”

And then all you do is dynamically include this file on your web page, and you're done!
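If you just want to see the column-to-element mapping in action without the Ricebridge libraries, here is a rough hand-rolled sketch in JavaScript. It is not XML Manager and not how the archive page is actually built: escapeXml and commentsToXhtml are made-up names, and the rows are assumed to be already parsed out of the CSV into arrays of date, blog, link and comment.

```javascript
// Escape the characters that would break XHTML.
function escapeXml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
}

// rows: arrays of [date, blog, link, comment], already parsed from the CSV.
function commentsToXhtml(rows) {
  var out = ['<div class="commentbox">']
  for( var i = 0; i < rows.length; i++ ) {
    var row = rows[i]
    out.push(
      '  <div class="comment">',
      '    <b>',
      '      <span>'+escapeXml(row[0])+'</span>',
      '      <a href="'+escapeXml(row[2])+'">'+escapeXml(row[1])+'</a>',
      '    </b>',
      '    <p>'+escapeXml(row[3])+'</p>',
      '  </div>'
    )
  }
  out.push('</div>')
  return out.join('\n')
}

var rows = [
  ['2007-04-04', 'Tyner Blain',
   'http://tynerblain.com/blog/2007/04/03/ba-profit-center',
   "It's amazing how naming something almost completely defines it."]
]
console.log( commentsToXhtml(rows) )
```

The escaping step matters for links with query strings: a raw & in an href is fine for sloppy HTML but not for well-behaved XHTML.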

Posted in Java