Thursday, August 22, 2013

long live the callbacks

Update: Either before or after you read this, also read my follow-up.

Been thinking about what to write that won't be made useless within the next month due to upcoming API changes. So today, instead of giving you something useful, I'm going to contribute to the so-called "callback hell" flame war.

Honestly, I've never understood why people hate callbacks so much. Oh wait, I know. It's because they're Doing it Wrong. Then once they're 6+ indentations deep the realization comes that the code is hard to read. So they blame the callback! Poor little callback. People spit on your very existence because they don't get to know you, and instead marry themselves to the idea of chainability. I can understand that. Did web development with jQuery for years. I prided myself on bending those chains to my will. Then one day I asked, what was I gaining?

So let's get into the first issue I have: declaring functions within closures. Don't do it. Don't even think about it (ok, there are a few cases where it's necessary, but apply sparingly). It makes code difficult to read, and if you give even the tiniest pigeon's poop about performance you'll heed this advice. Just to be sure you understand, let's show an example:

function Points(x0, y0, x1, y1) {
  // A new distance() function is created for every instance.
  this.distance = function distance() {
    var x = x1 - x0;
    var y = y1 - y0;
    return Math.sqrt(x * x + y * y);
  };
}

var iter = 1e6;
var rand = Math.random;
for (var i = 0; i < iter; i++) {
  var p = new Points(rand(), rand(), rand(), rand());
  for (var j = 0; j < 1e3; j++)
    p.distance();
}
This doesn't look so bad, right? WRONG! You've saved yourself needing to assign some variables, but at what cost? Well...

$ /usr/bin/time node points.js 
33.47user 0.09system 0:33.65elapsed 99%CPU (0avgtext+0avgdata 49584maxresident)k
0inputs+0outputs (0major+34036minor)pagefaults 0swaps

Ok, so it took 33 seconds and used right around 50MB memory. Let's make a slight adjustment to the implementation:

function Points(x0, y0, x1, y1) {
  this._points = {
    x0: x0,
    y0: y0,
    x1: x1,
    y1: y1
  };
}

// Declared once and shared by every instance.
Points.prototype.distance = function distance() {
  var p = this._points;
  var x = p.x1 - p.x0;
  var y = p.y1 - p.y0;
  return Math.sqrt(x * x + y * y);
};

And how did it do?

$ /usr/bin/time node points.js 
1.21user 0.01system 0:01.23elapsed 99%CPU (0avgtext+0avgdata 14224maxresident)k
0inputs+0outputs (0major+3902minor)pagefaults 0swaps

Um. Well... Ok. Actually had to double check my code because I wasn't expecting the difference to be this dramatic. Now it runs in under 2 seconds and only uses about 14MB of memory. We're not going to get into an in-depth look at why this is happening. That's for another day, but let me reiterate the point. DON'T DECLARE FUNCTIONS WITHIN CLOSURES! (in performance critical paths)
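One piece of the difference is easy to observe without a profiler: a function declared inside the constructor is created again for every instance, while a prototype method is created once and shared. Here's a minimal sketch. `ClosurePoints` and `ProtoPoints` are hypothetical names standing in for the two versions of `Points` above, and the sketch only demonstrates the allocation difference, not the full explanation for the timings:

```javascript
// Hypothetical names: ClosurePoints mirrors the first Points version,
// ProtoPoints mirrors the second.
function ClosurePoints(x0, y0, x1, y1) {
  // A new function object is allocated on every constructor call.
  this.distance = function distance() {
    var x = x1 - x0;
    var y = y1 - y0;
    return Math.sqrt(x * x + y * y);
  };
}

function ProtoPoints(x0, y0, x1, y1) {
  this._points = { x0: x0, y0: y0, x1: x1, y1: y1 };
}

// Declared once; every instance reaches it through the prototype chain.
ProtoPoints.prototype.distance = function distance() {
  var p = this._points;
  var x = p.x1 - p.x0;
  var y = p.y1 - p.y0;
  return Math.sqrt(x * x + y * y);
};

var a = new ClosurePoints(0, 0, 3, 4);
var b = new ClosurePoints(0, 0, 3, 4);
var c = new ProtoPoints(0, 0, 3, 4);
var d = new ProtoPoints(0, 0, 3, 4);

console.log(a.distance === b.distance); // false: one function per instance
console.log(c.distance === d.distance); // true: one shared function
console.log(a.distance(), c.distance()); // 5 5
```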

What does that have to do with callbacks? Simple, don't nest your callback functions. There have been a lot of articles written in the last couple months about the awesomeness of generators and how horrible callbacks are. Though if you take a look at the benchmarks you'll notice the callbacks are usually fairly nested.

Let's take a look at a very contrived example just to get the point across:

var SB = require('buffer').SlowBuffer;

function runner(cb, arg) {
  process.nextTick(function() {
    cb(arg);
  });
}

var iter = 2e4;

for (var i = 0; i < iter; i++) {
  // genPrimes is declared right where it's passed, once per iteration.
  runner(function genPrimes(max) {
    var primes = [];
    var len = ((max / 8) >>> 0) + 1;
    // Bit sieve: one bit per candidate, all set to "prime" to start.
    var sieve = new SB(len);
    sieve.fill(0xff, 0, len);
    var cntr, x, j;
    for (cntr = 0, x = 2; x <= max; x++) {
      if (sieve[(x / 8) >>> 0] & (1 << (x % 8))) {
        primes[cntr++] = x;
        // Clear the bit for every multiple of x.
        for (j = 2 * x; j <= max; j += x) {
          sieve[(j / 8) >>> 0] &= ~(1 << (j % 8));
        }
      }
    }
    return primes;
  }, i);
}

Side note: I challenge anyone to come up with a faster prime generator in JavaScript.

This style of passing the callback directly into the function is used all over the place, and while it looks innocent enough it can be the death of any hopeful performance.

$ /usr/bin/time node genprimes.js 
18.84user 0.02system 0:18.91elapsed 99%CPU (0avgtext+0avgdata 34132maxresident)k
0inputs+0outputs (0major+8896minor)pagefaults 0swaps

18 seconds. Not bad I guess. All we did was declare genPrimes in the location it's being passed, but let's make the minor adjustment of moving it just below runner() and let's see what we get:
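A sketch of what that adjustment could look like. The only structural change is that `genPrimes` is declared once at the top level and passed by name. Two things here are my assumptions rather than part of the original benchmark: a `Uint8Array` stands in for `SlowBuffer` so the sketch runs on current Node releases, and `iter` is cut down to keep the run short.

```javascript
function runner(cb, arg) {
  process.nextTick(function() {
    cb(arg);
  });
}

// Declared once, at the top level, instead of inside the loop.
function genPrimes(max) {
  var primes = [];
  var len = ((max / 8) >>> 0) + 1;
  // Assumption: Uint8Array replaces the original SlowBuffer.
  var sieve = new Uint8Array(len);
  sieve.fill(0xff);
  var cntr, x, j;
  for (cntr = 0, x = 2; x <= max; x++) {
    if (sieve[(x / 8) >>> 0] & (1 << (x % 8))) {
      primes[cntr++] = x;
      // Clear the bit for every multiple of x.
      for (j = 2 * x; j <= max; j += x) {
        sieve[(j / 8) >>> 0] &= ~(1 << (j % 8));
      }
    }
  }
  return primes;
}

// Assumption: far fewer iterations than the original 2e4.
var iter = 2e3;

for (var i = 0; i < iter; i++) {
  runner(genPrimes, i);
}
```

Because `genPrimes` no longer sits inside the loop body, the loop itself shrinks to a single call and the sieve logic reads at one indentation level less.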

$ /usr/bin/time node genprimes.js 
2.48user 0.01system 0:02.50elapsed 99%CPU (0avgtext+0avgdata 30352maxresident)k
0inputs+0outputs (0major+7958minor)pagefaults 0swaps

Awesome. Execution time down to 2.5 seconds, and all we had to do was flatten our code a little. So this solves two problems. First, we've gained a massive amount of performance. Second, we're not a dozen indentations deep with our callback structure.
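The same flattening works for a chain of asynchronous steps. A minimal sketch, assuming a made-up three-step chain (`load`, `parse`, `square` and `report` are hypothetical names, not from any library): each step is a named top-level function, and state travels as arguments instead of being captured from an enclosing scope.

```javascript
// Hypothetical chain: load -> parse -> square -> report.
// Every step is declared once at the top level.
function load(input, done) {
  // process.nextTick forwards the extra arguments to parse().
  process.nextTick(parse, input, done);
}

function parse(text, done) {
  var n = parseInt(text, 10);
  if (isNaN(n)) return done(new Error('not a number: ' + text));
  process.nextTick(square, n, done);
}

function square(n, done) {
  done(null, n * n);
}

function report(err, result) {
  if (err) return console.error(err.message);
  console.log(result);
}

load('12', report); // logs 144 on a later tick
```

No step is more than one level deep, and adding a fourth step means adding one more top-level function rather than another layer of nesting.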

As far as performance goes, I think the argument is empirically pretty simple. Any of these other overly complicated ways of doing asynchronous callback structures have to use at least this at the core of their execution model. On top of that there's the overhead of the library itself. So there's no possible way any other method of managing your callbacks could be faster, and as we've demonstrated that difference may not be trivial.

In conclusion, suck it up and use callbacks. They're easy to understand and maintain, and you won't be left wondering how much extra your fancy-schmancy way of doing things is costing you. Also, if anyone reading this article plans on doing additional performance analysis on basic callbacks vs whatever else, make sure it's done correctly. Because I will find them, and then I'll publicly mock them.

UPDATE: There seem to be two things people are bringing up:

First, do I understand what I'm measuring? Yes. The point of the benchmarks is to show the difference between what's common practice (creating functions within closures to access variables and declaring the function where it's being passed) and doing it the Right Way. It has everything to do with how the function is declared. That's the point. My assumption was that people would only think about declaring a function within a closure to access the variables within it. I'll have a follow-up post explaining why these two things affect performance so severely.

Second, is this post really about callbacks? Yes. I realize it may be hard to see, but the point was that callbacks Done Right are easy to understand, just as easy to read (i.e. they don't suffer from indentation hell) as other implementations, and they're always faster.

Considering I finished this at 4am, and have only gotten 3 hours of sleep before writing this update, I might find this entire post absurd tomorrow. But it's unlikely.

UPDATE2: I forgot that Buffer#fill() only returns the instance on master, so I updated the example to work appropriately on previous versions of Node.
