thanpolas // web development as it happens

The Promises/A+ specification is a fresh and very interesting way of dealing with the asynchronous nature of JavaScript. It also provides a sensible way to deal with error handling and exceptions. In this article we will go through the performance hits you should be aware of and, as a side effect, compare the two most popular Promises/A+ implementations, When and Q, against each other and against Async, the lowest abstraction you can get on asynchronicity.

If you are in a hurry, just skip to the Conclusions.

The Case

My motivation for looking deeper into the performance of Promises/A+ was a job queuing system I've been working on named Kickq. The system is expected to get hammered in production, so stress testing was warranted. After stubbing all the database interactions, essentially making the operation of job creation synchronous, I was getting odd performance results.

The test was simple: create 500 jobs in a loop and measure how long it takes for all the jobs to finish.

The measurements were in the ~550ms range and my eyeballs started to roll. "That's a synchronous operation, it should finish in less than 3ms, WHAT THE????!?!". After taking a few moments to let it sink in, the suspect was found: it was Promises. I used them as the only pattern to handle asynchronous ops and callbacks throughout the whole project. Brian Cavalier, one of the authors of When.js, helped me pinpoint the real culprit. It was the tick:

Promises/A+ Specification, Note 4.1: "In practical terms, an implementation must use a mechanism such as setTimeout, setImmediate, or process.nextTick to ensure that onFulfilled and onRejected are not invoked in the same turn of the event loop as the call to then to which they are passed."

In other words, Promises, per the specification, must be resolved asynchronously! That comes with a cost, and apparently a heavy one.
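To make that note concrete, here is a minimal sketch. It assumes the native Promise that ships with current Node.js as a stand-in for When or Q; it is not one of the benchmarked libraries. Even though the promise is already resolved, the callback passed to then only runs after the current synchronous code has finished:

```javascript
// Native Promise is used as a stand-in here; When and Q are not involved.
var order = [];

var resolved = Promise.resolve('value'); // already resolved at this point

order.push('before then');
var done = resolved.then(function () {
  // Runs only after the surrounding synchronous code has completed.
  order.push('then callback');
});
order.push('after then');

done.then(function () {
  console.log(order.join(', '));
  // prints "before then, after then, then callback"
});
```

That forced detour through the event loop is the "tick" whose cost the rest of this article measures.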

In the process of studying performance I had to create a performance library, poor man's profiling, and a benchmark test for Promises/A+ implementations that is already being used to optimize future versions of When.

Creating The Promises/A+ Benchmark

I tried to broaden the definition of the test case. If an application uses the Promises pattern as the only way to manage how the internal parts interact, we can make a few assumptions:

The number of chained promises was arbitrarily set to 7; it represents the mean call-stack depth of an average operation. Any app logic is stripped; we only measure how long it takes to resolve a chain of 7 promises:

// "Prom" is your promises implementation.
// Must support the .defer() method.
app.promise = function(Prom) {
  var def = Prom.defer();
  var def2 = Prom.defer();
  // ... def3, def4...
  var def7 = Prom.defer();

  var getDef3 = function() { return def3.promise; };
  // ... chain chain ...
  var getDef7 = function() { return def7.promise; };

  // chain them
  def2.promise
    .then(getDef3)
    // ... chain chain ...
    .then(getDef7)
    .then(def.resolve);

  // in the first set of benchmarks, promises are resolved synchronously...
  def2.resolve();
  // def3.resolve ....
  def7.resolve();

  // in the second set of benchmarks, one of the promises is resolved async like so:
  setTimeout(def2.resolve);

  return def.promise;
};
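The snippet above is abbreviated, so here is a runnable sketch of the same idea. It makes a few assumptions: the defer() helper is hypothetical and built on native Promise to stand in for Prom.defer(), the deferreds live in an array instead of def2 to def7, and the asyncFirst flag selects between the two resolution variants. It is not the code from the actual benchmark.

```javascript
// Hypothetical helper: a deferred built on native Promise,
// standing in for Prom.defer() from When or Q.
function defer() {
  var d = {};
  d.promise = new Promise(function (resolve, reject) {
    d.resolve = resolve;
    d.reject = reject;
  });
  return d;
}

// Chains `length` deferreds and resolves an outer one when all are done.
// `asyncFirst` selects the second variant, where one promise resolves async.
function chain(length, asyncFirst) {
  var out = defer();
  var defs = [];
  for (var i = 0; i < length; i++) {
    defs.push(defer());
  }

  // chain them: each step waits for the next deferred in line
  var link = defs[0].promise;
  defs.slice(1).forEach(function (d) {
    link = link.then(function () { return d.promise; });
  });
  link.then(out.resolve);

  // resolve all but the first deferred synchronously
  defs.slice(1).forEach(function (d) { d.resolve(); });
  if (asyncFirst) {
    setTimeout(defs[0].resolve); // resolved in a later turn
  } else {
    defs[0].resolve();
  }

  return out.promise;
}

// 6 chained deferreds plus the outer one, as in the snippet above.
var syncRun = chain(6, false);
var mixedRun = chain(6, true);
Promise.all([syncRun, mixedRun]).then(function () {
  console.log('both chains resolved');
});
```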

Find a snapshot of the actual app in this commit.

Update 11/Jun/2013

The post has been updated to include one more test and rename what used to be the Async test to Mixed. Based on an issue raised by Domenic, a few points were made that need to be addressed and implemented in the test. Thus the article update...

Promises are built to handle Asynchronous functions

Yes indeed. But libraries and ideas are rarely used only as their original creator intended. There are two major reasons why you may use promises synchronously.

  1. In real-world cases you have conditionals. In a Promise-returning function a condition may force the function to finish synchronously, thus returning a resolved Promise.
  2. The second case has to do with design choices: maintainable and scalable code. Time and again, one of the biggest pain points with large codebases, in regard to control flow and asynchronicity, is the fact that methods and functions evolve and get refactored.

Over time, what used to be a synchronous function becomes async and vice versa. Each change means a hell of a lot has to be refactored to match that evolution.

I found Promises to solve that design problem by blanketing everything with a returned promise and not caring about how the function will evolve over time in respect to sync / async. This practice also enables better code compositionality and better abstractions.

Therefore, it is totally expected for a number of Promises to resolve synchronously, for either of the above reasons.
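A small sketch of the first reason, where a conditional makes a Promise-returning function finish synchronously. The names getJob, cache and loadFromStore are hypothetical and chosen only for illustration, and native Promise is again assumed as a stand-in for When or Q:

```javascript
// Hypothetical example: `cache`, `loadFromStore` and `getJob` are illustrative names.
var cache = { 7: { id: 7, state: 'done' } };

function loadFromStore(id) {
  // Emulates an asynchronous lookup (e.g. a database read).
  return new Promise(function (resolve) {
    setTimeout(function () { resolve({ id: id, state: 'new' }); }, 0);
  });
}

function getJob(id) {
  if (cache[id]) {
    // Synchronous path: the value is already known,
    // so an already resolved promise is returned.
    return Promise.resolve(cache[id]);
  }
  // Asynchronous path.
  return loadFromStore(id);
}

// Callers are written once and do not care which path was taken.
getJob(7).then(function (job) { console.log(job.id, job.state); });
getJob(8).then(function (job) { console.log(job.id, job.state); });
```

If getJob later becomes fully asynchronous, or fully synchronous, none of its callers has to change, which is the refactoring benefit described above.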

The Perf Test Resolves Synchronous Functions

In effect Async resolves noop funcs.

Yes. A request was made for all the stub functions to be asynchronous. That request has been implemented and you can view the results below, named Async.

The Added nextTick in Promises/A+ should not be always added

Domenic painstakingly explained to me the subtle details of the Promises/A+ spec that requires asynchronous resolution. Consider this simple case:

var when = require('when');

function asyncOp(){
  var def = when.defer();

  // emulate an async operation (e.g. file read)
  setTimeout(function(){
    def.resolve();
    console.log('X');
  });

  return def.promise;
}

asyncOp().then(function(){
  console.log('Y');
});

console.log('Z');

You'd typically expect to see Z, X, Y printed. But the spec does not guarantee that order. It only requires that onFulfilled and onRejected are "not invoked in the same turn of the event loop as the call to then to which they are passed".

If the promise is resolved in a later turn than the call to then, the Promise implementation can invoke the callback synchronously, thus printing Z, Y, X. A very subtle detail, and one that, according to Domenic, not all implementations have really implemented.
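For comparison, here is the same case sketched with native Promise, which is an assumption on my part and not one of the benchmarked libraries. The defer() helper and the log() wrapper are hypothetical. Native promises always defer the callback, so this variant prints Z, X, Y; an implementation that uses the leeway described above would print Z, Y, X instead.

```javascript
// Records what gets printed, so the order can be inspected afterwards.
var printed = [];
function log(letter) {
  printed.push(letter);
  console.log(letter);
}

// Hypothetical helper: a deferred built on native Promise.
function defer() {
  var d = {};
  d.promise = new Promise(function (resolve, reject) {
    d.resolve = resolve;
    d.reject = reject;
  });
  return d;
}

function asyncOp() {
  var def = defer();

  // emulate an async operation (e.g. file read)
  setTimeout(function () {
    def.resolve();
    log('X');
  });

  return def.promise;
}

var finished = asyncOp().then(function () {
  log('Y');
});

log('Z');
// With native Promise the order is Z, X, Y.
```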

Time Benchmark

The app.promise() function gets invoked asynchronously in a loop of n times. We mark the time down to microseconds, using the node-microtime library. The time is marked in the following events:

  1. Before the loop starts.
  2. Right before app.promise() is invoked, (gets marked n times).
  3. Right after app.promise() is resolved, (gets marked n times).
  4. After all created promises resolve.

The total execution time and the difference between these marks are what is measured. See below a test run using just 5 loops:

No.   JS Timestamp     Diff       Message
0.    1366657964509117 [NaN ms]   Start
1.    1366657964509342 [0.225 ms] after for
2.    1366657964509607 [0.265 ms] Creating promise:0
3.    1366657964509704 [0.097 ms] Creating promise:1
4.    1366657964509752 [0.048 ms] Creating promise:2
5.    1366657964509766 [0.014 ms] Creating promise:3
6.    1366657964509792 [0.026 ms] Creating promise:4
7.    1366657964510264 [0.472 ms] Promise resolved:0
8.    1366657964510274 [0.010 ms] Promise resolved:1
9.    1366657964510279 [0.005 ms] Promise resolved:2
10.   1366657964510285 [0.006 ms] Promise resolved:3
11.   1366657964510288 [0.003 ms] Promise resolved:4
12.   1366657964510537 [0.249 ms] Finish

Notice the significant delay in Mark No 7, it takes a while from the last app.promise() invocation (No 6) till we hear back from the first fulfilled promise.

The difference at Mark No 7 is the main metric used to benchmark promises; it is the time to the First Resolved Promise.
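The marking scheme can be sketched as follows. This is not the profiling library used for the article: mark() and report() are hypothetical names, process.hrtime.bigint() is swapped in for node-microtime, and native Promise stands in for the library under test.

```javascript
// Hypothetical mini profiler; process.hrtime.bigint() replaces node-microtime.
var marks = [];

function mark(message) {
  marks.push({ time: process.hrtime.bigint(), message: message });
}

// Returns one row per mark with the delay since the previous mark, in ms.
function report() {
  return marks.map(function (m, i) {
    var diff = i === 0 ? null : Number(m.time - marks[i - 1].time) / 1e6;
    return { message: m.message, diffMs: diff };
  });
}

mark('Start');
var pending = [];
for (var i = 0; i < 5; i++) {
  mark('Creating promise:' + i);
  pending.push(Promise.resolve(i).then(function (n) {
    mark('Promise resolved:' + n);
  }));
}

var finished = Promise.all(pending).then(function () {
  mark('Finish');
  report().forEach(function (row) {
    var diff = row.diffMs === null ? '' : row.diffMs.toFixed(3) + ' ms';
    console.log(row.message, diff);
  });
});
```

The row for "Promise resolved:0" carries the delay between the last invocation and the first fulfilled promise, which is the Diff metric used in the results.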

Memory Benchmark

Memory consumption is a bit dodgy to measure during runtime. To take memory consumption snapshots the process.memoryUsage() method is invoked and the property heapUsed is logged.

process.memoryUsage();

// outputs:
{
  rss: 14671872,
  heapTotal: 6131200,
  heapUsed: 3370296
}

A heapUsed measurement is taken when the benchmark starts running. From there on every time all promises are resolved another measurement is taken. Comparing the difference between the two is what we benchmark and compare between the packages.

var firstSnapshot = process.memoryUsage().heapUsed;
// firstSnapshot == 3370296

/* ... app.promise() runs and finishes */

var finishSnapshot = process.memoryUsage().heapUsed;
// finishSnapshot == 6645932

var diff = finishSnapshot - firstSnapshot;
// diff == 3275636 or +97% from the first measurement

Now beware, we are not taking these measurements at face value. The only thing we will be observing is, given the same test, what the differences are between the various implementations.

The Results

The tests were run for 10, 100, 500 and 1,000 loops. Each set of loops was run 20 times to normalize the results and the means were taken from these 20 runs. The libraries used for measuring are:

As more libraries are added this article will get updated with how they performed.

There are three sets of tests done:

  1. All stub functions resolve synchronously.
  2. Half the stub functions resolve synchronously and the other half asynchronously.
  3. All stub functions resolve asynchronously.

Sets 1 and 3 can be considered edge cases. The test case closest to reality is 2, Mixed.

Difference to First Resolved Promise, 500 Loops

Perf Type                Async     When 2.1.0  Q 0.9.5    Promise 3.0.1
Sync Diff                0.01ms    36.62ms     186.43ms   63.96ms
Mixed Diff               5.37ms    41.78ms     226.34ms   83.83ms
Async Diff               22.42ms   58.18ms     241.80ms   93.68ms
Sync Diff vs AsyncLib    1x        3,662x      18,643x    6,396x
Mixed Diff vs AsyncLib   1x        7.78x       42.15x     15.61x
Async Diff vs AsyncLib   1x        2.60x       10.79x     4.18x

Libraries When.js v1.8.1 and Deferred are not included in this table because they resolve promises synchronously. This difference makes the Diff metric inapplicable.
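The "vs AsyncLib" rows are simply each library's time divided by the Async time for the same test. A short sketch of that arithmetic, using the Mixed Diff figures from the table above (the versusBaseline helper is a hypothetical name):

```javascript
// Values copied from the Mixed Diff row of the table above, in ms.
var mixedDiff = {
  'Async': 5.37,
  'When 2.1.0': 41.78,
  'Q 0.9.5': 226.34,
  'Promise 3.0.1': 83.83
};

// Ratio of each library's time to the baseline, rounded to 2 decimals.
function versusBaseline(row, baseline) {
  var ratios = {};
  Object.keys(row).forEach(function (name) {
    ratios[name] = Math.round((row[name] / row[baseline]) * 100) / 100;
  });
  return ratios;
}

var ratios = versusBaseline(mixedDiff, 'Async');
console.log(ratios);
// Async: 1, When 2.1.0: 7.78, Q 0.9.5: 42.15, Promise 3.0.1: 15.61
```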

Total Time of execution, 500 Loops

Perf Type                 Async     When 1.8.1  When 2.1.0  Q 0.9.5    Deferred 0.6.3  Promise 3.0.1
Sync Total                5.15ms    12.35ms     72.35ms     301.47ms   71.25ms         80.50ms
Mixed Total               18.94ms   40.57ms     80.21ms     325.49ms   94.58ms         95.67ms
Async Total               35.70ms   50.63ms     90.52ms     337.82ms   105.87ms        107.01ms
Sync Total vs AsyncLib    1x        2.40x       14.05x      58.54x     13.83x          15.63x
Mixed Total vs AsyncLib   1x        2.14x       4.23x       17.19x     4.99x           5.05x
Async Total vs AsyncLib   1x        1.42x       2.54x       9.46x      2.97x           3.00x

Average Memory Difference - Single 500 Loop Runs

Perf Type  Async    When 1.8.1  When 2.0.1  When 2.1.x  Q         Q longStack=0  Deferred
Sync       113.29%  160.98%     840.21%     866.88%     1106.67%  684.56%        354.07%
Async      159.29%  458.44%     811.32%     834.63%     1110.21%  691.41%        429.18%

Charts

Total Time to Resolve, 500 Loops

[Chart: Promises, Total Time to Resolve, 500 Loops]

Memory Consumption

[Chart: Promises, Memory Consumption]

Check out the updated results in this Google Spreadsheet; you can find the first version results in this spreadsheet.

Comments On The Findings

Synchronous resolution of all 7 chained promises is most likely an unnatural case, much like asynchronous resolution of all 7. Using Mixed asynchronous resolution in the stub funcs seems to improve the total time of execution both in Q and When.js, while Async suffers a minor penalty.

Update: @domenic rightly pointed out a faulty tweak I attempted with Q. I did not properly enable the option to zero the long stack traces. With this tweak enabled Q will perform up to 9x faster! The Charts and Tables have been updated.

The unsung winner here is When v1.8.1. As already mentioned, v1.8.1 resolves promises synchronously, contrary to the spec. The next version of When.js, v2.0.1, which resolves promises asynchronously, performs ~5x slower.

Memory consumption is shaky at best. The tests were performed on single node runs to avoid previous runs contaminating the measurements. The node option --expose-gc was used and global.gc() was invoked in each of the 20 master loops. Still, this does not guarantee that the garbage collector will actually kick in. What we can merely observe is, given the same method of measurement, how each package performed individually.
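A rough sketch of that measurement method follows. The helper names heapUsed and growthPercent are hypothetical, the script name in the comment is a placeholder, and the loop is only a placeholder workload using native Promise rather than the actual benchmark.

```javascript
// Run with: node --expose-gc script.js
// Without the flag global.gc is undefined and the collection step is skipped.
function heapUsed() {
  if (typeof global.gc === 'function') {
    global.gc(); // request a collection before taking the snapshot
  }
  return process.memoryUsage().heapUsed;
}

// Growth of `after` relative to `before`, as a percentage.
function growthPercent(before, after) {
  return ((after - before) / before) * 100;
}

var firstSnapshot = heapUsed();

// Placeholder workload: keep a number of promises alive.
var kept = [];
for (var i = 0; i < 10000; i++) {
  kept.push(Promise.resolve(i));
}

var finishSnapshot = heapUsed();
console.log(growthPercent(firstSnapshot, finishSnapshot).toFixed(2) + '%');

// The figures from the Memory Benchmark section give the same +97%:
console.log(Math.round(growthPercent(3370296, 6645932)) + '%');
```

The absolute numbers vary from run to run, which is why only the relative differences between packages are compared.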

Conclusions

If you are using Promises as glue for the surface of your API then these tests really do not affect you. Even if you've built your web application using promises you may still not be affected by the findings as long as you don't have highly repetitive functions.

If a function that resolves using a Promise gets called many times in a short period, then you need to take a pause and consider all your options.

In highly repetitive functions, When.js, the best performing Promises/A+ compliant library, will finish resolving 4x slower than Async.

Memory consumption cannot be ignored either. Both When.js and Q end up consuming 8x to 11x the memory used when node started running. When the node process started it consumed a total of 3,528,880 bytes of memory; when all 20x500 loop runs finished the memory count was at 41,115,800 bytes. This issue alone warrants an equivalent dive into why this happens, why setTimeout blows everything up in terms of memory consumption and what the best practices are for keeping the memory footprint low.

To conclude the story about why all this started, I switched the Promises dependency in Kickq to When.js v1.8.1, as it performs similarly to vanilla callbacks, and all stress and performance tests now pass with an execution time of 2ms or less.

Find the comments for this article in this Reddit post