Promises/A+ Performance Hits You Should Be Aware Of

29 Apr 2013

BEWARE This article is old, it’s 2013 old. Since then things have changed and a winner has prevailed, Bluebird is a Promises Library build with performance in mind and when actually tested with the benchmarks of this article it did even better than Async!

The Promises/A+ specification is a fresh and very interesting way of dealing with the asynchronous nature of Javascript. It also provides a sensible way to deal with error handling and exceptions. In this article we will go through the performance hits you should be aware of and as a side-effect do a comparison between the two most popular Promises/A+ implementations, When and Q and how they compare to Async, the lowest abstraction you can get on asynchronicity.

If you are in a hurry, just skip to the Conclusions.

The Case

My motivation for looking deeper into the performance of Promises/A+ was a Job Queuing system i’ve been working on named Kickq. It is expected that the system will get hammered when used on production so stress testing was warranted. After stubbing all the database interactions, essentially making the operation of job creation synchronous, I was getting odd performance results.

The test was simple, create 500 jobs in a loop and measure how long it takes for all the jobs to finish.

The measurements were in the ~550ms range and my eyeballs started to roll. “That’s a synchronous operation, it should finish in less than 3ms, WHAT THE????!?!”. After taking a few moments to let it sip in the suspect was found, it was Promises. I used them as the only pattern to handle asynchronous ops and callbacks throughout the whole project. Brian Cavalier, one of the authors of When.js, helped me pinpoint the real culprit, it was the tick:

Promises/A+ Specification, Note 4.1 In practical terms, an implementation must use a mechanism such as setTimeout, setImmediate, or process.nextTick to ensure that onFulfilled and onRejected are not invoked in the same turn of the event loop as the call to then to which they are passed.

In other words, Promises, per the specification, must be resolved Asynchronously! That comes with a cost, a heavy one apparently.

In the process of studying performance I had to create a performance library, poor mans profiling. And a benchmark test for Promises/A+ implementations that’s already used to optimize the future versions of When.

Creating The Promises/A+ Benchmark

I tried to broaden the definition of the test case. If an application uses the Promises pattern as the only way to manage how the internal parts interact, we can make a few assumptions:

There will be a series of promises chained together, representing the various operations that will be performed by your application.
The Deferred Object is used on each link of the chain to control resolution and how the promise object is exposed.
Throughout the whole chain of promises there can be operations that are actually synchronous, we will measure all cases.

The number of how many promises will be chained was arbitrarily chosen to be 7, it represents a mean call-stack of an average operation. Any app logic is stripped, we only measure how long it takes to resolve a chain of 7 promises:

// "Prom" is your promises implementation.
// Must support the .defer() method.
app.promise = function(Prom) {
  var def = Prom.defer();
  var def2 = Prom.defer();
  // ... def3, def4...
  var def7 = Prom.defer();

  var getDef3 = function() { return def3.promise; };
  // ... chain chain ...
  var getDef7 = function() { return def7.promise; };

  // chain them
  def2.promise
    .then(getDef3)
    // ... chain chain ...
    .then(getDef7)
    .then(def.resolve);

  // in the first set of benchmarks, promises are resolved synchronously...
  def2.resolve();
  // def3.resolve ....
  def7.resolve();

  // in the second set of benchmarks, one of the promises is resolved async like so:
  setTimeout(def2.resolve);

  return def.promise;
};

Find a snapshot of the actual app in this commit.

Update 11/Jun/2013

The post has been updated to include one more test and rename what used to be the Async test to Mixed. Based on an issue raised by Domenic, a few points were made that need to be addressed and implemented in the test. Thus the article update…

Promises are built to handle Asynchronous functions

Yes indeed. Libraries or ideas are merely used as their original creator intented. There are two major reasons why you may use promises synchronously.

In real world cases you have conditionals, in a Promise returning function a condition may force the function to finish synchronously, thus returning a Resolved Promise.
The second case has to do with design choices, maintainable and scalable code. Through time and again, one of the biggest pain points with large codebases, in regard to control flow and asynchronicity is the fact that methods and functions evolve and get refactored.

Over time, what used to be a synchronous function becomes async and vice versa. Each change, means a hell of a lot has to be refactored to match that evolution.

I found Promises to solve that design problem by blanketing everything with a returned promise and not caring about how the function will evolve over time in respect to sync / async. This practice also enables better code compositionality and better abstractions.

Therefore, it is totally expected, for a number of Promises to resolve synchronously, for any of the above reasons.

The Perf Test Resolves Synchronous Functions

In effect Async resolves noop funcs.

Yes. A request was made for all the stub functions to be asynchronous. That request has been implemented and you can view the results bellow, named as Async.

The Added nextTick in Promises/A+ should not be always added

Domenic painstakingly explained to me the subtle details of the Promises/A+ spec that requires asynchronous resolution. Consider this simple case:

function asyncOp(){
  var def = when.defer();

  // emulate an async operation (e.g. file read)
  setTimeout(function(){
    def.resolve();
    console.log('X');
  });

  return def.promise;
}

asyncOp().then(function(){
  console.log('Y');
});

console.log('Z');

You’d typically expect that you’ll see printed Z, X, Y. But that is not the case as per the spec. The spec only requires that resolution “is not invoked in the same turn of the event loop as the call to then to which they are passed.”.

If it is another turn the Promise implementation can resolve the promise synchronously thus printing Z, Y, X. A very subtle detail, and one that according to Domenic not all implementations have really implemented.

Time Benchmark

The app.promise() function gets invoked asynchronously in a loop of n times. We mark the time down to microseconds, using the node-microtime library. The time is marked in the following events:

Before the loop starts.
Right before app.promise() is invoked, (gets marked n times).
Right after app.promise() is resolved, (gets marked n times).
After all created promises resolve.

The total execution time and the difference between these marks is what is measured, see bellow a test run using just 5 loops:

No.   JS Timestamp     Diff       Message
   1366657964509117 [NaN ms]   Start
   1366657964509342 [0.225 ms] after for
   1366657964509607 [0.265 ms] Creating promise:0
   1366657964509704 [0.097 ms] Creating promise:1
   1366657964509752 [0.048 ms] Creating promise:2
   1366657964509766 [0.014 ms] Creating promise:3
   1366657964509792 [0.026 ms] Creating promise:4
   1366657964510264 [0.472 ms] Promise resolved:0
   1366657964510274 [0.010 ms] Promise resolved:1
   1366657964510279 [0.005 ms] Promise resolved:2
  1366657964510285 [0.006 ms] Promise resolved:3
  1366657964510288 [0.003 ms] Promise resolved:4
  1366657964510537 [0.249 ms] Finish

Notice the significant delay in Mark No 7, it takes a while from the last app.promise() invocation (No 6) till we hear back from the first fulfilled promise.

The difference of Mark No 7 is the main metric that is used to benchmark promises, it actually is the First Resolved Promise.

Memory Benchmark

Memory consumption is a bit dodgy to measure during runtime. To take memory consumption snapshots the process.memoryUsage() method is invoked and the property heapUsed is logged.

process.memoryUsage();

// outputs:
{
  rss: 14671872,
  heapTotal: 6131200,
  heapUsed: 3370296
}

A heapUsed measurement is taken when the benchmark starts running. From there on every time all promises are resolved another measurement is taken. Comparing the difference between the two is what we benchmark and compare between the packages.

var firstSnapshot = process.memoryUsage().heapUsed;
// firstSnapshot == 3370296

/* ... app.promise() runs and finishes */

var finishSnapshot = process.memoryUsage().heapUsed;
// finishSnapshot == 6645932

var diff = finishSnapshot - firstSnapshot;
// diff == 3275636 or +97% from the first measurement

Now beware, we are not taking these measurements in face value. What we will only be observing is, given the same test, what the differences are between the various implementations.

The Results

The tests were run for 10, 100, 500 and 1,000 loops. Each set of loops was run 20 times to normalize the results and the means were taken from these 20 runs. The libraries used for measuring are:

Async was used to emulate vanilla JS using callbacks. A special, but of equivalent logic, test app was used
Q is one of the most prominent Promises/A+ implementations. v0.9.5 was used.
When.js is the other most prominent Promises/A+ implementation. Two versions of when.js were used:
- v1.8.1 When.js resolved Promises Synchronously against the Promises/A+ spec.
- v2.1.0 Current version of When.js, resolves Promises asynchronously.
Deferred v0.6.3 Promises in a simple and powerful way. Implementation originally inspired by Kris Kowal’s Q.
Promise v3.0.1 Bare bones Promises/A+ implementation.

As more libraries are added this article will get updated with how they performed.

Updates
- May-01-13 Added Deferred library. Resolves promises synchronously.
- Jun-11-13 Added Promise library. Resolves promises asynchronously.

There are three sets of tests done:

All stub functions resolve synchronously.
Half the stub functions resolve synchronously and the other hald asynchronously
All stub functions resolve asynchronously.

Sets 1 and 3 can be considered edge cases. The most close to reality test case is 2, Mixed.

Difference to First Resolved Promise, 500 Loops

Perf Type	Async	When 2.1.0	Q 0.9.5	Promise 3.0.1
Sync Diff	0.01ms	36.62ms	186.43ms	63.96ms
Mixed Diff	5.37ms	41.78ms	226.34ms	83.83ms
Async Diff	22.42ms	58.18ms	241.80ms	93.68ms
Sync Diff vs AsyncLib	1x	3,662x	18,643x	6,396x
Mixed Diff vs AsyncLib	1x	7.78x	42.15x	15.61x
Async Diff vs AsyncLib	1x	2.60x	10.79x	4.18x

Libraries When.js v1.8.1 and Deferred are not included in this table because they resolve promises synchronously. This difference makes the Diff metric inapplicable.

Total Time of execution, 500 Loops

Perf Type	Async	When 1.8.1	When 2.1.0	Q 0.9.5	Deferred 0.6.3	Promise 3.0.1
Sync Total	5.15ms	12.35ms	72.35ms	301.47ms	71.25ms	80.50ms
Mixed Total	18.94ms	40.57ms	80.21ms	325.49ms	94.58ms	95.67ms
Async Total	35.70ms	50.63ms	90.52ms	337.82ms	105.87ms	107.01ms
Sync Total vs AsyncLib	1x	2.40x	14.05x	58.54x	13.83x	15.63x
Mixed Total vs AsyncLib	1x	2.14x	4.23x	17.19x	4.99x	5.05x
Async Total vs AsyncLib	1x	1.42x	2.54x	9.46x	2.97x	3.00x

Average Memory Difference - Single 500 Loop Runs

Pert Type	Async	When 1.8.1	When 2.0.1	When 2.1.x	Q	Q longStack=0	Deferred
Sync	113.29%	160.98%	840.21%	866.88%	1106.67%	684.56%	354.07%
Async	159.29%	458.44%	811.32%	834.63%	1110.21%	691.41%	429.18%

Charts

Total Time to Resolve, 500 Loops

Promises, Total Time to Resolve, 500 Loops

Memory Consumption

Promises, Memory Consumption

Checkout the updated results in this Google Spreadsheet, you can find the first version results in this spreadsheet.

Comments On The Findings

Synchronous resolution of all 7 chained promises is most likely an unnatural case, much like Asynchronous on all 7 cases is. Using Mixed asynchronous resolution in the stub funcs seems to improve the total time of execution both in Q and When.js, Async will suffer a minor penalty.

Update: @domenic rightly pointed out a faulty tweak i attempted to do with Q. I did not properly enable the option to zero the long stack traces. With this tweak enabled Q will perform up to 9x times faster! The Charts and Tables have been updated.

The unsung winner here is When v1.8.1, as already mentioned, v1.8.1 contrary to the spec, will resolve the promises synchronously. The next version of When.js, that resolves promises Asynchronously, v2.0.1 performs ~5x slower.

Memory consumption is shaky at best. The tests were perform on single node runs to avoid the previous runs contaminating the measurements. The node option --expose-gc was used and global.gc() was invoked in each of the 20 master loops. Still, this does not guarantee that the garbage collector will actually kick in. What we can merely observe is, given the same method of measurement, how each package performed individually.

Conclusions

If you are using Promises as glue for the surface of your API then these tests really do not affect you. Even if you’ve built your web application using promises you may still not be affected by the findings as long as you don’t have highly repetitive functions.

If a functions that resolves using a Promise will get called multiple times per given moment, then you need to take a pause and consider all your options.

In highly repetitive functions, When.js, the best performing and Promises/A+ compliant library, will finish resolving 4x times slower than Async.

Memory consumption is something that cannot be ignored either. Both When.js and Q will consume 8x to 11x times the memory since node started running. So when the node process started it consumed a total of 3,528,880 bytes of memory, when all 20x500 loop runs finished the memory count was at 41,115,800 bytes. This issue alone warrants an equivalent dive into why this happens, why setTimeout blows everything up in terms of memory consumption and what are the best practices for keeping the memory footprint low.

To conclude the story about why all this started, i switched the Promises dependency in Kickq to When.js v1.8.1 as it performs similarly to vanilla callbacks and all stress & performance tests now pass with an execution time of 2ms or less.

Find the comments for this article in this Reddit post