Promises/A+ Performance Hits You Should Be Aware Of
BEWARE: This article is old, 2013 old. Things have changed since then and a winner has emerged: Bluebird, a Promises library built with performance in mind. When tested with the benchmarks of this article, it did even better than Async!
The Promises/A+ specification is a fresh and very interesting way of dealing with the asynchronous nature of JavaScript. It also provides a sensible way to deal with error handling and exceptions. In this article we will go through the performance hits you should be aware of and, as a side effect, compare the two most popular Promises/A+ implementations, When and Q, with each other and with Async, the lowest abstraction you can get on asynchronicity.
If you are in a hurry, just skip to the Conclusions.
The Case
My motivation for looking deeper into the performance of Promises/A+ was a job queuing system I’ve been working on named Kickq. The system is expected to get hammered in production, so stress testing was warranted. After stubbing all the database interactions, essentially making the operation of job creation synchronous, I was getting odd performance results.
The test was simple: create 500 jobs in a loop and measure how long it takes for all the jobs to finish.
The measurements were in the ~550ms range and my eyeballs started to roll. “That’s a synchronous operation, it should finish in less than 3ms, WHAT THE????!?!”. After taking a few moments to let it sink in, the suspect was found: it was Promises. I used them as the only pattern to handle asynchronous ops and callbacks throughout the whole project. Brian Cavalier, one of the authors of When.js, helped me pinpoint the real culprit. It was the tick:
Promises/A+ Specification, Note 4.1: “In practical terms, an implementation must use a mechanism such as setTimeout, setImmediate, or process.nextTick to ensure that onFulfilled and onRejected are not invoked in the same turn of the event loop as the call to then to which they are passed.”
In other words, Promises, per the specification, must be resolved Asynchronously! That comes with a cost, a heavy one apparently.
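To make the effect concrete, here is a minimal sketch that assumes the native Promise as a stand-in for a Promises/A+ library (it is not the Kickq code, and createJobs is a made-up name). Every job is already finished when then is called, yet none of the callbacks run until the loop that created them has completed:

```javascript
// Sketch only: native Promise stands in for When or Q (assumption).
function createJobs(count) {
  let settled = 0;
  const jobs = [];

  for (let i = 0; i < count; i++) {
    // The "work" is already done, yet the callback is still deferred.
    jobs.push(Promise.resolve(i).then(function () {
      settled += 1;
    }));
  }

  // Snapshot taken in the same turn as the loop that created the jobs.
  const settledInSameTurn = settled;

  return Promise.all(jobs).then(function () {
    return { settledInSameTurn: settledInSameTurn, settledAtEnd: settled };
  });
}

createJobs(500).then(function (result) {
  console.log(result); // { settledInSameTurn: 0, settledAtEnd: 500 }
});
```

The wait between creating the last job and hearing back from the first one is the deferral cost that the rest of this article measures.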
In the process of studying performance I had to create a performance library, poor man’s profiling, and a benchmark test for Promises/A+ implementations that is already being used to optimize future versions of When.
Creating The Promises/A+ Benchmark
I tried to broaden the definition of the test case. If an application uses the Promises pattern as the only way to manage how the internal parts interact, we can make a few assumptions:
- There will be a series of promises chained together, representing the various operations that will be performed by your application.
- The Deferred Object is used on each link of the chain to control resolution and how the promise object is exposed.
- Throughout the whole chain of promises there can be operations that are actually synchronous; we will measure all cases.
The number of chained promises was arbitrarily set to 7; it represents the typical call-stack depth of an average operation. All app logic is stripped, so we only measure how long it takes to resolve a chain of 7 promises.
Find a snapshot of the actual app in this commit.
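For readers who do not want to dig through the commit, a chain of that shape could be sketched as follows. This assumes the native Promise; defer, link and chain are hypothetical names for illustration, not the benchmark's actual code:

```javascript
// Hypothetical helper: a Deferred built on the native Promise (assumption).
function defer() {
  const deferred = {};
  deferred.promise = new Promise(function (resolve, reject) {
    deferred.resolve = resolve;
    deferred.reject = reject;
  });
  return deferred;
}

// One link of the chain: no app logic, it resolves right away
// and only exposes the promise side of the Deferred.
function link(value) {
  const deferred = defer();
  deferred.resolve(value + 1);
  return deferred.promise;
}

// Chain `depth` links, each one waiting on the previous link.
function chain(depth) {
  let promise = link(0);
  for (let i = 1; i < depth; i++) {
    promise = promise.then(link);
  }
  return promise;
}

chain(7).then(function (count) {
  console.log('links resolved:', count); // links resolved: 7
});
```

Each link uses a Deferred to control resolution and returns only the promise, which mirrors the second assumption in the list above.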
Update 11/Jun/2013
The post has been updated to include one more test and to rename what used to be the Async test to Mixed. Domenic raised an issue with a few points that needed to be addressed and implemented in the test. Thus the article update…
Promises are built to handle Asynchronous functions
Yes indeed. But libraries and ideas are rarely used as their original creator intended. There are two major reasons why you may use promises synchronously.
- In real-world cases you have conditionals. In a Promise-returning function, a condition may force the function to finish synchronously, thus returning a resolved Promise.
- The second case has to do with design choices: maintainable and scalable code. Time and again, one of the biggest pain points with large codebases, in regard to control flow and asynchronicity, is the fact that methods and functions evolve and get refactored.
Over time, what used to be a synchronous function becomes async and vice versa. Each change means a hell of a lot has to be refactored to match that evolution.
I found Promises to solve that design problem by blanketing everything with a returned promise and not caring about how the function will evolve over time in respect to sync / async. This practice also enables better code composability and better abstractions.
Therefore, it is totally expected for a number of Promises to resolve synchronously, for any of the above reasons.
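The first reason can be illustrated with a small sketch. It assumes the native Promise, and the cache, fetchFromStore and getJob names are hypothetical, invented for this example only:

```javascript
// Hypothetical in-memory cache in front of a slow lookup.
const cache = new Map();

function fetchFromStore(id) {
  // Stand-in for real I/O: completes on a later turn of the event loop.
  return new Promise(function (resolve) {
    setTimeout(function () {
      resolve({ id: id });
    }, 0);
  });
}

function getJob(id) {
  if (cache.has(id)) {
    // Synchronous branch: the value is already known, but callers
    // still receive a promise, so the signature never changes.
    return Promise.resolve(cache.get(id));
  }
  return fetchFromStore(id).then(function (job) {
    cache.set(id, job);
    return job;
  });
}

getJob(1)
  .then(function () { return getJob(1); }) // second call takes the cached branch
  .then(function (job) { console.log('job id:', job.id); });
```

Callers never need to know which branch ran, which is also what makes the second reason work: the function can move between sync and async without its callers being refactored.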
The Perf Test Resolves Synchronous Functions
In effect Async resolves noop funcs.
Yes. A request was made for all the stub functions to be asynchronous. That request has been implemented and you can view the results below, named Async.
The Added nextTick in Promises/A+ should not be always added
Domenic painstakingly explained to me the subtle details of the Promises/A+ spec that requires asynchronous resolution. Consider this simple case:
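A minimal sketch of such a case, assuming the native Promise instead of any of the libraries benchmarked here (the function name runOrderDemo is made up for illustration):

```javascript
// Sketch only; the native Promise is an assumption.
function runOrderDemo(log) {
  const promise = Promise.resolve(); // already fulfilled

  return new Promise(function (done) {
    setTimeout(function () {
      promise.then(function () {
        log('X');
        done();
      });
      // Logged before X: the then callback has to wait until
      // the code that called then has finished running.
      log('Y');
    }, 0);
    log('Z');
  });
}

runOrderDemo(console.log);
```

With a spec-compliant implementation this logs Z, then Y, then X. An implementation that invokes the callback synchronously inside then would log Z, X, Y.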
You’d typically expect to see Z, X, Y printed. But that is not the case as per the spec. The spec only requires that resolution “is not invoked in the same turn of the event loop as the call to then to which they are passed”.
If it is another turn, the Promise implementation can resolve the promise synchronously, thus printing Z, Y, X. A very subtle detail, and one that, according to Domenic, not all implementations have really implemented.
Time Benchmark
The app.promise() function gets invoked asynchronously in a loop of n times. We mark the time down to microseconds, using the node-microtime library. The time is marked in the following events:
- Before the loop starts.
- Right before app.promise() is invoked (gets marked n times).
- Right after app.promise() is resolved (gets marked n times).
- After all created promises resolve.
The total execution time and the difference between these marks is what is measured. See below a test run using just 5 loops:
No. | JS Timestamp | Diff | Message |
---|---|---|---|
0 | 1366657964509117 | NaN ms | Start |
1 | 1366657964509342 | 0.225 ms | after for |
2 | 1366657964509607 | 0.265 ms | Creating promise:0 |
3 | 1366657964509704 | 0.097 ms | Creating promise:1 |
4 | 1366657964509752 | 0.048 ms | Creating promise:2 |
5 | 1366657964509766 | 0.014 ms | Creating promise:3 |
6 | 1366657964509792 | 0.026 ms | Creating promise:4 |
7 | 1366657964510264 | 0.472 ms | Promise resolved:0 |
8 | 1366657964510274 | 0.010 ms | Promise resolved:1 |
9 | 1366657964510279 | 0.005 ms | Promise resolved:2 |
10 | 1366657964510285 | 0.006 ms | Promise resolved:3 |
11 | 1366657964510288 | 0.003 ms | Promise resolved:4 |
12 | 1366657964510537 | 0.249 ms | Finish |
Notice the significant delay in Mark No 7: it takes a while from the last app.promise() invocation (No 6) until we hear back from the first fulfilled promise.
The difference at Mark No 7 is the main metric used to benchmark promises; it is the time to the First Resolved Promise.
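The marking scheme can be sketched roughly as follows. This is an illustration under stated assumptions, not the benchmark's code: process.hrtime.bigint() stands in for node-microtime, the native Promise stands in for app.promise(), and createMarks and run are hypothetical names.

```javascript
// Sketch: process.hrtime.bigint() replaces node-microtime (assumption).
function createMarks() {
  const marks = [];
  return {
    mark: function (message) {
      marks.push({ message: message, time: process.hrtime.bigint() });
    },
    // Pair every mark with the milliseconds elapsed since the previous mark.
    report: function () {
      return marks.map(function (entry, index) {
        const previous = index === 0 ? entry.time : marks[index - 1].time;
        const diffMs = Number(entry.time - previous) / 1e6;
        return { no: index, diffMs: diffMs, message: entry.message };
      });
    }
  };
}

function run(loops) {
  const perf = createMarks();
  const pending = [];

  perf.mark('Start');
  for (let i = 0; i < loops; i++) {
    perf.mark('Creating promise:' + i);
    pending.push(Promise.resolve().then(function () {
      perf.mark('Promise resolved:' + i);
    }));
  }

  return Promise.all(pending).then(function () {
    perf.mark('Finish');
    return perf.report();
  });
}

run(5).then(function (rows) {
  console.table(rows);
});
```

In this sketch the first mark reports a diff of 0 rather than NaN, and the row for Promise resolved:0 carries the diff that corresponds to the First Resolved Promise metric.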
Memory Benchmark
Memory consumption is a bit dodgy to measure during runtime. To take memory consumption snapshots, the process.memoryUsage() method is invoked and the property heapUsed is logged.
A heapUsed measurement is taken when the benchmark starts running. From there on, every time all promises are resolved another measurement is taken. The difference between the two is what we benchmark and compare between the packages.
Now beware, we are not taking these measurements at face value. All we will be observing is how the various implementations differ from each other, given the same test.
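A rough sketch of that measurement, using only process.memoryUsage() from Node. The measureHeap helper is a made-up name, and expressing the result as a percentage of the starting heap is an assumption about how the figures further down were derived:

```javascript
// Snapshot of the V8 heap currently in use, in bytes.
function heapUsed() {
  return process.memoryUsage().heapUsed;
}

// Hypothetical helper: runs a workload and compares heapUsed before and after.
function measureHeap(workload) {
  const before = heapUsed();
  return Promise.resolve(workload()).then(function () {
    const after = heapUsed();
    return {
      before: before,
      after: after,
      // Assumption: heap after the run as a percentage of the heap at start.
      percent: (after / before) * 100
    };
  });
}

measureHeap(function () {
  // Placeholder workload: 500 promises that are already resolved.
  const jobs = [];
  for (let i = 0; i < 500; i++) {
    jobs.push(Promise.resolve(i));
  }
  return Promise.all(jobs);
}).then(function (result) {
  console.log(result.percent.toFixed(2) + '%');
});
```

The absolute numbers depend on when the garbage collector runs, which is why only the relative differences between libraries are compared.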
The Results
The tests were run for 10, 100, 500 and 1,000 loops. Each set of loops was run 20 times to normalize the results and the means were taken from these 20 runs. The libraries used for measuring are:
- Async was used to emulate vanilla JS using callbacks. A special, but of equivalent logic, test app was used.
- Q is one of the most prominent Promises/A+ implementations. v0.9.5 was used.
- When.js is the other most prominent Promises/A+ implementation. Two versions of when.js were used:
  - v1.8.1 When.js resolved Promises Synchronously against the Promises/A+ spec.
  - v2.1.0 Current version of When.js, resolves Promises asynchronously.
- Deferred v0.6.3 Promises in a simple and powerful way. Implementation originally inspired by Kris Kowal’s Q.
- Promise v3.0.1 Bare bones Promises/A+ implementation.
As more libraries are added this article will get updated with how they performed.
There are three sets of tests done:
- All stub functions resolve synchronously.
- Half the stub functions resolve synchronously and the other half asynchronously.
- All stub functions resolve asynchronously.
Sets 1 and 3 can be considered edge cases. The test case closest to reality is 2, Mixed.
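The three sets could be modelled along these lines. This is a sketch with hypothetical names, assuming the native Promise and setTimeout for the asynchronous stubs. Alternating the links is also an assumption, since a chain of 7 cannot be split exactly in half:

```javascript
// Stub that resolves without waiting for anything.
function syncStub(value) {
  return Promise.resolve(value + 1);
}

// Stub that resolves on a later turn of the event loop.
function asyncStub(value) {
  return new Promise(function (resolve) {
    setTimeout(function () {
      resolve(value + 1);
    }, 0);
  });
}

// mode is 'sync', 'mixed' or 'async'; 'mixed' alternates between the stubs.
function buildChain(mode, depth) {
  let promise = Promise.resolve(0);
  for (let i = 0; i < depth; i++) {
    const useAsync = mode === 'async' || (mode === 'mixed' && i % 2 === 1);
    promise = promise.then(useAsync ? asyncStub : syncStub);
  }
  return promise;
}

['sync', 'mixed', 'async'].forEach(function (mode) {
  buildChain(mode, 7).then(function (count) {
    console.log(mode + ': ' + count + ' links resolved');
  });
});
```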
Difference to First Resolved Promise, 500 Loops
Perf Type | Async | When 2.1.0 | Q 0.9.5 | Promise 3.0.1 |
---|---|---|---|---|
Sync Diff | 0.01ms | 36.62ms | 186.43ms | 63.96ms |
Mixed Diff | 5.37ms | 41.78ms | 226.34ms | 83.83ms |
Async Diff | 22.42ms | 58.18ms | 241.80ms | 93.68ms |
Sync Diff vs AsyncLib | 1x | 3,662x | 18,643x | 6,396x |
Mixed Diff vs AsyncLib | 1x | 7.78x | 42.15x | 15.61x |
Async Diff vs AsyncLib | 1x | 2.60x | 10.79x | 4.18x |
Libraries When.js v1.8.1 and Deferred are not included in this table because they resolve promises synchronously. This difference makes the Diff metric inapplicable.
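The “vs AsyncLib” rows are each library's time divided by the Async time for the same test. As a worked check on the Mixed Diff row (the helper name ratiosVsBaseline is made up for illustration):

```javascript
// Mixed Diff row from the table above, in milliseconds.
const mixedDiffMs = {
  'Async': 5.37,
  'When 2.1.0': 41.78,
  'Q 0.9.5': 226.34,
  'Promise 3.0.1': 83.83
};

// Divide every entry by the baseline entry and format it as "N.NNx".
function ratiosVsBaseline(row, baseline) {
  const ratios = {};
  Object.keys(row).forEach(function (name) {
    ratios[name] = (row[name] / row[baseline]).toFixed(2) + 'x';
  });
  return ratios;
}

console.log(ratiosVsBaseline(mixedDiffMs, 'Async'));
// { Async: '1.00x', 'When 2.1.0': '7.78x', 'Q 0.9.5': '42.15x', 'Promise 3.0.1': '15.61x' }
```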
Total Time of execution, 500 Loops
Perf Type | Async | When 1.8.1 | When 2.1.0 | Q 0.9.5 | Deferred 0.6.3 | Promise 3.0.1 |
---|---|---|---|---|---|---|
Sync Total | 5.15ms | 12.35ms | 72.35ms | 301.47ms | 71.25ms | 80.50ms |
Mixed Total | 18.94ms | 40.57ms | 80.21ms | 325.49ms | 94.58ms | 95.67ms |
Async Total | 35.70ms | 50.63ms | 90.52ms | 337.82ms | 105.87ms | 107.01ms |
Sync Total vs AsyncLib | 1x | 2.40x | 14.05x | 58.54x | 13.83x | 15.63x |
Mixed Total vs AsyncLib | 1x | 2.14x | 4.23x | 17.19x | 4.99x | 5.05x |
Async Total vs AsyncLib | 1x | 1.42x | 2.54x | 9.46x | 2.97x | 3.00x |
Average Memory Difference - Single 500 Loop Runs
Perf Type | Async | When 1.8.1 | When 2.0.1 | When 2.1.x | Q | Q longStack=0 | Deferred |
---|---|---|---|---|---|---|---|
Sync | 113.29% | 160.98% | 840.21% | 866.88% | 1106.67% | 684.56% | 354.07% |
Async | 159.29% | 458.44% | 811.32% | 834.63% | 1110.21% | 691.41% | 429.18% |
Charts
Total Time to Resolve, 500 Loops
Memory Consumption
Check out the updated results in this Google Spreadsheet; you can find the first version results in this spreadsheet.
Comments On The Findings
Synchronous resolution of all 7 chained promises is most likely an unnatural case, much like asynchronous resolution of all 7 is. Using Mixed asynchronous resolution in the stub funcs seems to improve the total time of execution in both Q and When.js, while Async suffers a minor penalty.
Update: @domenic rightly pointed out a faulty tweak I attempted with Q. I did not properly enable the option to zero the long stack traces. With this tweak enabled Q will perform up to 9x faster! The Charts and Tables have been updated.
The unsung winner here is When v1.8.1. As already mentioned, v1.8.1, contrary to the spec, will resolve the promises synchronously. The next version of When.js, v2.0.1, which resolves promises asynchronously, performs ~5x slower.
Memory consumption is shaky at best. The tests were performed on single node runs to avoid previous runs contaminating the measurements. The node option --expose-gc was used and global.gc() was invoked in each of the 20 master loops. Still, this does not guarantee that the garbage collector will actually kick in. What we can merely observe is, given the same method of measurement, how each package performed individually.
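As a small sketch of that step: global.gc only exists when node is started with --expose-gc, so it is worth guarding the call. The collectGarbage name and the script name in the comment are hypothetical:

```javascript
// Run with: node --expose-gc benchmark.js
function collectGarbage() {
  if (typeof global.gc === 'function') {
    // Requests a collection before the next heapUsed snapshot.
    global.gc();
    return true;
  }
  // The flag was not passed, so no collection could be requested.
  return false;
}

console.log('gc requested:', collectGarbage());
```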
Conclusions
If you are using Promises as glue for the surface of your API then these tests really do not affect you. Even if you’ve built your web application using promises you may still not be affected by the findings as long as you don’t have highly repetitive functions.
If a function that resolves using a Promise will get called multiple times per given moment, then you need to take a pause and consider all your options.
In highly repetitive functions, When.js, the best performing Promises/A+ compliant library, will finish resolving 4x slower than Async.
Memory consumption is something that cannot be ignored either. Both When.js and Q end up consuming 8x to 11x the memory that the node process used when it started running. For example, when the node process started it consumed a total of 3,528,880 bytes of memory; when all 20x500 loop runs finished, the memory count was at 41,115,800 bytes. This issue alone warrants an equivalent dive into why this happens, why setTimeout blows everything up in terms of memory consumption and what the best practices are for keeping the memory footprint low.
To conclude the story about why all this started, I switched the Promises dependency in Kickq to When.js v1.8.1, as it performs similarly to vanilla callbacks, and all stress & performance tests now pass with an execution time of 2ms or less.