Benchmarks#
Overview#
Here is a quick overview of the execution time of Koffi calls on three benchmarks, where it is compared to a theoretical ideal FFI implementation (approximated with pre-compiled static N-API glue code):
The first benchmark is based on rand() calls
The second benchmark is based on atoi() calls
The third benchmark is based on Raylib
These results are detailed and explained below, and compared to node-ffi/node-ffi-napi.
Linux x86_64#
The results presented below were measured on my x86_64 Linux machine (Intel® Core™ i5-4460).
rand results#
This test is based around repeated calls to a simple standard C function, rand, and has three implementations:
the first one is the reference: it calls rand through an N-API module, and is close to the theoretical limit of a perfect (no overhead) Node.js > C FFI implementation (pre-compiled static glue code)
the second one calls rand through Koffi
the third one calls rand through node-ffi-napi
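As a rough illustration of how an average iteration time can be measured for any of these three implementations, here is a minimal sketch. It is not the harness used for the results on this page: `measureIterationTime` is a hypothetical helper, and `Math.random()` only stands in for the native rand binding so that the snippet runs without any compiled module.

```javascript
// Hypothetical helper, not part of the Koffi benchmark suite
function measureIterationTime(fn, iterations) {
    // Warm up first so that the JIT has optimized the call path before timing starts
    for (let i = 0; i < 1000; i++)
        fn();

    const start = process.hrtime.bigint();
    for (let i = 0; i < iterations; i++)
        fn();
    const elapsed = process.hrtime.bigint() - start;

    // Average time per iteration, in nanoseconds
    return Number(elapsed) / iterations;
}

// Stand-in for the native call; a real run would pass the N-API, Koffi or
// node-ffi-napi binding of rand here instead
const ns = measureIterationTime(() => Math.random(), 1000000);
console.log(`${ns.toFixed(1)} ns per iteration`);
```

Timing a large number of iterations and dividing by the count keeps the clock resolution and the cost of reading the clock out of the per-iteration figure.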
| Benchmark | Iteration time | Relative performance | Overhead |
|---|---|---|---|
| rand_napi | 700 ns | x1.00 | (ref) |
| rand_koffi | 1152 ns | x0.61 | +64% |
| rand_node_ffi | 32750 ns | x0.02 | +4576% |
Because rand is a pretty small function, the FFI overhead is clearly visible.
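The last two columns can be derived from the iteration times. The sketch below assumes that relative performance is the reference time divided by the measured time, and that overhead is the extra time expressed as a percentage of the reference; `compare` is a hypothetical helper name. The published figures may differ slightly in the last digit, presumably because they were computed from unrounded timings.

```javascript
// Hypothetical helper: derive the table columns from two iteration times (in ns)
function compare(refNs, testNs) {
    return {
        // Above 1 means faster than the reference, below 1 means slower
        relative: refNs / testNs,
        // Extra time per iteration, as a percentage of the reference time
        overheadPct: (testNs / refNs - 1) * 100
    };
}

// Values taken from the rand_napi and rand_koffi rows
const randKoffi = compare(700, 1152);
console.log(`x${randKoffi.relative.toFixed(2)}, +${randKoffi.overheadPct.toFixed(1)}%`);
```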
atoi results#
This test is similar to the rand one, but it is based on atoi, which takes a string parameter. JavaScript (V8) to C string conversion is relatively slow and heavy.
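To give an idea of the work involved: a C function such as atoi expects a NUL-terminated byte array, so the JavaScript string has to be encoded and copied for each call. The sketch below only illustrates that step with a Node.js Buffer. It is an assumption-laden example, not a description of how Koffi or N-API perform the conversion internally, and `toCString` is a hypothetical helper.

```javascript
// Hypothetical helper: build the kind of byte array a `const char *` parameter expects
function toCString(str) {
    // Encode to UTF-8 and append the terminating NUL byte that C string functions rely on
    return Buffer.from(str + '\0', 'utf8');
}

const buf = toCString('1234');
console.log(buf.length); // 5: four digits plus the NUL terminator
```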
| Benchmark | Iteration time | Relative performance | Overhead |
|---|---|---|---|
| atoi_napi | 1028 ns | x1.00 | (ref) |
| atoi_koffi | 1730 ns | x0.59 | +68% |
| atoi_node_ffi | 121670 ns | x0.008 | +11738% |
Because atoi is a pretty small function, the FFI overhead is clearly visible.
Raylib results#
This benchmark uses the CPU-based image drawing functions in Raylib. The calls are much heavier than in previous benchmarks, thus the FFI overhead is reduced. In this implementation, Koffi is compared to:
node-raylib (reference): This is a native wrapper implemented with N-API
raylib_cc: Full C++ version of the code (no JS)
| Benchmark | Iteration time | Relative performance | Overhead |
|---|---|---|---|
| raylib_cc | 18.5 µs | x1.42 | -30% |
| raylib_node_raylib | 26.3 µs | x1.00 | (ref) |
| raylib_koffi | 28.0 µs | x0.94 | +6% |
| raylib_node_ffi | 87.0 µs | x0.30 | +230% |
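A short calculation on the Linux figures above shows why heavier calls hide the FFI cost. The extra time Koffi adds per iteration is larger in absolute terms for Raylib than for rand or atoi, yet it is small relative to the work done by each iteration. The snippet only reuses the reference and Koffi iteration times from the tables, converted to nanoseconds; the variable names are illustrative.

```javascript
// Iteration times in nanoseconds, copied from the Linux tables (reference vs Koffi)
const results = [
    { name: 'rand', refNs: 700, koffiNs: 1152 },
    { name: 'atoi', refNs: 1028, koffiNs: 1730 },
    { name: 'raylib', refNs: 26300, koffiNs: 28000 }
];

const summary = results.map(({ name, refNs, koffiNs }) => ({
    name,
    // Absolute extra time per iteration
    extraNs: koffiNs - refNs,
    // The same extra time, relative to the reference
    overheadPct: (koffiNs / refNs - 1) * 100
}));

for (const row of summary)
    console.log(`${row.name}: +${row.extraNs} ns (+${row.overheadPct.toFixed(1)}%)`);
```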
Windows x86_64#
The results presented below were measured on my x86_64 Windows machine (Intel® Core™ i5-4460).
rand results#
This test is based around repeated calls to a simple standard C function, rand, and has three implementations:
the first one is the reference: it calls rand through an N-API module, and is close to the theoretical limit of a perfect (no overhead) Node.js > C FFI implementation (pre-compiled static glue code)
the second one calls rand through Koffi
the third one calls rand through node-ffi-napi
| Benchmark | Iteration time | Relative performance | Overhead |
|---|---|---|---|
| rand_napi | 859 ns | x1.00 | (ref) |
| rand_koffi | 1352 ns | x0.64 | +57% |
| rand_node_ffi | 35640 ns | x0.02 | +4048% |
Because rand is a pretty small function, the FFI overhead is clearly visible.
atoi results#
This test is similar to the rand one, but it is based on atoi, which takes a string parameter. JavaScript (V8) to C string conversion is relatively slow and heavy.
The results below were measured on my x86_64 Windows machine (Intel® Core™ i5-4460):
| Benchmark | Iteration time | Relative performance | Overhead |
|---|---|---|---|
| atoi_napi | 1336 ns | x1.00 | (ref) |
| atoi_koffi | 2440 ns | x0.55 | +83% |
| atoi_node_ffi | 136890 ns | x0.010 | +10144% |
Because atoi is a pretty small function, the FFI overhead is clearly visible.
Raylib results#
This benchmark uses the CPU-based image drawing functions in Raylib. The calls are much heavier than in the atoi benchmark, thus the FFI overhead is reduced. In this implementation, Koffi is compared to:
node-raylib (reference): This is a native wrapper implemented with N-API
raylib_cc: C++ implementation of the benchmark, without any JavaScript
| Benchmark | Iteration time | Relative performance | Overhead |
|---|---|---|---|
| raylib_cc | 18.2 µs | x1.50 | -33% |
| raylib_node_raylib | 27.3 µs | x1.00 | (ref) |
| raylib_koffi | 29.8 µs | x0.92 | +9% |
| raylib_node_ffi | 96.3 µs | x0.28 | +253% |
Running benchmarks#
Please note that all benchmark results on this page were obtained with Clang-built binaries.
cd koffi
node ../../cnoke/cnoke.js --prefer-clang

cd koffi/benchmark
node ../../cnoke/cnoke.js --prefer-clang
Once everything is built and ready, run:
node benchmark.js