Benchmarks#
Overview#
Here is a quick overview of the execution time of Koffi calls on three benchmarks, where Koffi is compared to a theoretical ideal FFI implementation (approximated with pre-compiled static N-API glue code):
- The first benchmark is based on `rand()` calls
- The second benchmark is based on `atoi()` calls
- The third benchmark is based on Raylib
These results are detailed and explained below, and compared to node-ffi/node-ffi-napi.
Linux x86_64#
The results presented below were measured on my x86_64 Linux machine (Intel® Core™ i5-4460).
rand results#
This test is based around repeated calls to a simple standard C function, `rand()`, and has three implementations:
- the first one is the reference: it calls rand through an N-API module, and is close to the theoretical limit of a perfect (no overhead) Node.js > C FFI implementation (pre-compiled static glue code)
- the second one calls rand through Koffi (sketched below)
- the third one uses the node-ffi-napi package
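For illustration, here is a minimal sketch of what the Koffi variant can look like; the library path and the exact prototype string are assumptions for this example, and the real benchmark code lives in `koffi/benchmark`:

```js
const koffi = require('koffi');

// Library name is platform-specific; 'libc.so.6' is the usual C runtime on Linux
const lib = koffi.load('libc.so.6');

// Declare the prototype once, then call it like a regular JS function
const rand = lib.func('int rand()');

console.log(rand());
```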
| Benchmark | Iteration time | Relative performance | Overhead |
| --- | --- | --- | --- |
| rand_napi | 842 ns | x1.00 | (ref) |
| rand_koffi | 1114 ns | x0.76 | +32% |
| rand_node_ffi | 44845 ns | x0.02 | +5224% |
Because rand is a pretty small function, the FFI overhead is clearly visible.
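Iteration time in the table above is the average time per call. Here is a hedged sketch of how such a number can be measured, reusing the `rand` function declared in the sketch above; the actual harness in `benchmark.js` may differ:

```js
// Hypothetical micro-benchmark loop: average nanoseconds per rand() call
const ITERATIONS = 10_000_000;

const start = process.hrtime.bigint();
for (let i = 0; i < ITERATIONS; i++)
    rand();
const end = process.hrtime.bigint();

console.log(`${Number(end - start) / ITERATIONS} ns per iteration`);
```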
atoi results#
This test is similar to the rand one, but it is based on `atoi()`, which takes a string parameter. JavaScript (V8) to C string conversion is relatively slow and heavy.
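As above, a minimal sketch of the Koffi side of this benchmark; the library path and the prototype string are assumptions for this example:

```js
const koffi = require('koffi');

const lib = koffi.load('libc.so.6');

// The JS string passed to atoi() must be converted to a NUL-terminated C string
// on every call, which is the extra cost this benchmark highlights
const atoi = lib.func('int atoi(const char *str)');

console.log(atoi('1024')); // prints 1024
```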
| Benchmark | Iteration time | Relative performance | Overhead |
| --- | --- | --- | --- |
| atoi_napi | 921 ns | x1.00 | (ref) |
| atoi_koffi | 1357 ns | x0.68 | +47% |
| atoi_node_ffi | 152550 ns | x0.006 | +16472% |
Because atoi is a pretty small function, the FFI overhead is clearly visible.
Raylib results#
This benchmark uses the CPU-based image drawing functions in Raylib. The calls are much heavier than in the previous benchmarks, thus the relative FFI overhead is lower. In this benchmark, Koffi is compared to:
- raylib_node_raylib (reference): the node-raylib wrapper, a native binding implemented with N-API
- raylib_cc: a full C++ version of the code, without any JavaScript
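For reference, here is a hedged sketch of how such CPU-side Raylib functions can be declared with Koffi. The library name, the chosen functions and the struct layouts are assumptions for illustration, not the exact code used by the benchmark (which lives in `koffi/benchmark`):

```js
const koffi = require('koffi');

// Adjust the library name for your platform ('raylib.dll', 'libraylib.so', ...)
const lib = koffi.load('libraylib.so');

// Struct layouts mirroring raylib's Color and Image
const Color = koffi.struct('Color', {
    r: 'uchar', g: 'uchar', b: 'uchar', a: 'uchar'
});
const Image = koffi.struct('Image', {
    data: 'void *', width: 'int', height: 'int', mipmaps: 'int', format: 'int'
});

// CPU-side image functions (no window or GPU needed)
const GenImageColor = lib.func('Image GenImageColor(int width, int height, Color color)');
const ImageDrawPixel = lib.func('void ImageDrawPixel(_Inout_ Image *dst, int posX, int posY, Color color)');
const UnloadImage = lib.func('void UnloadImage(Image image)');

let img = GenImageColor(320, 240, { r: 0, g: 0, b: 0, a: 255 });
ImageDrawPixel(img, 10, 10, { r: 255, g: 0, b: 0, a: 255 });
UnloadImage(img);
```

Struct-heavy calls like these do much more work per call than rand or atoi, which is why the relative FFI overhead shrinks in this benchmark.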
| Benchmark | Iteration time | Relative performance | Overhead |
| --- | --- | --- | --- |
| raylib_cc | 215.7 µs | x1.20 | -17% |
| raylib_node_raylib | 258.9 µs | x1.00 | (ref) |
| raylib_koffi | 311.6 µs | x0.83 | +20% |
| raylib_node_ffi | 928.4 µs | x0.28 | +259% |
Windows x86_64#
The results presented below were measured on my x86_64 Windows machine (Intel® Core™ i5-4460).
rand results#
This test is based around repeated calls to a simple standard C function, `rand()`, and has three implementations:
- the first one is the reference: it calls rand through an N-API module, and is close to the theoretical limit of a perfect (no overhead) Node.js > C FFI implementation (pre-compiled static glue code)
- the second one calls rand through Koffi
- the third one uses the node-ffi-napi package
| Benchmark | Iteration time | Relative performance | Overhead |
| --- | --- | --- | --- |
| rand_napi | 964 ns | x1.00 | (ref) |
| rand_koffi | 1274 ns | x0.76 | +32% |
| rand_node_ffi | 42300 ns | x0.02 | +4289% |
Because rand is a pretty small function, the FFI overhead is clearly visible.
atoi results#
This test is similar to the rand one, but it is based on `atoi()`, which takes a string parameter. JavaScript (V8) to C string conversion is relatively slow and heavy.
| Benchmark | Iteration time | Relative performance | Overhead |
| --- | --- | --- | --- |
| atoi_napi | 1415 ns | x1.00 | (ref) |
| atoi_koffi | 2193 ns | x0.65 | +55% |
| atoi_node_ffi | 168300 ns | x0.008 | +11792% |
Because atoi is a pretty small function, the FFI overhead is clearly visible.
Raylib results#
This benchmark uses the CPU-based image drawing functions in Raylib. The calls are much heavier than in the atoi benchmark, thus the relative FFI overhead is lower. In this benchmark, Koffi is compared to:
- raylib_node_raylib (reference): the node-raylib wrapper, a native binding implemented with N-API
- raylib_cc: a C++ implementation of the benchmark, without any JavaScript
| Benchmark | Iteration time | Relative performance | Overhead |
| --- | --- | --- | --- |
| raylib_cc | 211.8 µs | x1.25 | -20% |
| raylib_node_raylib | 264.4 µs | x1.00 | (ref) |
| raylib_koffi | 318.9 µs | x0.83 | +21% |
| raylib_node_ffi | 1146.2 µs | x0.23 | +334% |
Please note that in order to get fair numbers, raylib_node_raylib was recompiled with clang-cl before running the benchmark, using the following commands:
```
cd node_modules\raylib
rmdir /S /Q bin build
npx cmake-js compile -t ClangCL
```
Running benchmarks#
Open a console, go to `koffi/benchmark` and run `../../cnoke/cnoke.js` (or `node ..\..\cnoke\cnoke.js` on Windows) before doing anything else.
Please note that all benchmark results are obtained with Clang-built binaries.
```
cd koffi/benchmark
node ../../cnoke/cnoke.js --prefer-clang
```
Once everything is built and ready, run:
```
node benchmark.js
```