Benchmarks

Overview

Here is a quick overview of the execution time of Koffi calls on three benchmarks, where Koffi is compared to a theoretically ideal FFI implementation (approximated with pre-compiled static N-API glue code):

  • The first benchmark is based on rand() calls

  • The second benchmark is based on atoi() calls

  • The third benchmark is based on Raylib

[Figures: Linux x86_64 performance; Windows x86_64 performance]

These results are detailed and explained below, and compared to node-ffi/node-ffi-napi.

Linux x86_64

The results presented below were measured on my x86_64 Linux machine (Intel® Core™ i5-4460).

rand results

This test is based around repeated calls to a simple standard C function rand, and has three implementations:

  • the first one is the reference: it calls rand through a pre-compiled N-API module with static glue code, which is close to the theoretical limit of a perfect (no overhead) Node.js-to-C FFI implementation

  • the second one calls rand through Koffi

  • the third one uses node-ffi-napi, a widely used FFI package for Node.js
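
A per-iteration time like the ones reported below can be obtained with a simple timing loop. The following is only a minimal sketch and not the benchmark code shipped with Koffi: `measure` is a hypothetical helper, and `Math.random` stands in for the bound rand function so that the snippet runs without any native module.

```javascript
// Hypothetical helper: average wall-clock time per iteration of fn, in nanoseconds
function measure(fn, iterations) {
    // Warm-up pass, so that V8 gets a chance to optimize the call path
    for (let i = 0; i < 1000; i++)
        fn();

    const start = process.hrtime.bigint();
    for (let i = 0; i < iterations; i++)
        fn();
    const elapsed = process.hrtime.bigint() - start;

    return Number(elapsed) / iterations;
}

// Stand-in for the FFI-bound rand(); swap in a real binding to compare implementations
const nsPerIteration = measure(() => Math.random(), 1000000);
console.log(`${nsPerIteration.toFixed(1)} ns per iteration`);
```

Running the same loop against each implementation (N-API module, Koffi, node-ffi-napi) would give comparable figures, as long as the iteration count and the work done per iteration are identical.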

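| Benchmark     | Iteration time | Relative performance | Overhead |
|---------------|----------------|----------------------|----------|
| rand_napi     | 700 ns         | x1.00                | (ref)    |
| rand_koffi    | 1152 ns        | x0.61                | +64%     |
| rand_node_ffi | 32750 ns       | x0.02                | +4576%   |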

Because rand is a pretty small function, the FFI overhead is clearly visible.
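
The two derived columns can be related to the iteration times with a little arithmetic. The sketch below assumes that relative performance is the reference time divided by the measured time, and that overhead is the extra time expressed as a fraction of the reference time; `compare` is a hypothetical helper name used only for this illustration.

```javascript
// Hypothetical helper relating the derived columns to raw iteration times
function compare(refTime, time) {
    return {
        relative: refTime / time,             // 1.00 for the reference itself
        overhead: (time - refTime) / refTime  // extra time as a fraction of the reference
    };
}

// Linux rand figures, in nanoseconds: rand_napi (reference) and rand_koffi
const randKoffi = compare(700, 1152);

console.log(`x${randKoffi.relative.toFixed(2)}`); // prints "x0.61"
console.log(`+${(randKoffi.overhead * 100).toFixed(1)}%`);
```

Recomputing from the rounded iteration times shown on this page can differ from the published percentages in the last digit, presumably because the published values come from unrounded measurements.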

atoi results

This test is similar to the rand one, but it is based on atoi, which takes a string parameter. Converting a JavaScript (V8) string to a C string is relatively slow and heavy.

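| Benchmark     | Iteration time | Relative performance | Overhead |
|---------------|----------------|----------------------|----------|
| atoi_napi     | 1028 ns        | x1.00                | (ref)    |
| atoi_koffi    | 1730 ns        | x0.59                | +68%     |
| atoi_node_ffi | 121670 ns      | x0.008               | +11738%  |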

Because atoi is a pretty small function, the FFI overhead is clearly visible.

Raylib results

This benchmark uses the CPU-based image drawing functions in Raylib. The calls are much heavier than in the previous benchmarks, so the relative FFI overhead is smaller. In this benchmark, Koffi is compared to:

  • node-raylib (baseline): a native wrapper implemented with N-API

  • raylib_cc: a full C++ version of the code (no JS)

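| Benchmark          | Iteration time | Relative performance | Overhead |
|--------------------|----------------|----------------------|----------|
| raylib_cc          | 18.5 µs        | x1.42                | -30%     |
| raylib_node_raylib | 26.3 µs        | x1.00                | (ref)    |
| raylib_koffi       | 28.0 µs        | x0.94                | +6%      |
| raylib_node_ffi    | 87.0 µs        | x0.30                | +230%    |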

Windows x86_64

The results presented below were measured on my x86_64 Windows machine (Intel® Core™ i5-4460).

rand results

This test is based around repeated calls to a simple standard C function rand, and has three implementations:

  • the first one is the reference: it calls rand through a pre-compiled N-API module with static glue code, which is close to the theoretical limit of a perfect (no overhead) Node.js-to-C FFI implementation

  • the second one calls rand through Koffi

  • the third one uses node-ffi-napi, a widely used FFI package for Node.js

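| Benchmark     | Iteration time | Relative performance | Overhead |
|---------------|----------------|----------------------|----------|
| rand_napi     | 859 ns         | x1.00                | (ref)    |
| rand_koffi    | 1352 ns        | x0.64                | +57%     |
| rand_node_ffi | 35640 ns       | x0.02                | +4048%   |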

Because rand is a pretty small function, the FFI overhead is clearly visible.

atoi results

This test is similar to the rand one, but it is based on atoi, which takes a string parameter. Converting a JavaScript (V8) string to a C string is relatively slow and heavy.

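| Benchmark     | Iteration time | Relative performance | Overhead |
|---------------|----------------|----------------------|----------|
| atoi_napi     | 1336 ns        | x1.00                | (ref)    |
| atoi_koffi    | 2440 ns        | x0.55                | +83%     |
| atoi_node_ffi | 136890 ns      | x0.010               | +10144%  |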

Because atoi is a pretty small function, the FFI overhead is clearly visible.

Raylib results

This benchmark uses the CPU-based image drawing functions in Raylib. The calls are much heavier than in the rand and atoi benchmarks, so the relative FFI overhead is smaller. In this benchmark, Koffi is compared to:

  • node-raylib (baseline): a native wrapper implemented with N-API

  • raylib_cc: a C++ implementation of the benchmark, without any JavaScript

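| Benchmark          | Iteration time | Relative performance | Overhead |
|--------------------|----------------|----------------------|----------|
| raylib_cc          | 18.2 µs        | x1.50                | -33%     |
| raylib_node_raylib | 27.3 µs        | x1.00                | (ref)    |
| raylib_koffi       | 29.8 µs        | x0.92                | +9%      |
| raylib_node_ffi    | 96.3 µs        | x0.28                | +253%    |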

Running benchmarks

Please note that all benchmark results on this page were obtained with Clang-built binaries.

    cd koffi
    node ../../cnoke/cnoke.js --prefer-clang

    cd koffi/benchmark
    node ../../cnoke/cnoke.js --prefer-clang

Once everything is built and ready, run:

    node benchmark.js