Completely Unscientific Benchmarks

Naive performance comparison of a few programming languages (JavaScript, Kotlin, Rust, Swift, Nim, Python, Go, Haskell, D, C++, Java, C#, Object Pascal, Ada, Lua, Ruby).



There are three kinds of lies: lies, damned lies, and statistics.

For this benchmark we implemented a Treap
in a few classic (C++, Java, Python) and hyped (JavaScript, Kotlin, Swift, Rust)
programming languages and tested their performance on Linux, Mac OS, and
Windows (all of them running on different hardware, so the results should not
be compared between platforms).

This turned out to be a good benchmark of memory-intensive operations, which
should push the memory management implementations to their limits.

First, we tried to play by the rules of the garbage-collected languages, so
there are "ref-counted" versions of the C++ and Rust implementations; but we
still wanted to compare the results against idiomatic (i.e. common-practice)
implementations, hence the C++ "raw-pointers" and Rust "idiomatic" versions.
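The difference between the two C++ variants boils down to the node layout. The following sketch (names are illustrative, not the repository's exact code) contrasts the two approaches:

```cpp
#include <memory>

// "raw-pointers" variant: manual ownership, as in performance-oriented C++.
struct RawNode {
    int x, y;
    RawNode *left = nullptr, *right = nullptr;
};

// "ref-counted" variant: std::shared_ptr mimics what garbage-collected
// languages provide automatically, at the cost of reference-count updates
// on every copy and larger nodes (each shared_ptr stores an extra
// control-block pointer).
struct RcNode {
    int x, y;
    std::shared_ptr<RcNode> left, right;
};
```

The ref-counted layout is what lets the C++ numbers be compared with the garbage-collected languages on roughly equal footing.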

I must say that most of the implementations (except for C++, Haskell, and OCaml)
were created by mostly adapting the syntax from the very first
implementation of the algorithm in Kotlin. Even Rust, which is considered to
have the steepest learning curve among the tested languages, didn't require any
"black magic" (the solution requires neither unsafe code nor lifetime
annotations). C++ was implemented separately, so it takes a few shortcuts, and
thus it might not be a completely fair comparison (I will try to implement a
"fair" C++ solution and also a "C++"-like Rust solution to see if the
performance can be on par).

Metrics

We define the "naive" implementations as those that a developer with enough
experience in a given language would write as a baseline "good enough"
solution, where correctness is more important than performance.

However, experienced developers in system programming languages (e.g. C, C++, D)
tend to work comfortably with raw pointers, and that makes the comparison of the
solutions only by speed and memory consumption unfair. High-level abstractions
tend to introduce some performance hit in exchange for safety and
expressiveness. Thus, we added other metrics: "Expressiveness" (1 - pure magic,
10 - easy to get started and express your intent) and "Maintenance Complexity"
(1 - easy to maintain, 5 - ugly yet safe, 6-10 - hard to keep it right, i.e.
risky). The ease of maintenance is estimated for a big project using the given
language and the given approach.

Thus, here are the metrics:

  • Expressiveness (e12s), scores from 1 to 10 - higher value is better
    (keep in mind that this is a subjective metric based on the author's
    experience!)
  • Maintenance Complexity (M.C.), scores from 1 to 10 - smaller value is
    better (keep in mind that this is a subjective metric based on the author's
    experience!)
  • Real Time, seconds - smaller value is better
  • Slowdown (run time relative to the best tuned solution) - smaller
    value is better
  • Memory, megabytes - smaller value is better
  • Binary Size, megabytes - smaller value is better

Measurements

To measure time we used the time utility on Mac OS and Windows (msys2
environment), and cgmemtime on Linux.

Memory measurement was only available on Linux via the cgmemtime utility,
which leverages cgroups to capture the high-water RSS+CACHE memory usage.
Given the limitations of the cgroup subsystem (it counts caches and loaded
shared objects unless they are already cached or loaded by other processes),
we take the lowest memory footprint among all the executions.
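A hypothetical measurement session might look like the following ("./main" stands in for a compiled benchmark binary and is a placeholder, not a path from the repository):

```shell
# Wall-clock time, as used on Mac OS and Windows (msys2):
time sleep 0.1               # e.g.: time ./main < input.txt

# High-water RSS+CACHE on Linux via cgmemtime (requires a one-time cgroup
# setup step, so the call is only shown commented out here):
# cgmemtime ./main < input.txt
```

Because cgmemtime accounts for the whole cgroup, repeating each run and keeping the minimum filters out memory charged for caches warmed by unrelated processes.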

Results

Originally, this benchmark had a goal to implement the same "natural" and
"naive" API in all the languages, with the exception of C++, which would
represent "bare metal" performance. Over time, we received optimized solutions
in other languages, but it doesn't seem fair to put them on the same
scoreboard. Thus, even though all the solutions implement the same algorithm,
the optimized ones were created with performance in mind and underwent quite
intensive profiling and tuning, and that is why they are presented in a
separate scoreboard.

All tables are sorted in alphabetical order.

"Naive" Implementations Scoreboard

Linux (Arch Linux, x64, Intel Core i7-4710HQ CPU)

Main metrics

Overview
Name With Owner: frol/completely-unscientific-benchmarks
Primary Language: C++
Program Language: Java (Language Count: 27)
Platform: Linux, Mac, Windows
License: Apache License 2.0

Owner Activity
Created At: 2018-05-12 08:32:39
Pushed At: 2021-06-22 14:21:55
Last Commit At: 2020-04-09 20:01:03
Release Count: 0

User Engagement
Stargazers Count: 550
Watchers Count: 25
Fork Count: 69
Commits Count: 117
Has Issues Enabled
Issues Count: 27
Issue Open Count: 12
Pull Requests Count: 48
Pull Requests Open Count: 4
Pull Requests Close Count: 12

Project Settings
Has Wiki Enabled
Is Archived
Is Fork
Is Locked
Is Mirror
Is Private