Completely Unscientific Benchmarks

Naive performance comparison of a few programming languages (JavaScript, Kotlin, Rust, Swift, Nim, Python, Go, Haskell, D, C++, Java, C#, Object Pascal, Ada, Lua, Ruby).



There are three kinds of lies: lies, damned lies, and statistics.

For this benchmark we implemented a Treap
in a few classic (C++, Java, Python) and hyped (JavaScript, Kotlin, Swift, Rust)
programming languages and tested their performance on Linux, Mac OS, and
Windows (all of them running on different hardware, so the results should not
be compared between platforms).
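To make the benchmarked data structure concrete, here is an illustrative Rust sketch of a Treap with the classic split/merge operations (this is not the repository's actual code; the node layout, the tiny xorshift PRNG, and the function names are assumptions made for the example — duplicates are not deduplicated):

```rust
use std::cmp::Ordering;

type Link = Option<Box<Node>>;

struct Node {
    key: i32,
    priority: u64, // random heap priority keeps the tree balanced in expectation
    left: Link,
    right: Link,
}

// A tiny xorshift PRNG so the sketch stays dependency-free
// (real implementations would typically use a library RNG).
fn next_priority(state: &mut u64) -> u64 {
    *state ^= *state << 13;
    *state ^= *state >> 7;
    *state ^= *state << 17;
    *state
}

// Split `t` into two treaps: keys < key, and keys >= key.
fn split(t: Link, key: i32) -> (Link, Link) {
    match t {
        None => (None, None),
        Some(mut node) => {
            if node.key < key {
                let (l, r) = split(node.right.take(), key);
                node.right = l;
                (Some(node), r)
            } else {
                let (l, r) = split(node.left.take(), key);
                node.left = r;
                (l, Some(node))
            }
        }
    }
}

// Merge two treaps where every key in `l` is <= every key in `r`.
fn merge(l: Link, r: Link) -> Link {
    match (l, r) {
        (None, r) => r,
        (l, None) => l,
        (Some(mut ln), Some(mut rn)) => {
            if ln.priority > rn.priority {
                ln.right = merge(ln.right.take(), Some(rn));
                Some(ln)
            } else {
                rn.left = merge(Some(ln), rn.left.take());
                Some(rn)
            }
        }
    }
}

// Insert = split around the key, then merge the pieces back with the new node.
fn insert(t: Link, key: i32, rng: &mut u64) -> Link {
    let (l, r) = split(t, key);
    let node = Some(Box::new(Node {
        key,
        priority: next_priority(rng),
        left: None,
        right: None,
    }));
    merge(merge(l, node), r)
}

fn contains(t: &Link, key: i32) -> bool {
    match t {
        None => false,
        Some(n) => match key.cmp(&n.key) {
            Ordering::Equal => true,
            Ordering::Less => contains(&n.left, key),
            Ordering::Greater => contains(&n.right, key),
        },
    }
}

fn main() {
    let mut rng = 0x9E37_79B9_7F4A_7C15u64;
    let mut root: Link = None;
    for k in [5, 1, 9, 3] {
        root = insert(root, k, &mut rng);
    }
    assert!(contains(&root, 3));
    assert!(!contains(&root, 4));
    println!("ok");
}
```

Every insert allocates a node and every split/merge walks pointer-linked heap nodes, which is what makes this workload stress the memory manager rather than the CPU.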

This turned out to be a good benchmark of memory-intensive operations, which
should push memory management implementations to their limits.

First, we tried to play by the rules of the garbage-collected languages, so
there are "ref-counted" versions of the C++ and Rust implementations; but we
still wanted to compare the results with idiomatic (a.k.a. common-practice)
implementations for C++ ("raw-pointers") and Rust ("idiomatic").
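The difference between the two styles can be sketched in Rust with two node definitions (an illustrative sketch, not the repository's code — the type and field names are made up for the example):

```rust
use std::cell::RefCell;
use std::rc::Rc;

// "Ref-counted" style: children are reference-counted and interior-mutable,
// mirroring the shared-ownership model a garbage-collected language gives
// you for free, but paying for the counter updates at runtime.
struct RcNode {
    key: i32,
    left: Option<Rc<RefCell<RcNode>>>,
    right: Option<Rc<RefCell<RcNode>>>,
}

// "Idiomatic" style: unique ownership via Box; no reference counting, and
// nodes are freed deterministically when their parent drops them.
struct BoxNode {
    key: i32,
    left: Option<Box<BoxNode>>,
    right: Option<Box<BoxNode>>,
}

fn main() {
    let shared = Rc::new(RefCell::new(RcNode { key: 1, left: None, right: None }));
    let alias = Rc::clone(&shared); // two owners of the same node
    alias.borrow_mut().key = 2;     // mutation is visible through both handles
    assert_eq!(shared.borrow().key, 2);

    let owned = Box::new(BoxNode { key: 1, left: None, right: None });
    assert_eq!(owned.key, 1);       // exactly one owner; no runtime counting
    println!("ok");
}
```

The Box-based version is what the prose calls "idiomatic" Rust, and notably it needs neither unsafe code nor lifetime annotations.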

I must say that most of the implementations (except for C++, Haskell, and OCaml)
were created mostly by adapting the syntax from the very first implementation
of the algorithm, written in Kotlin. Even Rust, which is considered to have the
steepest learning curve among the tested languages, didn't require any
"black magic" (the solution requires neither unsafe code nor lifetime
annotations). C++ was implemented separately, so it takes a few shortcuts, and
thus it might not be a completely fair comparison (I will try to implement a
"fair" C++ solution and also a "C++"-like Rust solution to see if the
performance can be on par).

Metrics

We define the "naive" implementations as those which a developer with enough
experience in a given language would implement as a baseline "good enough"
solution where correctness is more important than performance.

However, experienced developers in system programming languages (e.g. C, C++, D)
tend to work comfortably with raw pointers, and that makes the comparison of the
solutions only by speed and memory consumption unfair. High-level abstractions
tend to introduce some performance hit in exchange for safety and
expressiveness. Thus, we added other metrics: "Expressiveness" (1 - pure magic,
10 - easy to get started and express your intent) and "Maintenance Complexity"
(1 - easy to maintain, 5 - ugly yet safe, 6-10 - hard to keep it right, i.e.
risky). The ease of maintenance is estimated for a big project using the given
language and the given approach.

Thus, here are the metrics:

  • Expressiveness (e12s), scores from 1 to 10 - higher value is better
    (keep in mind that this is a subjective metric based on the author's
    experience!)
  • Maintenance Complexity (M.C.), scores from 1 to 10 - smaller value is
    better (keep in mind that this is a subjective metric based on the author's
    experience!)
  • Real Time, seconds - smaller value is better
  • Slowdown Time (relative speed compared to the best tuned solution) - smaller
    value is better
  • Memory, megabytes - smaller value is better
  • Binary Size, megabytes - smaller value is better

Measurements

To measure time we used the time utility on Mac OS and Windows (MSYS2
environment), and cgmemtime on Linux.

Memory measurement was only available on Linux via the cgmemtime utility, which
leverages cgroup capabilities to capture the high-water RSS+CACHE memory usage.
Given the limitations of the cgroup subsystem (it counts caches and loaded
shared objects unless they are already cached or loaded by other processes),
we take the lowest memory footprint among all the executions.

Results

Originally, this benchmark aimed to implement the same "natural" and "naive"
API in all the languages, with the exception of C++, which would represent
"bare metal" performance. Over time, we received optimized solutions in other
languages, but it doesn't seem fair to put them on the same scoreboard. Thus,
even though all the solutions implement the same algorithm, the optimized ones
were created with performance in mind and received quite intensive profiling
and tuning, which is why they are presented in a separate scoreboard.

All tables are sorted in alphabetical order.

"Naive" Implementations Scoreboard

Linux (Arch Linux, x64, Intel Core i7-4710HQ CPU)

Key Metrics

Overview

  • Name & owner: frol/completely-unscientific-benchmarks
  • Main programming language: C++
  • Programming languages: Java (language count: 27)
  • Platforms: Linux, Mac, Windows
  • License: Apache License 2.0

Owner Activity

  • Created: 2018-05-12 08:32:39
  • Pushed: 2021-06-22 14:21:55
  • Last commit: 2020-04-09 20:01:03
  • Releases: 0

User Engagement

  • Stars: 551
  • Watchers: 25
  • Forks: 67
  • Commits: 117
  • Issues enabled?
  • Issues: 27
  • Open issues: 12
  • Pull requests: 48
  • Open pull requests: 4
  • Closed pull requests: 12

Project Settings

  • Wiki enabled?
  • Archived?
  • Is fork?
  • Locked?
  • Is mirror?
  • Is private?