despacer

C library to remove white space from strings as fast as possible

Fast C library to remove white space from strings (also called "strip white space")

We want to remove the space (' ') and the line-ending characters ('\n', '\r') from a string
as fast as possible. To avoid unnecessary allocations, we do the processing in place.

Let us consider any array of bytes representing a string in one of these encodings:

  • UTF-8
  • ASCII
  • Any of the 8-bit ASCII supersets such as Latin1
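
In all of these encodings, the bytes for ' ', '\n' and '\r' (0x20, 0x0A, 0x0D) never appear inside a multi-byte character, so the string can be filtered byte by byte. As a point of reference, here is a minimal scalar sketch of in-place removal. The name despace_sketch is hypothetical and this is not necessarily how the library's own functions are written; it only illustrates the task.

```c
#include <stddef.h>

/* Hypothetical helper (an assumption for illustration, not the library's code):
   removes ' ', '\n' and '\r' from bytes[0..howmany) in place.
   Returns the number of bytes kept; the caller re-terminates the string if needed. */
size_t despace_sketch(char *bytes, size_t howmany) {
    size_t pos = 0; /* next write position, always <= read position i */
    for (size_t i = 0; i < howmany; i++) {
        char c = bytes[i];
        if (c == ' ' || c == '\n' || c == '\r') {
            continue; /* skip white space */
        }
        bytes[pos++] = c; /* keep everything else, compacting to the front */
    }
    return pos;
}
```

Because the write position never overtakes the read position, no temporary buffer is needed. For a C string, call it with strlen(s) and then set s[result] = '\0'.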

How fast can we go?

Blog post:
http://lemire.me/blog/2017/01/20/how-quickly-can-you-remove-spaces-from-a-string/

Usage:

make
./despacebenchmark

Note that clang seems to give better results than gcc.

Sample results:

$ ./despacebenchmark
pointer alignment = 16 bytes
memcpy(tmpbuffer,buffer,N):  0.111328 cycles / ops
countspaces(buffer, N):  3.687500 cycles / ops
despace(buffer, N):  5.337891 cycles / ops
faster_despace(buffer, N):  1.689453 cycles / ops
despace64(buffer, N):  2.429688 cycles / ops
despace_to(buffer, N, tmpbuffer):  5.585938 cycles / ops
avx2_countspaces(buffer, N):  0.367188 cycles / ops
avx2_despace(buffer, N):  3.990234 cycles / ops
avx2_despace_branchless(buffer, N):  0.593750 cycles / ops
avx2_despace_branchless_u2(buffer, N):  0.535156 cycles / ops
sse4_despace(buffer, N):  0.734375 cycles / ops
sse4_despace_branchless(buffer, N):  0.384766 cycles / ops
sse4_despace_branchless_u2(buffer, N):  0.380859 cycles / ops
sse4_despace_branchless_u4(buffer, N):  0.351562 cycles / ops
sse4_despace_trail(buffer, N):  1.142578 cycles / ops
sse42_despace_branchless(buffer, N):  0.763672 cycles / ops
sse42_despace_branchless_lookup(buffer, N):  0.673828 cycles / ops
sse42_despace_to(buffer, N,tmpbuffer):  1.703125 cycles / ops

Each figure is the number of CPU cycles used to despace one byte of input (lower is better).
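
To relate a cycles-per-byte figure to throughput, divide the clock frequency by it. The sketch below does that conversion; the function names are hypothetical, and the clock frequency is something you supply yourself, since the benchmark output above does not report one.

```c
#include <stdio.h>

/* Hypothetical helper: converts cycles per byte into bytes per second,
   given an assumed fixed clock frequency in Hz. */
double bytes_per_second(double cycles_per_byte, double clock_hz) {
    return clock_hz / cycles_per_byte;
}

/* Prints the estimated throughput in GB/s (1 GB = 1e9 bytes) for one benchmark line. */
void print_throughput(const char *name, double cycles_per_byte, double clock_hz) {
    printf("%s: %.2f GB/s\n", name,
           bytes_per_second(cycles_per_byte, clock_hz) / 1e9);
}
```

For instance, at an assumed 3 GHz clock, 0.5 cycles per byte corresponds to 6 GB/s. Real processors change frequency under load, so treat such numbers as rough estimates.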

Related work

Key metrics

Overview
Name and owner: lemire/despacer
Primary language: C
Languages: Makefile (number of languages: 6)
License: BSD 3-Clause "New" or "Revised" License
Owner activity
Created: 2017-01-19 18:50:11
Last push: 2024-09-05 23:57:39
Last commit: 2024-09-05 19:57:38
Releases: 0
User engagement
Stars: 152
Watchers: 10
Forks: 15
Commits: 114
Issues: 9
Open issues: 3
Pull requests: 9
Open pull requests: 3
Closed pull requests: 3