rabin

Node native addon for Rabin fingerprinting data streams

  • Owner: dat-ecosystem-archive/rabin

Node native addon module (C/C++) for Rabin fingerprinting data streams.

Note: This implementation is not currently used by DAT and is not maintained, but it works and may come in handy in the future.

Uses the implementation of Rabin fingerprinting from LBFS.

Rabin fingerprinting is useful for finding the chunks of a file that differ from a previous version. It is one implementation of a technique called "content-defined chunking", meaning the chunk boundaries are determined by the content itself (as opposed to "fixed-size chunking", where boundaries fall at fixed byte offsets).
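
To make the idea of content-defined chunking concrete, here is a minimal sketch in JavaScript. It is not the LBFS Rabin polynomial that this module uses: a toy rolling sum over the last few bytes stands in for the fingerprint. The function name chunkBoundaries and the windowSize option are hypothetical; bits, min and max borrow the names of the CLI flags, but how they are interpreted here is an assumption.

```javascript
// Toy content-defined chunker (illustrative only, NOT the LBFS Rabin algorithm).
// Returns the end offset of every chunk in `buf`.
function chunkBoundaries (buf, opts) {
  opts = opts || {}
  var bits = opts.bits || 6              // assumed: cut when the low `bits` bits of the hash are all 1
  var min = opts.min || 16               // assumed: smallest chunk allowed (keep min >= windowSize)
  var max = opts.max || 256              // assumed: force a cut at this length
  var windowSize = opts.windowSize || 16 // hypothetical: bytes covered by the rolling sum
  var mask = (1 << bits) - 1
  var cuts = []
  var start = 0
  var hash = 0

  for (var i = 0; i < buf.length; i++) {
    // Rolling sum: add the incoming byte, drop the byte that left the window.
    hash += buf[i]
    if (i - start >= windowSize) hash -= buf[i - windowSize]

    var len = i - start + 1
    var atFingerprint = len >= min && (hash & mask) === mask
    if (atFingerprint || len >= max) {
      cuts.push(i + 1)
      start = i + 1
      hash = 0
    }
  }

  // Whatever is left after the last cut becomes the final chunk.
  if (start < buf.length) cuts.push(buf.length)
  return cuts
}

var sample = Buffer.from('the quick brown fox jumps over the lazy dog, again and again')
console.log(chunkBoundaries(sample, { bits: 3, min: 4, max: 32, windowSize: 4 }))
```

Because a cut is decided only by the bytes inside the rolling window (plus the min and max limits), inserting data near the start of a file tends to move later boundaries along with the content. Fixed-size chunking, by contrast, would shift every subsequent chunk.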

There's a JavaScript API and an accompanying command-line tool.

JavaScript API

var createRabin = require('rabin')

createRabin can be called repeatedly to create multiple fingerprinting streams.

var rabin = createRabin()

rabin is a duplex stream. You write raw data in, and buffers chunked by rabin fingerprints will be written out.

JavaScript Example

// require fs and create a rabin instance
var fs = require('fs')
var rabin = require('rabin')()

// pipe some data in
var rs = fs.createReadStream('somefile.dat')
rs.pipe(rabin)

// handle output chunks
rabin.on('data', function (chunk) {
  // chunks are created by taking your input data
  // and splitting on each rabin fingerprint found
})
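
The 'data' handler above is left empty. One possible way to fill it in, sketched here, is to gather the output chunks and check that concatenating them reproduces the input. collectChunks is a hypothetical helper, and a PassThrough stream stands in for the rabin stream so that the snippet runs without the native addon; with the real module you would pass the rabin instance instead.

```javascript
var stream = require('stream')

// Hypothetical helper: resolve with every chunk a readable stream emits.
function collectChunks (readable) {
  return new Promise(function (resolve, reject) {
    var chunks = []
    readable.on('data', function (chunk) { chunks.push(chunk) })
    readable.on('end', function () { resolve(chunks) })
    readable.on('error', reject)
  })
}

// Stand-in for require('rabin')(); a PassThrough forwards data unchanged.
var demo = new stream.PassThrough()

collectChunks(demo).then(function (chunks) {
  // Joining the chunks back together should give the original bytes.
  console.log(Buffer.concat(chunks).toString())
})

demo.end('hello world')
```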

CLI API

$ npm install rabin -g
$ rabin myfile.txt --bits=14 --min=8192 --max=32768 # defaults
{"length":12182,"offset":0,"hash":"5df6245b5897336ebf611d7f10fb90eea2d63c5b9ec9ad76dfb1ac72b8249dcb"}
{"length":13190,"offset":12182,"hash":"67d5aaac9cf7b8432cb3c8071d726dc38f1138957c30719f8b166116a90950a1"}
{"length":11609,"offset":25372,"hash":"976a0e3dc43de3abdf50b984a102c5fb7c2550e3dc5e44e4a8f7d4241276683b"}
{"length":10010,"offset":36981,"hash":"7145d10f93ea03e6c8b4dd5ab148e2c3c08f9c71bf71c7559dffdfcef48112c1"}
{"length":13623,"offset":46991,"hash":"76470d5047f9fb31bd75364d90355fdbf913aaa1df934251f43c894f01381f1b"}
{"length":8197,"offset":60614,"hash":"88abce05bc75f72cdafeabd5125eb46fa8f73eab2d75a29076aeb3f99ef35548"}
{"length":16242,"offset":68811,"hash":"08d60789c1e901d6a8e474aeb5de4746af1648e7f3a4ac7a3dba87d9e73fca56"}
{"length":14947,"offset":85053,"hash":"4224e6f4361fa8bdefb9d8e10ebd046e2869af2c44ea7e84c7efaeedd5423b30"}
average 12500
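
Each line of the output above describes one chunk, and each offset is the running total of the preceding lengths. A rough sketch of how similar records could be built from chunks emitted by the JavaScript API follows. describeChunks and newRecords are hypothetical helpers, and SHA-256 is an assumption based only on the 64-character hex digests shown; the hash function the CLI actually uses is not stated here.

```javascript
var crypto = require('crypto')

// Hypothetical helper: one record per chunk with its length, offset and digest.
// SHA-256 is assumed, not confirmed by the source.
function describeChunks (chunks) {
  var offset = 0
  return chunks.map(function (chunk) {
    var record = {
      length: chunk.length,
      offset: offset,
      hash: crypto.createHash('sha256').update(chunk).digest('hex')
    }
    offset += chunk.length
    return record
  })
}

// Hypothetical helper: records in `next` whose hash is absent from `prev`,
// i.e. the chunks that changed between two versions of a file.
function newRecords (prev, next) {
  var seen = new Set(prev.map(function (r) { return r.hash }))
  return next.filter(function (r) { return !seen.has(r.hash) })
}

var v1 = describeChunks([Buffer.from('abc'), Buffer.from('de')])
var v2 = describeChunks([Buffer.from('abc'), Buffer.from('xy')])
console.log(JSON.stringify(newRecords(v1, v2)))
```

Comparing the records of two versions this way only needs the hashes, so unchanged chunks never have to be read or transferred again.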

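Key metrics

Overview
Name and owner: dat-ecosystem-archive/rabin
Primary language: C++
Languages: Python (language count: 4)
Owner activity
Created: 2015-11-04 23:42:27
Pushed: 2022-01-06 01:54:32
Last commit: 2022-01-06 02:54:29
Releases: 9
Latest release: v2.0.0 (published 2019-05-14 09:10:16)
First release: v1.1.0 (published 2015-11-05 17:22:02)
User engagement
Stars: 147
Watchers: 13
Forks: 22
Commits: 82
Issues: 14
Open issues: 8
Pull requests: 12
Open pull requests: 2
Closed pull requests: 3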