jstream

Streaming JSON parser for Go

jstream is a streaming JSON parser and value extraction library for Go.

Unlike most JSON parsers, jstream is aware of both document position and depth. This allows values to be extracted at a specified depth without the overhead of allocating the encompassing arrays or objects. For example:

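Using an example document that consists of a top-level array of two objects, each holding a desc string and a colors array of strings, we can choose to extract and act only on the objects within the top-level array: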

f, err := os.Open("input.json")
if err != nil {
  log.Fatal(err) // stop early if the input file cannot be opened
}
defer f.Close()
decoder := jstream.NewDecoder(f, 1) // extract JSON values at a depth level of 1
for mv := range decoder.Stream() {
  fmt.Printf("%v\n", mv.Value)
}

output:

map[desc:RGB colors:[red green blue]]
map[desc:CMYK colors:[cyan magenta yellow black]]

likewise, increasing depth level to 3 yields:

red
green
blue
cyan
magenta
yellow
black

optionally, key:value pairs can be emitted as an individual struct:

decoder := jstream.NewDecoder(f, 2).EmitKV() // enable KV streaming at a depth level of 2

jstream.KV{desc RGB}
jstream.KV{colors [red green blue]}
jstream.KV{desc CMYK}
jstream.KV{colors [cyan magenta yellow black]}

Installing

go get github.com/bcicen/jstream

Commandline

jstream comes with a CLI tool for quick viewing of parsed values from JSON input:

cat input.json | jstream -v -d 1
depth	start	end	type   | value
1	004	069	object | {"colors":["red","green","blue"],"desc":"RGB"}
1	073	153	object | {"colors":["cyan","magenta","yellow","black"],"desc":"CMYK"}
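
Assuming the start and end columns are byte offsets into the input (the values above are consistent with that reading), the following sketch shows how such offsets can be derived with the standard library's `Decoder.InputOffset` (Go 1.15 or later). It is not the CLI's implementation; `span` and `elementSpans` are hypothetical names, and the input is assumed to be a top-level array.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// span records where one top-level array element sits in the input.
type span struct{ Start, End int64 }

// elementSpans returns the byte range of each element of a top-level array.
func elementSpans(data []byte) ([]span, error) {
	dec := json.NewDecoder(bytes.NewReader(data))
	tok, err := dec.Token() // consume the opening '['
	if err != nil {
		return nil, err
	}
	if tok != json.Delim('[') {
		return nil, fmt.Errorf("expected top-level array, got %v", tok)
	}
	var spans []span
	for dec.More() {
		start := dec.InputOffset()
		// Skip the separator and whitespace that precede the element.
		for start < int64(len(data)) && bytes.IndexByte([]byte(" \t\r\n,"), data[start]) >= 0 {
			start++
		}
		var raw json.RawMessage // keep the element undecoded; only its extent matters here
		if err := dec.Decode(&raw); err != nil {
			return nil, err
		}
		spans = append(spans, span{Start: start, End: dec.InputOffset()})
	}
	return spans, nil
}

func main() {
	data := []byte(`[{"a":1}, {"b":2}]`)
	spans, err := elementSpans(data)
	if err != nil {
		panic(err)
	}
	for _, s := range spans {
		fmt.Printf("%03d\t%03d\t%s\n", s.Start, s.End, data[s.Start:s.End])
	}
}
```

With offsets like these, a value can be sliced out of the original input without re-encoding it.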

Options

Opt | Description
--- | ---
-d <n> | emit values at depth n. if n < 0, all values will be emitted
-v | output depth and offset details for each value
-h | display help dialog

Benchmarks

Obligatory benchmarks performed on files with arrays of objects, where the decoded objects are to be extracted.

Two file sizes are used: regular (1.6MB, 1000 objects) and large (128MB, 100000 objects).

input size | lib | MB/s | Allocated
--- | --- | --- | ---
regular | standard | 97 | 3.6MB
regular | jstream | 175 | 2.1MB
large | standard | 92 | 305MB
large | jstream | 404 | 69MB

In a real-world scenario, including initialization and reader overhead from varying blob sizes, performance can be expected as below:

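Key metrics

Overview
Name and owner: bcicen/jstream
Primary language: Go
Languages: Go (1 language)
License: MIT License

Owner activity
Created: 2018-06-25 19:39:12
Last push: 2023-10-23 07:21:37
Last commit: 2020-03-30 16:12:42
Releases: 2
Latest release: v1.0.1 (published 2020-08-13 11:52:00)
First release: v1.0.0

User engagement
Stars: 586
Watchers: 15
Forks: 38
Commits: 41
Issues: 4
Open issues: 2
Pull requests: 6
Open pull requests: 2
Closed pull requests: 3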