jstream

Streaming JSON parser for Go

jstream is a streaming JSON parser and value extraction library for Go.

Unlike most JSON parsers, jstream is aware of document position and depth. This enables extracting values at a specified depth without the overhead of allocating the encompassing arrays or objects, e.g.:

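Using the example document below:

[
  {
    "desc": "RGB",
    "colors": ["red", "green", "blue"]
  },
  {
    "desc": "CMYK",
    "colors": ["cyan", "magenta", "yellow", "black"]
  }
]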

we can choose to extract and act only on the objects within the top-level array:

f, err := os.Open("input.json")
if err != nil {
  log.Fatal(err) // stop early if the input file cannot be opened
}
defer f.Close()

decoder := jstream.NewDecoder(f, 1) // extract JSON values at a depth level of 1
for mv := range decoder.Stream() {
  fmt.Printf("%v\n", mv.Value)
}

output:

map[desc:RGB colors:[red green blue]]
map[desc:CMYK colors:[cyan magenta yellow black]]

likewise, increasing depth level to 3 yields:

red
green
blue
cyan
magenta
yellow
black
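
To illustrate what extracting at a depth means, here is a minimal sketch that approximates the idea using only the token API of the standard library's encoding/json. It is not jstream's implementation. The helpers walk and valuesAtDepth are hypothetical names, and the input constant is an assumed document consistent with the output shown above.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// walk reads one JSON value from dec, which sits at nesting level cur.
// Values found exactly at level target are decoded whole and passed to emit;
// shallower containers are only traversed, never materialized.
func walk(dec *json.Decoder, cur, target int, emit func(any)) error {
	if cur == target {
		var v any
		if err := dec.Decode(&v); err != nil {
			return err
		}
		emit(v)
		return nil
	}
	tok, err := dec.Token()
	if err != nil {
		return err
	}
	delim, ok := tok.(json.Delim)
	if !ok {
		return nil // scalar above the target depth: nothing to emit
	}
	for dec.More() {
		if delim == '{' {
			if _, err := dec.Token(); err != nil { // consume the object key
				return err
			}
		}
		if err := walk(dec, cur+1, target, emit); err != nil {
			return err
		}
	}
	_, err = dec.Token() // consume the closing ']' or '}'
	return err
}

// valuesAtDepth collects every value at the given depth (0 = the whole document).
func valuesAtDepth(doc string, depth int) ([]any, error) {
	var out []any
	dec := json.NewDecoder(strings.NewReader(doc))
	err := walk(dec, 0, depth, func(v any) { out = append(out, v) })
	return out, err
}

// input is an assumed example document, not taken from a real file.
const input = `[
  {"desc": "RGB", "colors": ["red", "green", "blue"]},
  {"desc": "CMYK", "colors": ["cyan", "magenta", "yellow", "black"]}
]`

func main() {
	vals, err := valuesAtDepth(input, 3)
	if err != nil {
		panic(err)
	}
	for _, v := range vals {
		fmt.Println(v)
	}
}
```

Run as written, the sketch prints the seven color names, one per line; passing 1 instead of 3 collects the two top-level objects instead.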

optionally, key:value pairs can be emitted as an individual struct:

decoder := jstream.NewDecoder(f, 2).EmitKV() // enable KV streaming at a depth level of 2

output:

jstream.KV{desc RGB}
jstream.KV{colors [red green blue]}
jstream.KV{desc CMYK}
jstream.KV{colors [cyan magenta yellow black]}
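
The idea of emitting each object member as its own pair can likewise be sketched with the standard library alone. This is only an illustration under stated assumptions, not how jstream does it: the kv type and the pairsAtDepthTwo helper are hypothetical, and the function assumes the document is a top-level array of objects.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// kv is a hypothetical stand-in for a key:value pair; it is not jstream's KV type.
type kv struct {
	Key   string
	Value any
}

// expect consumes the next token and checks that it is the given delimiter.
func expect(dec *json.Decoder, want json.Delim) error {
	tok, err := dec.Token()
	if err != nil {
		return err
	}
	if tok != want {
		return fmt.Errorf("expected %v, got %v", want, tok)
	}
	return nil
}

// pairsAtDepthTwo assumes doc is a top-level array of objects and returns
// every member of those objects as a separate pair, in document order.
func pairsAtDepthTwo(doc string) ([]kv, error) {
	dec := json.NewDecoder(strings.NewReader(doc))
	if err := expect(dec, '['); err != nil {
		return nil, err
	}
	var out []kv
	for dec.More() {
		if err := expect(dec, '{'); err != nil {
			return nil, err
		}
		for dec.More() {
			keyTok, err := dec.Token() // object keys arrive as string tokens
			if err != nil {
				return nil, err
			}
			key, ok := keyTok.(string)
			if !ok {
				return nil, fmt.Errorf("expected object key, got %v", keyTok)
			}
			var val any
			if err := dec.Decode(&val); err != nil { // decode only this member's value
				return nil, err
			}
			out = append(out, kv{Key: key, Value: val})
		}
		if err := expect(dec, '}'); err != nil {
			return nil, err
		}
	}
	return out, expect(dec, ']')
}

// input is an assumed example document, not taken from a real file.
const input = `[
  {"desc": "RGB", "colors": ["red", "green", "blue"]},
  {"desc": "CMYK", "colors": ["cyan", "magenta", "yellow", "black"]}
]`

func main() {
	pairs, err := pairsAtDepthTwo(input)
	if err != nil {
		panic(err)
	}
	for _, p := range pairs {
		fmt.Printf("%v\n", p)
	}
}
```

Because the tokens are read in document order, the pairs come out in the order they appear in the input rather than in map iteration order.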

Installing

go get github.com/bcicen/jstream

Commandline

jstream comes with a CLI tool for quick viewing of parsed values from JSON input:

cat input.json | jstream -v -d 1
depth	start	end	type   | value
1	004	069	object | {"colors":["red","green","blue"],"desc":"RGB"}
1	073	153	object | {"colors":["cyan","magenta","yellow","black"],"desc":"CMYK"}

Options

Opt | Description
--- | ---
-d \<n> | emit values at depth n. If n < 0, all values will be emitted
-v | output depth and offset details for each value
-h | display help dialog

Benchmarks

Obligatory benchmarks performed on files with arrays of objects, where the decoded objects are to be extracted.

Two file sizes are used -- regular (1.6MB, 1000 objects) and large (128MB, 100000 objects).

input size | lib | MB/s | Allocated
--- | --- | --- | ---
regular | standard | 97 | 3.6MB
regular | jstream | 175 | 2.1MB
large | standard | 92 | 305MB
large | jstream | 404 | 69MB
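
For readers who want to measure something comparable themselves, the sketch below shows one way a throughput and allocation figure of this kind could be obtained with Go's testing package. It is not the benchmark used for the table above: it assumes "standard" refers to encoding/json, the makeArray and benchUnmarshal helpers are hypothetical, and the data is synthetic, so the numbers it prints will differ.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
	"testing"
)

// makeArray builds a synthetic JSON array of n small objects (made-up data).
func makeArray(n int) []byte {
	var sb strings.Builder
	sb.WriteByte('[')
	for i := 0; i < n; i++ {
		if i > 0 {
			sb.WriteByte(',')
		}
		fmt.Fprintf(&sb, `{"id":%d,"desc":"item","colors":["red","green","blue"]}`, i)
	}
	sb.WriteByte(']')
	return []byte(sb.String())
}

// benchUnmarshal measures decoding the whole array at once with encoding/json.
func benchUnmarshal(data []byte) testing.BenchmarkResult {
	return testing.Benchmark(func(b *testing.B) {
		b.SetBytes(int64(len(data))) // bytes processed per iteration, needed for MB/s
		b.ReportAllocs()
		for i := 0; i < b.N; i++ {
			var objs []map[string]any
			if err := json.Unmarshal(data, &objs); err != nil {
				b.Fatal(err)
			}
		}
	})
}

func main() {
	res := benchUnmarshal(makeArray(1000))
	// Throughput is total bytes processed divided by total elapsed time.
	mbPerSec := float64(res.Bytes) * float64(res.N) / 1e6 / res.T.Seconds()
	fmt.Printf("%.0f MB/s, %d B/op, %d allocs/op\n",
		mbPerSec, res.AllocedBytesPerOp(), res.AllocsPerOp())
}
```

A streaming decoder could be measured the same way by swapping the body of the loop, which keeps the input and the reporting identical across both candidates.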

In a real-world scenario, including initialization and reader overhead from varying blob sizes, performance can be expected as below:

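Key metrics

Overview
Name and owner: bcicen/jstream
Primary language: Go
Languages: Go (language count: 1)
License: MIT License
Owner activity
Created: 2018-06-25 19:39:12
Pushed: 2023-10-23 07:21:37
Last commit: 2020-03-30 16:12:42
Releases: 2
Latest release: v1.0.1 (released 2020-08-13 11:52:00)
First release: v1.0.0
User engagement
Stars: 585
Watchers: 15
Forks: 38
Commits: 41
Issues: 4
Open issues: 2
Pull requests: 6
Open pull requests: 2
Closed pull requests: 3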