jstream
is a streaming JSON parser and value extraction library for Go.
Unlike most JSON parsers, jstream
is document position- and depth-aware -- this enables the extraction of values at a specified depth, eliminating the overhead of allocating encompassing arrays or objects; e.g:
Using the below example document:
we can choose to extract and act only the objects within the top-level array:
f, _ := os.Open("input.json")
decoder := jstream.NewDecoder(f, 1) // extract JSON values at a depth level of 1
for mv := range decoder.Stream() {
fmt.Printf("%v\n ", mv.Value)
}
output:
map[desc:RGB colors:[red green blue]]
map[desc:CMYK colors:[cyan magenta yellow black]]
likewise, increasing depth level to 3
yields:
red
green
blue
cyan
magenta
yellow
black
optionally, kev:value pairs can be emitted as an individual struct:
decoder := jstream.NewDecoder(f, 2).EmitKV() // enable KV streaming at a depth level of 2
jstream.KV{desc RGB}
jstream.KV{colors [red green blue]}
jstream.KV{desc CMYK}
jstream.KV{colors [cyan magenta yellow black]}
Installing
go get github.com/bcicen/jstream
Commandline
jstream
comes with a cli tool for quick viewing of parsed values from JSON input:
cat input.json, jstream -v -d 1
depth start end type, value
1 004 069 object, {"colors":["red","green","blue"],"desc":"RGB"}
1 073 153 object, {"colors":["cyan","magenta","yellow","black"],"desc":"CMYK"}
Options
Opt, Description
---, ---
-d <n>, emit values at depth n. if n < 0, all values will be emitted
-v, output depth and offset details for each value
-h, display help dialog
Benchmarks
Obligatory benchmarks performed on files with arrays of objects, where the decoded objects are to be extracted.
Two file sizes are used -- regular (1.6mb, 1000 objects) and large (128mb, 100000 objects)
input size, lib, MB/s, Allocated
---, ---, ---, ---
regular, standard, 97, 3.6MB
regular, jstream, 175, 2.1MB
large, standard, 92, 305MB
large, jstream, 404, 69MB
In a real world scenario, including initialization and reader overhead from varying blob sizes, performance can be expected as below: