jstream

Streaming JSON parser for Go

jstream is a streaming JSON parser and value extraction library for Go.

Unlike most JSON parsers, jstream is document position- and depth-aware. This enables the extraction of values at a specified depth, eliminating the overhead of allocating the encompassing arrays or objects. For example:

Using the below example document:

we can choose to extract and act on only the objects within the top-level array:

f, _ := os.Open("input.json")
decoder := jstream.NewDecoder(f, 1) // extract JSON values at a depth level of 1
for mv := range decoder.Stream() {
  fmt.Printf("%v\n", mv.Value)
}

output:

map[desc:RGB colors:[red green blue]]
map[desc:CMYK colors:[cyan magenta yellow black]]

likewise, increasing depth level to 3 yields:

red
green
blue
cyan
magenta
yellow
black

optionally, key:value pairs can be emitted as an individual struct:

decoder := jstream.NewDecoder(f, 2).EmitKV() // enable KV streaming at a depth level of 2

output:

jstream.KV{desc RGB}
jstream.KV{colors [red green blue]}
jstream.KV{desc CMYK}
jstream.KV{colors [cyan magenta yellow black]}

Installing

go get github.com/bcicen/jstream

Commandline

jstream comes with a CLI tool for quick viewing of parsed values from JSON input:

cat input.json | jstream -v -d 1
depth	start	end	type   | value

1	004	069	object | {"colors":["red","green","blue"],"desc":"RGB"}
1	073	153	object | {"colors":["cyan","magenta","yellow","black"],"desc":"CMYK"}

Options

| Opt | Description |
| --- | --- |
| -d <n> | emit values at depth n. If n < 0, all values will be emitted |
| -v | output depth and offset details for each value |
| -h | display help dialog |

Benchmarks

Obligatory benchmarks performed on files with arrays of objects, where the decoded objects are to be extracted.

Two file sizes are used: regular (1.6 MB, 1000 objects) and large (128 MB, 100000 objects).

| Input size | Library | MB/s | Allocated |
| --- | --- | --- | --- |
| regular | standard | 97 | 3.6MB |
| regular | jstream | 175 | 2.1MB |
| large | standard | 92 | 305MB |
| large | jstream | 404 | 69MB |

In a real-world scenario, including initialization and reader overhead from varying blob sizes, performance can be expected as below:

Main metrics

Overview

Name with owner: bcicen/jstream
Primary language: Go (language count: 1)
License: MIT License

Owner activity

Created at: 2018-06-25 19:39:12
Pushed at: 2023-10-23 07:21:37
Last commit at: 2020-03-30 16:12:42
Release count: 2
Last release: v1.0.1 (posted on 2020-08-13 11:52:00)
First release: v1.0.0

User engagement

Stargazers: 585
Watchers: 15
Forks: 38
Commits: 41
Issues: 4
Open issues: 2
Pull requests: 6
Open pull requests: 2
Closed pull requests: 3