sajson

Lightweight, extremely high-performance JSON parser for C++11

  • Owner: chadaustin/sajson
  • Platform:
  • License:: MIT License
  • Category::
  • Topic:
  • Like:
    0
      Compare:

Github stars Tracking Chart

sajson

sajson is an extremely high-performance, in-place, DOM-style JSON parser written in C++.

Originally, sajson meant Single Allocation JSON, but it now supports dynamic allocation too.

Features

sajson parses an input document into a contiguous AST structure. Unlike some other high-performance JSON parsers, the AST is efficiently queryable. Object lookups by key are O(lg N) and array indexing is O(1).

sajson does not require that the input buffer is null-terminated. You can use it to parse straight out of a disk mmap or network buffer, for example.

sajson is in-situ: it modifies the input string. While parsing, string values are converted to UTF-8.

(Note: sajson pays a slight performance penalty for not requiring null termination of the input string. Because sajson is in-situ, many uses cases require copying the input data anyway. Therefore, I could be convinced to add an option for requiring null termination.)

Other Features

  • Single header file -- simply drop sajson.h into your project.
  • No exceptions, RTTI, or longjmp.
  • O(1) stack usage. No document will overflow the stack.
  • Only two number types: 32-bits and doubles.
  • Small code size -- suitable for Emscripten.
  • Has been fuzzed with American Fuzzy Lop.

AST Structure

The parsed AST's size is computed as such:

  • 2 words per string
  • 1 word per 32-bit integer value
  • 64 bits per floating point value
  • 1+N words per array, where N is the number of elements
  • 1+3N words per object, where N is the number of members

The values null, true, and false are encoded in tag bits and have no cost otherwise.

Allocation Modes

Single

The original sajson allocation mode allocates one word per byte of the input document. This is the fastest mode: because the AST and parse stack are guaranteed to fit, no allocation checks are required at runtime.

That is, on 32-bit platforms, sajson allocates 4 bytes per input character. On 64-bit platforms, sajson allocates 8 bytes per input character. Only use this parse mode if you can handle allocating the worst-case buffer size for your input documents.

Dynamic

The dynamic allocation mode grows the parse stack and AST buffer as needed. It's about 10-40% slower than single allocation because it needs to check for out-of-memory every time data is appended, and occasionally the buffers need to be reallocated and copied.

Bounded

The bounded allocation mode takes a fixed-size memory buffer and uses it for both
the parse stack and the resulting AST. If the parse stack and AST fit in the given
buffer, the parse succeeds. This allocation mode allows using sajson without
the library making any allocations.

Performance

sajson's performance is excellent - it frequently benchmarks faster than RapidJSON, for example.

Implementation details are available at http://chadaustin.me/tag/sajson/.

Documentation

API documentation is available at http://chadaustin.github.io/sajson/doxygen/

Downsides / Missing Features

  • sajson does not support UTF-16 or UTF-32. However, I have never seen one of those in the wild, so I suspect they may be a case of aggressive overspecification. Some JSON specifications indicate that UTF-8 is the only valid encoding. Either way, just transcode to UTF-8 first.

  • No support for 64-bit integers. If this is something you want, just ask. (There's little technical reason not to support it. It's just that most people avoid 64-bit integers in JSON because JavaScript can't read them.)

  • Requires C++11. Some of the ownership semantics were awkward to express in C++03.

Main metrics

Overview
Name With Ownerchadaustin/sajson
Primary LanguageC++
Program languagePython (Language Count: 7)
Platform
License:MIT License
所有者活动
Created At2012-12-30 19:21:41
Pushed At2025-04-06 17:40:49
Last Commit At2025-04-06 10:40:49
Release Count0
用户参与
Stargazers Count569
Watchers Count28
Fork Count44
Commits Count275
Has Issues Enabled
Issues Count32
Issue Open Count8
Pull Requests Count22
Pull Requests Open Count1
Pull Requests Close Count2
项目设置
Has Wiki Enabled
Is Archived
Is Fork
Is Locked
Is Mirror
Is Private