flatson

A tool to flatten a stream of JSON-like objects, configured via a schema.

* Owner: scrapinghub/flatson
* License: BSD 3-Clause "New" or "Revised" License


===============================
Flatson
===============================

.. image:: https://img.shields.io/travis/scrapinghub/flatson.svg
   :target: https://travis-ci.org/scrapinghub/flatson

.. image:: https://img.shields.io/pypi/v/flatson.svg
   :target: https://pypi.python.org/pypi/flatson

Flatson emerged at Scrapinghub_ from the need to export huge JSON-like datasets into flat, CSV-like tables. It is particularly well suited to very large datasets, because it does not load all the data into memory at once.

.. _Scrapinghub: http://scrapinghub.com

Features
--------

* Flattens Python dictionaries using a JSON schema
* Supports per-field configuration via the schema

Usage::

    >>> from flatson import Flatson
    >>> schema = {
            "$schema": "http://json-schema.org/draft-04/schema",
            "type": "object",
            "properties": {
                "name": {"type": "string"},
                "age": {"type": "number"},
                "address": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}, "street": {"type": "string"}}
                },
                "skills": {"type": "array", "items": {"type": "string"}}
            }
        }
    >>> sample = {
            "name": "Claudio", "age": 42,
            "address": {"city": "Paris", "street": "Rue de Sevres"},
            "skills": ["hacking", "soccer"]}
    >>> f = Flatson(schema)
    >>> f.fieldnames
    ['address.city', 'address.street', 'age', 'name', 'skills']
    >>> f.flatten(sample)
    ['Paris', 'Rue de Sevres', 42, 'Claudio', '["hacking","soccer"]']
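To see where names such as ``address.city`` can come from, the sketch below walks a schema's ``properties`` recursively and joins nested keys with a dot. It is an illustration only, not Flatson's own implementation: the helper name ``dotted_fieldnames`` is hypothetical, and the sorted ordering is an assumption chosen to match the ``fieldnames`` output shown above.

```python
def dotted_fieldnames(schema, prefix=""):
    """Return sorted dotted paths for every non-object leaf in a JSON schema.

    Hypothetical helper for illustration; not part of Flatson's API.
    """
    names = []
    for key, subschema in schema.get("properties", {}).items():
        path = prefix + key
        if subschema.get("type") == "object":
            # Descend into nested objects, extending the dotted prefix.
            names.extend(dotted_fieldnames(subschema, path + "."))
        else:
            # Strings, numbers and arrays each become a single column.
            names.append(path)
    return sorted(names)


schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "number"},
        "address": {
            "type": "object",
            "properties": {"city": {"type": "string"}, "street": {"type": "string"}},
        },
        "skills": {"type": "array", "items": {"type": "string"}},
    },
}

print(dotted_fieldnames(schema))
# ['address.city', 'address.street', 'age', 'name', 'skills']
```

Arrays are treated as leaves here, which is why ``skills`` ends up as a single column rather than one column per element.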

You can also get a dict that preserves the order of the field names::

    >>> f.flatten_dict(sample)
    OrderedDict([('address.city', 'Paris'), ('address.street', 'Rue de Sevres'), ('age', 42), ('name', 'Claudio'), ('skills', '["hacking","soccer"]')])
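Because each call handles a single object, the flattened rows can be written to CSV one at a time. The following is a minimal sketch of that pattern using the standard ``csv`` module. The function name ``export_csv`` is hypothetical, and it assumes only that the object passed in exposes ``fieldnames`` and ``flatten_dict`` as the ``Flatson`` instance does in the examples above.

```python
import csv


def export_csv(flattener, records, out):
    """Write one CSV row per record, consuming `records` lazily.

    `flattener` is assumed to expose `fieldnames` and `flatten_dict`,
    like the Flatson instance `f` in the examples above.
    Returns the number of rows written (excluding the header).
    """
    writer = csv.DictWriter(out, fieldnames=flattener.fieldnames)
    writer.writeheader()
    count = 0
    for record in records:  # any iterable, e.g. a generator reading a file
        writer.writerow(flattener.flatten_dict(record))
        count += 1
    return count
```

With ``f`` and ``sample`` defined as above, ``export_csv(f, [sample], sys.stdout)`` would be expected to print a header line followed by one data row. Passing a generator instead of a list keeps only one record in memory at a time.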

You can also configure how arrays are serialized through the schema (the default is JSON)::

    >>> schema = {
            "$schema": "http://json-schema.org/draft-04/schema",
            "type": "object",
            "properties": {
                "name": {"type": "string"},
                "skills": {
                    "type": "array",
                    "items": {"type": "string"},
                    "flatson_serialize": {"method": "join_values"},
                }
            }
        }
    >>> f = Flatson(schema)
    >>> f.flatten({"name": "Salazar", "skills": ["hacking", "soccer", "partying"]})
    ['Salazar', 'hacking,soccer,partying']
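The difference between the two array outputs can be reproduced with the standard library alone. The snippet below is not Flatson's internal code. It only illustrates the two output shapes shown above, and the compact ``separators`` argument is an assumption made to match the default output in the first example.

```python
import json

skills = ["hacking", "soccer"]

# Compact JSON text, the same shape as the default array output.
as_json = json.dumps(skills, separators=(",", ":"))

# Plain comma-joined values, the same shape as the join_values output.
as_joined = ",".join(skills)

print(as_json)    # ["hacking","soccer"]
print(as_joined)  # hacking,soccer
```

The JSON form can be parsed back into a list without ambiguity, while the joined form is easier to read in a spreadsheet but loses information when a value itself contains a comma.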

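Key metrics

Overview

* Name and owner: scrapinghub/flatson
* Primary language: Python
* Languages: Makefile (language count: 2)
* License: BSD 3-Clause "New" or "Revised" License

Owner activity

* Created: 2015-07-10 23:59:51
* Last pushed: 2019-10-19 12:03:08
* Last commit: 2016-06-09 10:19:56
* Releases: 1
* Latest release: v0.1.0
* First release: v0.1.0

User engagement

* Stars: 33
* Watchers: 111
* Forks: 6
* Commits: 36
* Issues: 4
* Open issues: 4
* Pull requests: 4
* Open pull requests: 4
* Closed pull requests: 2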