flatson

A tool to flatten a stream of JSON-like objects, configured via a schema.

  • Owner: scrapinghub/flatson
  • License: BSD 3-Clause "New" or "Revised" License

===============================
Flatson
===============================

.. image:: https://img.shields.io/travis/scrapinghub/flatson.svg
    :target: https://travis-ci.org/scrapinghub/flatson

.. image:: https://img.shields.io/pypi/v/flatson.svg
    :target: https://pypi.python.org/pypi/flatson

Flatson emerged at Scrapinghub_ from the need to export huge JSON-like datasets into flat, CSV-like tables. It is particularly useful for very large datasets because it does not load all the data into memory at once.

.. _Scrapinghub: http://scrapinghub.com

Features
--------

  • Flattens Python dictionaries using a JSON schema
  • Supports per-field configuration via the schema

Usage::

    >>> from flatson import Flatson
    >>> schema = {
    ...     "$schema": "http://json-schema.org/draft-04/schema",
    ...     "type": "object",
    ...     "properties": {
    ...         "name": {"type": "string"},
    ...         "age": {"type": "number"},
    ...         "address": {
    ...             "type": "object",
    ...             "properties": {"city": {"type": "string"}, "street": {"type": "string"}}
    ...         },
    ...         "skills": {"type": "array", "items": {"type": "string"}}
    ...     }
    ... }
    >>> sample = {
    ...     "name": "Claudio", "age": 42,
    ...     "address": {"city": "Paris", "street": "Rue de Sevres"},
    ...     "skills": ["hacking", "soccer"]}
    >>> f = Flatson(schema)
    >>> f.fieldnames
    ['address.city', 'address.street', 'age', 'name', 'skills']
    >>> f.flatten(sample)
    ['Paris', 'Rue de Sevres', 42, 'Claudio', '["hacking","soccer"]']
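
To make the mapping from schema to columns concrete, here is a minimal sketch of how dotted field names and a flat row could be derived from a schema of this shape. It is an illustration only, not Flatson's actual implementation: the helper names ``infer_fieldnames`` and ``flatten_record`` are hypothetical, and the sketch assumes property names contain no dots and that only ``"type": "object"`` properties are expanded.

```python
import json


def infer_fieldnames(schema, prefix=""):
    """Collect a dotted name for every leaf property, sorted by key."""
    names = []
    for key, subschema in sorted(schema.get("properties", {}).items()):
        path = prefix + key
        if subschema.get("type") == "object":
            # A nested object contributes one column per nested leaf.
            names.extend(infer_fieldnames(subschema, prefix=path + "."))
        else:
            # Scalars and arrays each map to a single column.
            names.append(path)
    return names


def flatten_record(schema, record):
    """Return the record's values in the same order as infer_fieldnames()."""
    row = []
    for name in infer_fieldnames(schema):
        value = record
        for part in name.split("."):
            # Missing keys (or non-dict parents) yield None for that column.
            value = value.get(part) if isinstance(value, dict) else None
        if isinstance(value, (list, dict)):
            # Containers become a compact JSON string so they fit in one cell.
            value = json.dumps(value, separators=(",", ":"))
        row.append(value)
    return row
```

Called with the ``schema`` and ``sample`` defined above, these helpers produce the same field names and row as the session shown.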

You can get a dict with the field names order preserved::

    >>> f.flatten_dict(sample)
    OrderedDict([('address.city', 'Paris'), ('address.street', 'Rue de Sevres'), ('age', 42), ('name', 'Claudio'), ('skills', '["hacking","soccer"]')])
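
Since ``fieldnames`` and ``flatten`` give a header and one row per object, they combine naturally with the standard ``csv`` module for the CSV export use case. The following is a minimal sketch: ``write_csv`` is a hypothetical helper, not part of Flatson, and it only assumes that the object passed as ``flattener`` exposes ``fieldnames`` and ``flatten(record)`` as shown above.

```python
import csv


def write_csv(flattener, records, out):
    """Write a header row, then one flattened row per record."""
    writer = csv.writer(out)
    writer.writerow(flattener.fieldnames)
    for record in records:
        # Records are consumed one at a time, so `records` can be a
        # generator and the full dataset never has to sit in memory.
        writer.writerow(flattener.flatten(record))
```

For example, with a JSON-lines input file one might call ``write_csv(Flatson(schema), (json.loads(line) for line in infile), outfile)``, where ``infile`` and ``outfile`` are open file objects.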

You can also configure how arrays are serialized through the schema (by default, they are serialized as JSON)::

    >>> schema = {
    ...     "$schema": "http://json-schema.org/draft-04/schema",
    ...     "type": "object",
    ...     "properties": {
    ...         "name": {"type": "string"},
    ...         "skills": {
    ...             "type": "array",
    ...             "items": {"type": "string"},
    ...             "flatson_serialize": {"method": "join_values"},
    ...         }
    ...     }
    ... }
    >>> f = Flatson(schema)
    >>> f.flatten({"name": "Salazar", "skills": ["hacking", "soccer", "partying"]})
    ['Salazar', 'hacking,soccer,partying']
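
The difference between the two array outputs shown in this document can be sketched in plain Python. This is an illustration under assumptions, not Flatson's code: ``serialize_array`` is a hypothetical helper, and its ``separator`` parameter exists only in this sketch.

```python
import json


def serialize_array(values, method="json", separator=","):
    """Turn a list into a single cell value (hypothetical helper)."""
    if method == "join_values":
        # One delimited string; ambiguous if an item contains the separator.
        return separator.join(str(v) for v in values)
    # Default: compact JSON, which can be parsed back into the original list.
    return json.dumps(values, separators=(",", ":"))
```

The trade-off is readability versus round-tripping: the joined form is easier to read in a spreadsheet, while the JSON form can be decoded with ``json.loads`` even when items contain commas.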
