scrapy-monkeylearn
==================

A Scrapy_ pipeline to categorize items using MonkeyLearn_.

Settings
--------

MONKEYLEARN_BATCH_SIZE


The number of items sent to MonkeyLearn in each batch.

Default: ``200``

Example:

.. code-block:: python

   MONKEYLEARN_BATCH_SIZE = 200
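
To illustrate what the batch size controls, here is a minimal sketch of splitting a stream of items into batches. The helper ``iter_batches`` is hypothetical and not part of scrapy-monkeylearn; the pipeline's actual batching code may differ.

.. code-block:: python

    from itertools import islice


    def iter_batches(items, batch_size=200):
        """Yield successive lists holding at most ``batch_size`` items."""
        iterator = iter(items)
        while True:
            batch = list(islice(iterator, batch_size))
            if not batch:
                return
            yield batch


    # 450 items with a batch size of 200 give two full batches and a partial one.
    batches = list(iter_batches(range(450), batch_size=200))
    batch_sizes = [len(batch) for batch in batches]

In a scheme like this, a larger batch size means fewer requests for the same number of items, with each request carrying more of them.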

MONKEYLEARN_MODULE
~~~~~~~~~~~~~~~~~~

The ID of the MonkeyLearn module to use.

Example:

.. code-block:: python

    MONKEYLEARN_MODULE = 'cl_oFKL5wft'

MONKEYLEARN_USE_SANDBOX
~~~~~~~~~~~~~~~~~~~~~~~

Whether to use the sandbox version of the module. This applies only when the module is a classifier.

Default: ``False``

Example:

.. code-block:: python

    MONKEYLEARN_USE_SANDBOX = True

MONKEYLEARN_TOKEN
~~~~~~~~~~~~~~~~~

Your MonkeyLearn authentication token.

Example:

.. code-block:: python

    MONKEYLEARN_TOKEN = 'TWFuIGlzIGRp...'

MONKEYLEARN_FIELD_TO_PROCESS
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The item text field, or list of text fields, to use for classification.
A comma-separated string of field names is also supported.

Examples:

.. code-block:: python

    MONKEYLEARN_FIELD_TO_PROCESS = 'title'

.. code-block:: python

    MONKEYLEARN_FIELD_TO_PROCESS = ['title', 'description']

.. code-block:: python

    MONKEYLEARN_FIELD_TO_PROCESS = 'title,description'
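
The three accepted forms can all be reduced to one list of field names. The following sketch shows one way to do that; ``normalize_fields`` is a hypothetical helper written for illustration and is not taken from the scrapy-monkeylearn source.

.. code-block:: python

    def normalize_fields(value):
        """Return a list of field names from a string or list setting."""
        if isinstance(value, str):
            # A plain string may hold one name or several separated by commas.
            return [name.strip() for name in value.split(',') if name.strip()]
        # Lists and tuples are copied into a new list unchanged.
        return list(value)


    single = normalize_fields('title')
    from_list = normalize_fields(['title', 'description'])
    from_string = normalize_fields('title,description')

Under these assumptions, the list form and the comma-separated form yield the same result.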

MONKEYLEARN_FIELD_OUTPUT


The field where the MonkeyLearn output will be stored.

Example:

.. code-block:: python

    MONKEYLEARN_FIELD_OUTPUT = 'categories'


An example value of the ``MONKEYLEARN_FIELD_OUTPUT`` field after classification is:

.. code-block:: python

    [{'label': 'English', 'probability': 0.321}]
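
Later pipelines or exporters can read this list straight from the item. As a sketch, assuming the output always has the shape shown above (a list of dictionaries with ``label`` and ``probability`` keys), the hypothetical helper ``top_label`` picks the most probable label:

.. code-block:: python

    def top_label(categories):
        """Return the label with the highest probability, or None if empty."""
        if not categories:
            return None
        best = max(categories, key=lambda category: category['probability'])
        return best['label']


    # The value below is the example output shown above.
    categories = [{'label': 'English', 'probability': 0.321}]
    best_label = top_label(categories)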

Usage
-----

In your *settings.py* file, add the settings described above and add ``MonkeyLearnPipeline`` to your item pipelines, for example:

.. code-block:: python

    ITEM_PIPELINES = {
        'scrapy_monkeylearn.pipelines.MonkeyLearnPipeline': 100,
    }
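
Putting it together, a *settings.py* fragment might look like the sketch below. The values are the example values from the sections above, not recommendations; the token is the truncated placeholder from the example and must be replaced with your own.

.. code-block:: python

    # settings.py (fragment): example values only.
    ITEM_PIPELINES = {
        'scrapy_monkeylearn.pipelines.MonkeyLearnPipeline': 100,
    }

    MONKEYLEARN_TOKEN = 'TWFuIGlzIGRp...'  # placeholder; use your own token
    MONKEYLEARN_MODULE = 'cl_oFKL5wft'     # example module ID from above
    MONKEYLEARN_BATCH_SIZE = 200
    MONKEYLEARN_USE_SANDBOX = False
    MONKEYLEARN_FIELD_TO_PROCESS = ['title', 'description']
    MONKEYLEARN_FIELD_OUTPUT = 'categories'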

License
-------

Copyright (c) 2015 `MonkeyLearn`_.

Released under the MIT license.

.. _Scrapy: http://scrapy.org/
.. _MonkeyLearn: http://www.monkeylearn.com/

Overview
--------

- Name and owner: scrapy-plugins/scrapy-monkeylearn
- Primary language: Python
- Languages: Python (1 language)
- Releases: 0
- Created: 2015-03-03 15:00:34
- Last pushed: 2017-04-28 12:51:05
- Last commit: 2017-04-28 13:51:02
- Stars: 37
- Watchers: 6
- Forks: 13
- Commits: 91
- Issues: 3 (0 open)
- Pull requests: 9 (0 open, 2 closed)