scrapy-monkeylearn
==================

A Scrapy_ pipeline to categorize items using MonkeyLearn_.

Settings
--------

MONKEYLEARN_BATCH_SIZE


The size of the item batches sent to MonkeyLearn.

Default: ``200``

Example:

.. code-block:: python

   MONKEYLEARN_BATCH_SIZE = 200
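
As a rough illustration of what the batch size controls, the sketch below groups items into lists of at most ``MONKEYLEARN_BATCH_SIZE`` elements. The ``batched`` helper is hypothetical and written only for this example; it is not taken from the pipeline's code, which may batch items differently.

.. code-block:: python

    from itertools import islice

    MONKEYLEARN_BATCH_SIZE = 200


    def batched(items, size):
        """Yield successive lists of at most ``size`` items (hypothetical helper)."""
        iterator = iter(items)
        while True:
            batch = list(islice(iterator, size))
            if not batch:
                return
            yield batch


    # 450 items are split into batches of 200, 200 and 50.
    batch_lengths = [len(b) for b in batched(range(450), MONKEYLEARN_BATCH_SIZE)]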

MONKEYLEARN_MODULE
~~~~~~~~~~~~~~~~~~

The ID of the MonkeyLearn module.

Example:

.. code-block:: python

    MONKEYLEARN_MODULE = 'cl_oFKL5wft'

MONKEYLEARN_USE_SANDBOX
~~~~~~~~~~~~~~~~~~~~~~~

When the module is a classifier, whether its sandbox version should be used.

Default: ``False``

Example:

.. code-block:: python

    MONKEYLEARN_USE_SANDBOX = True

MONKEYLEARN_TOKEN
~~~~~~~~~~~~~~~~~

The MonkeyLearn authentication token.

Example:

.. code-block:: python

    MONKEYLEARN_TOKEN = 'TWFuIGlzIGRp...'

MONKEYLEARN_FIELD_TO_PROCESS
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The Item text field, or list of fields, to use for classification.
A comma-separated string of field names is also supported.

Example:

.. code-block:: python

    MONKEYLEARN_FIELD_TO_PROCESS = 'title'

.. code-block:: python

    MONKEYLEARN_FIELD_TO_PROCESS = ['title', 'description']

.. code-block:: python

    MONKEYLEARN_FIELD_TO_PROCESS = 'title,description'
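
The three accepted forms can be thought of as equivalent ways of writing a list of field names. The sketch below shows one way such values could be normalized; ``normalize_fields`` is a hypothetical helper for illustration and is not part of the scrapy-monkeylearn API.

.. code-block:: python

    def normalize_fields(value):
        """Return a list of field names from any of the three accepted forms.

        Hypothetical helper for illustration only.
        """
        if isinstance(value, str):
            # 'title' -> ['title']; 'title,description' -> ['title', 'description']
            return [name.strip() for name in value.split(',') if name.strip()]
        # Lists (or other iterables) of names are copied as-is.
        return list(value)


    single = normalize_fields('title')
    as_list = normalize_fields(['title', 'description'])
    as_csv = normalize_fields('title,description')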

MONKEYLEARN_FIELD_OUTPUT


The field where the MonkeyLearn output will be stored.

Example:

.. code-block:: python

    MONKEYLEARN_FIELD_OUTPUT = 'categories'


An example value of the item field named by ``MONKEYLEARN_FIELD_OUTPUT`` after classification is:

.. code-block:: python

    [{'label': 'English', 'probability': 0.321}]
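
Since the output is a list of label and probability pairs, later code can pick the most likely label from it. The sketch below assumes the output field is called ``categories``, as in the example above. The ``top_label`` helper, the ``min_probability`` parameter, and the second sample entry are invented for illustration.

.. code-block:: python

    # Sample item; the 'Spanish' entry is made-up data for this example.
    item = {
        'title': 'Hello world',
        'categories': [
            {'label': 'English', 'probability': 0.321},
            {'label': 'Spanish', 'probability': 0.105},
        ],
    }


    def top_label(categories, min_probability=0.0):
        """Return the most probable label, or None if none qualifies (hypothetical)."""
        candidates = [c for c in categories if c['probability'] >= min_probability]
        if not candidates:
            return None
        return max(candidates, key=lambda c: c['probability'])['label']


    best = top_label(item['categories'])
    confident = top_label(item['categories'], min_probability=0.5)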

Usage
-----

In your *settings.py* file, add the settings described above and add ``MonkeyLearnPipeline`` to ``ITEM_PIPELINES``, e.g.:

.. code-block:: python

    ITEM_PIPELINES = {
        'scrapy_monkeylearn.pipelines.MonkeyLearnPipeline': 100,
    }
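
Put together, a *settings.py* fragment might look like the sketch below. The values are the examples from the Settings section. Reading the token from an environment variable named ``MONKEYLEARN_TOKEN`` is an assumption made here to keep the secret out of the file, not something the pipeline requires.

.. code-block:: python

    import os

    ITEM_PIPELINES = {
        'scrapy_monkeylearn.pipelines.MonkeyLearnPipeline': 100,
    }

    MONKEYLEARN_BATCH_SIZE = 200
    MONKEYLEARN_MODULE = 'cl_oFKL5wft'
    MONKEYLEARN_USE_SANDBOX = False
    # Assumption: the token is supplied through an environment variable.
    MONKEYLEARN_TOKEN = os.environ.get('MONKEYLEARN_TOKEN', '')
    MONKEYLEARN_FIELD_TO_PROCESS = ['title', 'description']
    MONKEYLEARN_FIELD_OUTPUT = 'categories'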

License
-------

Copyright (c) 2015 `MonkeyLearn`_.

Released under the MIT license.

.. _Scrapy: http://scrapy.org/
.. _MonkeyLearn: http://www.monkeylearn.com/

Overview
--------

:Name with owner: scrapy-plugins/scrapy-monkeylearn
:Primary language: Python
:Program language: Python (language count: 1)
:License: MIT License
:Release count: 0
:Created at: 2015-03-03 15:00:34
:Pushed at: 2017-04-28 12:51:05
:Last commit at: 2017-04-28 13:51:02
:Stargazers count: 37
:Watchers count: 6
:Fork count: 13
:Commits count: 91
:Issues count: 3
:Issue open count: 0
:Pull requests count: 9
:Pull requests open count: 0
:Pull requests close count: 2