==============
Scrapyd-client
==============

.. image:: https://secure.travis-ci.org/scrapy/scrapyd-client.png?branch=master
   :target: http://travis-ci.org/scrapy/scrapyd-client

Scrapyd-client is a client for Scrapyd_. It provides the general-purpose ``scrapyd-client``
command-line tool and the ``scrapyd-deploy`` utility, which allows you to deploy your project to a
Scrapyd server.

.. _Scrapyd: https://scrapyd.readthedocs.io

scrapyd-client
--------------

For a reference on each subcommand, invoke ``scrapyd-client <subcommand> --help``.

Where filtering with wildcards is possible, it is facilitated with fnmatch_.
The ``--project`` option can be omitted if a project is found in a ``scrapy.cfg``.

.. _fnmatch: https://docs.python.org/library/fnmatch.html
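For a sense of how such patterns match, here is a quick illustration with the standard library's
``fnmatch`` function (the project and spider names are made up)::

    from fnmatch import fnmatch

    # '*' matches anything, '?' a single character, '[seq]' any character in seq
    fnmatch("sales_daily", "*_daily")    # True
    fnmatch("sales_weekly", "*_daily")   # False
    fnmatch("spider1", "spider?")        # True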

deploy
~~~~~~


At the moment this is a wrapper around `scrapyd-deploy`_. Note that its command-line options are
likely to change.

projects
~~~~~~~~

Lists all projects of a Scrapyd instance::

   # lists all projects on the default target
   scrapyd-client projects
   # lists all projects from a custom URL
   scrapyd-client -t http://scrapyd.example.net projects

schedule
~~~~~~~~


Schedules one or more spiders to be executed::

   # schedules any spider
   scrapyd-client schedule
   # schedules all spiders from the 'knowledge' project
   scrapyd-client schedule -p knowledge \*
   # schedules any spider from any project whose name ends with '_daily'
   scrapyd-client schedule -p \* \*_daily

spiders
~~~~~~~

Lists spiders of one or more projects::

   # lists all spiders
   scrapyd-client spiders
   # lists all spiders from the 'knowledge' project
   scrapyd-client spiders -p knowledge


scrapyd-deploy
--------------

How It Works
~~~~~~~~~~~~

Deploying your project to a Scrapyd server typically involves two steps:

  1. Eggifying_ your project. You'll need to install setuptools_ for this. See `Egg Caveats`_ below.
  2. Uploading the egg to the Scrapyd server through the addversion.json_ endpoint.

The scrapyd-deploy tool automates the process of building the egg and pushing it to the target
Scrapyd server.

.. _addversion.json: https://scrapyd.readthedocs.org/en/latest/api.html#addversion-json
.. _Eggifying: http://peak.telecommunity.com/DevCenter/PythonEggs
.. _setuptools: https://pypi.python.org/pypi/setuptools
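For reference, the manual equivalent of those two steps looks roughly like this (a sketch only;
the project name ``myproject``, the version and the egg path are placeholders, and
``scrapyd-deploy`` performs all of this for you)::

    # step 1: build the egg with setuptools
    python setup.py bdist_egg

    # step 2: upload it through the addversion.json endpoint
    curl http://localhost:6800/addversion.json \
         -F project=myproject -F version=1.0 \
         -F egg=@dist/myproject-1.0-py3.8.egg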

Deploying a Project
~~~~~~~~~~~~~~~~~~~


First ``cd`` into your project's root; you can then deploy your project with the following::

    scrapyd-deploy <target> -p <project>

This will eggify your project and upload it to the target. If you have a ``setup.py`` file in your
project, it will be used; otherwise one will be created automatically.
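If you do write your own ``setup.py``, the part that matters to Scrapyd is the ``scrapy`` entry
point pointing at your settings module. A sketch of what such a file typically looks like (the
package name ``myproject`` is a placeholder)::

    from setuptools import setup, find_packages

    setup(
        name='myproject',
        version='1.0',
        packages=find_packages(),
        # Scrapyd uses this entry point to locate the project settings inside the egg
        entry_points={'scrapy': ['settings = myproject.settings']},
    )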

If successful you should see a JSON response similar to the following::

    Deploying myproject-1287453519 to http://localhost:6800/addversion.json
    Server response (200):
    {"status": "ok", "spiders": ["spider1", "spider2"]}

To save yourself from having to specify the target and project, you can set the defaults in the
``scrapy.cfg`` file::

    [deploy]
    url = http://scrapyd.example.com/api/scrapyd
    username = scrapy
    password = secret
    project = yourproject


You can now deploy your project with just the following::

    scrapyd-deploy

If you have more than one target to deploy to, you can deploy your project to all targets with one
command::

    scrapyd-deploy -a -p <project>
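Multiple targets are simply separate ``[deploy:<name>]`` sections in ``scrapy.cfg`` (see Targets_
below); a sketch with two hypothetical targets::

    [deploy:staging]
    url = http://staging.example.com:6800/
    project = yourproject

    [deploy:production]
    url = http://scrapyd.example.com:6800/
    project = yourproject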

Versioning
~~~~~~~~~~

By default, ``scrapyd-deploy`` uses the current timestamp for generating the project version, as
shown above. However, you can pass a custom version using ``--version``::

    scrapyd-deploy <target> -p <project> --version <version>

Or for all targets::

    scrapyd-deploy -a -p <project> --version <version>

The version must be comparable with LooseVersion_. Scrapyd will use the greatest version unless one
is specified explicitly.
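To see what "comparable with LooseVersion" means in practice, a quick sketch::

    from distutils.version import LooseVersion

    # dotted versions and plain timestamps both order as expected
    LooseVersion("1.10") > LooseVersion("1.9")                 # True (not a string comparison)
    LooseVersion("1287453519") < LooseVersion("1287453520")    # True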

If you use Mercurial or Git, you can use ``HG`` or ``GIT`` respectively as the argument supplied to
``--version`` to use the current revision as the version. You can save yourself having to specify
the version parameter by adding it to your target's entry in ``scrapy.cfg``::

    [deploy:target]
    ...
    version = HG

.. _LooseVersion: http://epydoc.sourceforge.net/stdlib/distutils.version.LooseVersion-class.html

Local Settings
~~~~~~~~~~~~~~

You may want to keep certain settings local and not have them deployed to Scrapyd. To accomplish
this you can create a ``local_settings.py`` file at the root of your project, where your
``scrapy.cfg`` file resides, and add the following to your project's settings::

    try:
        from local_settings import *
    except ImportError:
        pass

``scrapyd-deploy`` doesn't deploy anything outside of the project module, so the
``local_settings.py`` file won't be deployed.
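For illustration, such a file might hold development-only overrides; the setting names below are
standard Scrapy settings, and the values are just examples::

    # local_settings.py -- kept next to scrapy.cfg, never deployed
    LOG_LEVEL = "DEBUG"
    HTTPCACHE_ENABLED = True
    DOWNLOAD_DELAY = 0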

Egg Caveats
~~~~~~~~~~~

Some things to keep in mind when building eggs for your Scrapy project:

* Make sure no local development settings are included in the egg when you build it. The
  ``find_packages`` function may be picking up your custom settings. In most cases you want to
  upload the egg with the default project settings.
* You should avoid using ``__file__`` in your project code as it doesn't play well with eggs.
  Consider using `pkgutil.get_data`_ instead (see the sketch below this list).
* Be careful when writing to disk in your project, as Scrapyd will most likely be running under a
  different user which may not have write access to certain directories. If you can, avoid writing
  to disk and always use tempfile_ for temporary files.

.. _pkgutil.get_data: http://docs.python.org/library/pkgutil.html#pkgutil.get_data
.. _tempfile: http://docs.python.org/library/tempfile.html
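A combined sketch of the last two points, assuming a package named ``myproject`` that bundles a
``resources/rules.txt`` data file (both names are hypothetical)::

    import pkgutil
    import tempfile

    # read a data file bundled inside the package -- works from an egg,
    # unlike paths built from __file__
    rules = pkgutil.get_data("myproject", "resources/rules.txt")

    # write temporary output somewhere the Scrapyd user can always write to
    with tempfile.NamedTemporaryFile(suffix=".json", delete=False) as tmp:
        tmp.write(b"{}")
        print("wrote", tmp.name)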


Global settings
---------------

Targets
~~~~~~~

You can define Scrapyd targets in your project's ``scrapy.cfg`` file. Example::

    [deploy:example]
    url = http://scrapyd.example.com/api/scrapyd
    username = scrapy
    password = secret

While your target needs to be defined with its URL in ``scrapy.cfg``,
you can use netrc_ for username and password, like so::

    machine scrapyd.example.com
        username scrapy
        password secret

If you want to list all available targets, you can use the ``-l`` option::

    scrapyd-deploy -l

To list projects available on a specific target, use the ``-L`` option::

    scrapyd-deploy -L example

.. _netrc: https://www.gnu.org/software/inetutils/manual/html_node/The-_002enetrc-file.html
