scrapinghub-image-casperjs

Recommended base Docker image for CasperJS spiders at Scrapinghub

shub-exec

shub-exec is a utility that converts project and spider settings to environment variables and
job arguments to CasperJS command line options.

Use it as the last command in the start-crawl script. Before it execs the real spider script, it sets up
the environment from the settings in the SHUB_SETTINGS environment variable and builds the arguments
from the SHUB_JOB_DATA environment variable.

An example start-crawl for CasperJS is:

#!/bin/sh
exec shub-exec -- casperjs --debug /app/$SHUB_SPIDER

For a job whose spider name is simple.js and whose job arguments are url=http://scrapinghub.com, it will run:

casperjs --debug /app/simple.js --url=http://scrapinghub.com
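The mapping from job arguments to command line options can be sketched in Python as follows. This is a minimal illustration, not the shub-exec source: the helper name job_args_to_options is hypothetical, and it assumes the job arguments are available as a flat mapping of names to values.

```python
def job_args_to_options(job_args):
    """Turn {"url": "http://..."} into ["--url=http://..."].

    Hypothetical helper: assumes a flat mapping of argument names to values.
    """
    return ["--{}={}".format(name, value) for name, value in job_args.items()]


# Command from the start-crawl example, with $SHUB_SPIDER already expanded.
command = ["casperjs", "--debug", "/app/simple.js"]
command += job_args_to_options({"url": "http://scrapinghub.com"})
print(" ".join(command))
# prints: casperjs --debug /app/simple.js --url=http://scrapinghub.com
```

Appending the generated options after the script path keeps them next to the spider script, which matches the command line shown above.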

It's important to put -- before the positional arguments: it separates shub-exec options
from the options of the script being run.
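To see why the separator matters, here is a small Python sketch of splitting an argument list at the first --. The helper split_on_separator is hypothetical and only illustrates the convention; it assumes everything before -- belongs to the wrapper and everything after it is the command to run.

```python
def split_on_separator(argv):
    """Split argv at the first "--".

    Returns (wrapper_options, command). Hypothetical helper for illustration.
    """
    if "--" not in argv:
        # Without a separator there is no reliable way to tell the two apart.
        return list(argv), []
    index = argv.index("--")
    return argv[:index], argv[index + 1:]


options, command = split_on_separator(["--", "casperjs", "--debug", "/app/simple.js"])
print(options)  # []
print(command)  # ['casperjs', '--debug', '/app/simple.js']
```

Without the separator, an option such as --debug could be read as belonging to either the wrapper or casperjs.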

If the job has a setting, e.g. LOGLEVEL=DEBUG, it will be available in the CasperJS
process environment as 'LOGLEVEL' with value 'DEBUG'.
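The way a setting reaches the child process environment can be sketched in Python as follows. This is an assumption-laden illustration, not the shub-exec implementation: settings_to_env is a hypothetical helper, the settings are assumed to be a flat name/value mapping, and a Python child process stands in for CasperJS.

```python
import os
import subprocess
import sys


def settings_to_env(settings, base_env=None):
    """Return a copy of base_env with each setting added as a string variable."""
    env = dict(os.environ if base_env is None else base_env)
    for name, value in settings.items():
        env[name] = str(value)  # environment values must be strings
    return env


env = settings_to_env({"LOGLEVEL": "DEBUG"})
# A Python child stands in for the CasperJS process here.
child = [sys.executable, "-c", "import os; print(os.environ['LOGLEVEL'])"]
output = subprocess.check_output(child, env=env).decode().strip()
print(output)  # DEBUG
```

Copying the current environment first keeps variables such as PATH available to the child, so the spider command can still be found.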

Overview

Name With Owner: scrapinghub/scrapinghub-image-casperjs
Primary Language: Python
Program language: Python (Language Count: 1)
Release Count: 6
Last Release Name: 0.0.6
First Release Name: 0.0.1
Created At: 2017-05-05 19:09:39
Pushed At: 2017-06-07 17:52:52
Last Commit At: 2017-06-07 18:52:31
Stargazers Count: 0
Watchers Count: 3
Fork Count: 1
Commits Count: 19
Issues Count: 0
Issue Open Count: 0
Pull Requests Count: 4
Pull Requests Open Count: 0
Pull Requests Close Count: 0