scrapinghub-image-casperjs

Recommended base Docker image for CasperJS spiders at Scrapinghub

  • Owner: scrapinghub/scrapinghub-image-casperjs

shub-exec

shub-exec is a utility that converts project and spider settings to environment variables and
job arguments to CasperJS command line options.

Use this tool as the last command in the start-crawl script, where it execs the real spider script.
Before doing so, it sets up the environment from the settings in the SHUB_SETTINGS environment
variable and builds the arguments from the SHUB_JOB_DATA environment variable.

An example start-crawl for CasperJS is:

#!/bin/sh
exec shub-exec -- casperjs --debug /app/$SHUB_SPIDER

For a job whose spider name is simple.js and job arguments are url=http://scrapinghub.com it will run:

casperjs --debug /app/simple.js --url=http://scrapinghub.com
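
The mapping from job arguments to command line options can be sketched in Python as follows. This is a minimal illustration, not the actual shub-exec code: it assumes the job data is a JSON object with a spider_args dictionary, and build_command is a hypothetical helper name.

```python
import json


def build_command(base_cmd, job_data_json):
    """Append one --key=value option per job argument to base_cmd.

    Assumption: the job data is a JSON object whose "spider_args" key
    maps argument names to values. build_command is a hypothetical name.
    """
    job_data = json.loads(job_data_json or "{}")
    spider_args = job_data.get("spider_args", {})
    # Sort the keys so the generated command line is deterministic.
    options = ["--{}={}".format(k, v) for k, v in sorted(spider_args.items())]
    return list(base_cmd) + options


cmd = build_command(
    ["casperjs", "--debug", "/app/simple.js"],
    '{"spider_args": {"url": "http://scrapinghub.com"}}',
)
print(" ".join(cmd))
```

With the sample job data above, the printed command matches the one shown for simple.js.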

It's important to put -- before the positional arguments: it separates shub-exec options
from the script's own options.
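
One way to picture the role of the -- separator is the Python sketch below. It is only an illustration under stated assumptions: split_at_separator is a hypothetical helper, and treating everything as tool options when no -- is present is a guess, not documented shub-exec behavior.

```python
def split_at_separator(argv):
    """Split argv at the first "--" into (tool_options, command).

    Hypothetical helper: everything before "--" is meant for the wrapper
    tool, everything after it is the command to run.
    """
    if "--" not in argv:
        # Assumption: without a separator there is no command part.
        return list(argv), []
    idx = argv.index("--")
    return list(argv[:idx]), list(argv[idx + 1:])


tool_opts, command = split_at_separator(
    ["--", "casperjs", "--debug", "/app/simple.js"]
)
print(tool_opts, command)
```

Because the split happens at the first --, an option such as --debug after the separator stays with the CasperJS command instead of being read as a wrapper option.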

If the job has a setting defined, e.g. LOGLEVEL=DEBUG, it will be available in the CasperJS
process environment as LOGLEVEL with the value DEBUG.
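
The settings-to-environment step might look roughly like the Python sketch below. The JSON layout (project_settings and spider_settings objects), the precedence of spider settings over project settings, and the settings_to_env name are all assumptions made for illustration, not confirmed details of shub-exec.

```python
import json


def settings_to_env(settings_json, base_env):
    """Return a copy of base_env extended with project and spider settings.

    Assumptions: the settings JSON has "project_settings" and
    "spider_settings" objects, and spider settings win on conflict.
    """
    settings = json.loads(settings_json or "{}")
    env = dict(base_env)
    for section in ("project_settings", "spider_settings"):
        for name, value in settings.get(section, {}).items():
            env[name] = str(value)  # environment values must be strings
    return env


env = settings_to_env(
    '{"project_settings": {"LOGLEVEL": "DEBUG"}}',
    {"PATH": "/usr/bin"},
)
print(env["LOGLEVEL"])
```

A dictionary built this way could then be handed to os.execvpe together with the spider command, so that the CasperJS process sees the settings in its environment.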

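Overview

Name and owner: scrapinghub/scrapinghub-image-casperjs
Primary language: Python
Languages: Python (language count: 1)
Releases: 6
Latest release: 0.0.6
First release: 0.0.1
Created at: 2017-05-05 19:09:39
Pushed at: 2017-06-07 17:52:52
Last commit: 2017-06-07 18:52:31
Stars: 0
Watchers: 3
Forks: 1
Commits: 19
Issues: 0
Open issues: 0
Pull requests: 4
Open pull requests: 0
Closed pull requests: 0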