ants-go

open source, distributed, restful crawler engine in golang

  • 所有者: wcong/ants-go
  • 平台:
  • 許可證: MIT License
  • 分類:
  • 主題:
  • 喜歡:
    0
      比較:

Github星跟蹤圖

ants-go

open source, restful, distributed crawler engine

gitter

Join the chat at https://gitter.im/wcong/ants-go

comming up

  • Persistence
  • Dynamic Master

design of ants-go

ants

I wrote a crawler engine named ants in python base on scrapy. But sometimes, dynamic language is chaos.
So I start to write it in a compile language.

scrapy

I design the crawler framework by imitating scrapy.
such as downloader,scraper,and the way user write customize spider,
but in a compile way

elasticsearch

I design my distributed architecture by imitating elasticsearch.
it spire me to do a engine for distributed crawler

requirement

go get github.com/PuerkitoBio/goquery
go get github.com/go-sql-driver/mysql

install

go get github.com/wcong/ants-go
go install github.com/wcong/ants-go

run

cd bin
./ants-go

check cluster status

curl 'http://localhost:8200/cluster'

get all spiders

curl 'http://localhost:8200/spiders'

start a spider

curl 'http://localhost:8200/crawl?spider=spiderName'

cluster in one computer

to test cluster in one computer,you can run it from different port in different terminal

one node,use the default port tcp 8300 http 8200

cd bin
./ants-go

the other node set tcp port and http port

cd bin
./ants-go -tcp 9300 -http 9200

flags

there are some flags you can set,check out the help message

./ants-go -h
./ants-go -help

Customize spider

  1. go to spiders
  2. write your spiders follow the example deap_loop_spider.go or go to the spider page
  3. add you spider to spiderMap,follow the example in LoadAllSpiders in load_all_spider.go
  4. install again

主要指標

概覽
名稱與所有者wcong/ants-go
主編程語言Go
編程語言Go (語言數: 1)
平台
許可證MIT License
所有者活动
創建於2015-02-02 07:21:03
推送於2016-04-05 05:03:38
最后一次提交2016-04-05 13:03:25
發布數0
用户参与
星數362
關注者數44
派生數118
提交數80
已啟用問題?
問題數18
打開的問題數4
拉請求數1
打開的拉請求數0
關閉的拉請求數0
项目设置
已啟用Wiki?
已存檔?
是復刻?
已鎖定?
是鏡像?
是私有?