Colly

Elegant Scraper and Crawler Framework for Golang

Lightning Fast and Elegant Scraping Framework for Gophers

Colly provides a clean interface to write any kind of crawler/scraper/spider.

With Colly you can easily extract structured data from websites, which can be used for a wide range of applications, like data mining, data processing or archiving.

Features

  • Clean API
  • Fast (>1k request/sec on a single core)
  • Manages request delays and maximum concurrency per domain
  • Automatic cookie and session handling
  • Sync/async/parallel scraping
  • Caching
  • Automatic encoding of non-unicode responses
  • Robots.txt support
  • Distributed scraping
  • Configuration via environment variables
  • Extensions
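To make the "request delays and maximum concurrency per domain" item concrete, here is a minimal standard-library sketch of the idea. It is not Colly's implementation and does not use Colly's API (Colly configures this through limit rules on the collector). The `domainLimiter` type, its methods, and `maxInFlight` are hypothetical names invented for this illustration, and the "request" is simulated with a short sleep.

```go
package main

import (
	"fmt"
	"net/url"
	"sync"
	"time"
)

// domainLimiter is a hypothetical helper (not part of Colly) that caps the
// number of in-flight tasks per host and pauses after each task.
type domainLimiter struct {
	mu          sync.Mutex
	slots       map[string]chan struct{} // one semaphore per host
	parallelism int
	delay       time.Duration
}

func newDomainLimiter(parallelism int, delay time.Duration) *domainLimiter {
	return &domainLimiter{
		slots:       make(map[string]chan struct{}),
		parallelism: parallelism,
		delay:       delay,
	}
}

// slotFor returns the semaphore for a host, creating it on first use.
func (l *domainLimiter) slotFor(host string) chan struct{} {
	l.mu.Lock()
	defer l.mu.Unlock()
	s, ok := l.slots[host]
	if !ok {
		s = make(chan struct{}, l.parallelism)
		l.slots[host] = s
	}
	return s
}

// Do runs task while holding one of the host's slots, then waits for the
// configured delay before releasing the slot.
func (l *domainLimiter) Do(rawURL string, task func()) error {
	u, err := url.Parse(rawURL)
	if err != nil {
		return err
	}
	slot := l.slotFor(u.Host)
	slot <- struct{}{}        // blocks while the host is at its limit
	defer func() { <-slot }() // free the slot when done
	task()
	time.Sleep(l.delay)
	return nil
}

// maxInFlight starts n simulated requests against one URL and reports the
// highest number that were running at the same time.
func maxInFlight(l *domainLimiter, rawURL string, n int) int {
	var (
		mu            sync.Mutex
		current, peak int
		wg            sync.WaitGroup
	)
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			_ = l.Do(rawURL, func() {
				mu.Lock()
				current++
				if current > peak {
					peak = current
				}
				mu.Unlock()
				time.Sleep(5 * time.Millisecond) // stand-in for a network request
				mu.Lock()
				current--
				mu.Unlock()
			})
		}()
	}
	wg.Wait()
	return peak
}

func main() {
	l := newDomainLimiter(2, 10*time.Millisecond)
	fmt.Println("peak concurrent requests:", maxInFlight(l, "http://go-colly.org/", 6))
}
```

A buffered channel per host acts as a counting semaphore, so a slow or rate-limited site only holds back requests to that site. The reported peak never exceeds the configured parallelism.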

Example

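package main

import (
	"fmt"

	"github.com/gocolly/colly/v2"
)

func main() {
	// Create a collector with the default settings
	c := colly.NewCollector()

	// Find and visit all links
	c.OnHTML("a[href]", func(e *colly.HTMLElement) {
		e.Request.Visit(e.Attr("href"))
	})

	// Log every request before it is sent
	c.OnRequest(func(r *colly.Request) {
		fmt.Println("Visiting", r.URL)
	})

	// Start scraping from the entry page
	c.Visit("http://go-colly.org/")
}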

See examples folder for more detailed examples.
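The example above passes raw `href` values to `e.Request.Visit`, and links found on a page are often relative. As a rough illustration of what following such a link involves, the sketch below resolves an `href` against the URL of the page it was found on, using only the standard library. `resolveLink` is a hypothetical helper, the paths under `http://go-colly.org/` are made up for the example, and nothing here claims to mirror Colly's internals.

```go
package main

import (
	"fmt"
	"net/url"
)

// resolveLink is a hypothetical helper: it turns an href found on a page
// into an absolute URL, using the page's own URL as the base.
func resolveLink(pageURL, href string) (string, error) {
	base, err := url.Parse(pageURL)
	if err != nil {
		return "", err
	}
	ref, err := url.Parse(href)
	if err != nil {
		return "", err
	}
	// ResolveReference applies the RFC 3986 rules for relative references.
	return base.ResolveReference(ref).String(), nil
}

func main() {
	page := "http://go-colly.org/docs/" // assumed page URL for illustration
	hrefs := []string{
		"/articles/",                       // root-relative
		"introduction/start/",              // relative to the current path
		"https://github.com/gocolly/colly", // already absolute
	}
	for _, href := range hrefs {
		abs, err := resolveLink(page, href)
		if err != nil {
			fmt.Println("skipping", href, ":", err)
			continue
		}
		fmt.Println(href, "->", abs)
	}
}
```

A crawler that follows every link also benefits from deduplicating the resolved URLs, since the same page is commonly reachable through several different relative forms.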

Installation

go get -u github.com/gocolly/colly/v2/...

Bugs

Bugs or suggestions? Visit the issue tracker or join #colly on freenode.

Other Projects Using Colly

Below is a list of public, open source projects that use Colly:

If you are using Colly in a project please send a pull request to add it to the list.

Contributors

This project exists thanks to all the people who contribute (see CONTRIBUTING.md).

Backers

Thank you to all our backers! [Become a backer]

Sponsors

Support this project by becoming a sponsor. Your logo will show up here with a link to your website. [Become a sponsor]

License

FOSSA Status

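Key Metrics

Overview
Name and owner: gocolly/colly
Primary language: Go
Languages: Go (2 languages in total)
Platforms: Linux, Mac, Windows
License: Apache License 2.0
Owner Activity
Created: 2017-09-29 14:08:49
Last pushed: 2025-06-17 07:44:45
Last commit: 2025-06-17 09:44:45
Releases: 7
Latest release: v2.2.0 (released 2025-03-27 11:42:17)
First release: v1.0.0 (released 2018-05-13 00:44:45)
User Engagement
Stars: 24.3k
Watchers: 325
Forks: 1.8k
Commits: 724
Issues enabled?
Issues: 560
Open issues: 148
Pull requests: 182
Open pull requests: 44
Closed pull requests: 67
Project Settings
Wiki enabled?
Archived?
Is a fork?
Locked?
Is a mirror?
Is private?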