Colly

Lightning Fast and Elegant Scraping Framework for Gophers

Colly provides a clean interface to write any kind of crawler/scraper/spider.

With Colly you can easily extract structured data from websites, which can be used for a wide range of applications, like data mining, data processing or archiving.

Features

  • Clean API
  • Fast (>1k requests/sec on a single core)
  • Manages request delays and maximum concurrency per domain
  • Automatic cookie and session handling
  • Sync/async/parallel scraping
  • Caching
  • Automatic encoding of non-unicode responses
  • Robots.txt support
  • Distributed scraping
  • Configuration via environment variables
  • Extensions

Example

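package main

import (
	"fmt"

	"github.com/gocolly/colly/v2"
)

func main() {
	// Create a collector with default settings
	c := colly.NewCollector()

	// Find and visit all links
	c.OnHTML("a[href]", func(e *colly.HTMLElement) {
		e.Request.Visit(e.Attr("href"))
	})

	// Print each URL before it is requested
	c.OnRequest(func(r *colly.Request) {
		fmt.Println("Visiting", r.URL)
	})

	// Start crawling from the seed URL
	c.Visit("http://go-colly.org/")
}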

See the examples folder for more detailed examples.

Installation

go get -u github.com/gocolly/colly/v2/...

Bugs

Bugs or suggestions? Visit the issue tracker or join #colly on freenode.

Other Projects Using Colly

Below is a list of public, open source projects that use Colly:

If you are using Colly in a project please send a pull request to add it to the list.

Contributors

This project exists thanks to all the people who contribute (see CONTRIBUTING.md).

Backers

Thank you to all our backers! [Become a backer]

Sponsors

Support this project by becoming a sponsor. Your logo will show up here with a link to your website. [Become a sponsor]

License

Colly is licensed under the Apache License 2.0.

Main metrics

Overview
Name with owner: gocolly/colly
Primary language: Go
Programming languages: Go (language count: 2)
Platform: Linux, Mac, Windows
License: Apache License 2.0
Owner activity
Created at: 2017-09-29 14:08:49
Pushed at: 2025-03-28 16:07:32
Last commit at: 2025-03-28 17:07:32
Release count: 7
Last release name: v2.2.0 (posted on 2025-03-27 11:42:17)
First release name: v1.0.0 (posted on 2018-05-13 00:44:45)
User engagement
Stargazers count: 24.1k
Watchers count: 327
Fork count: 1.8k
Commits count: 685
Has Issues Enabled
Issues count: 558
Open issues count: 149
Pull requests count: 172
Open pull requests count: 46
Closed pull requests count: 64
Project settings
Has Wiki Enabled
Is Archived
Is Fork
Is Locked
Is Mirror
Is Private