Rod

A Devtools driver for web automation and scraping.

Rod is a high-level driver built directly on the Chrome DevTools Protocol.
It's designed for web automation and scraping. Rod also exposes its low-level interfaces, so whenever a function is missing, users can send control requests to the browser directly.

Features

  • Fluent interface design to reduce verbose code
  • Chained context design that makes it intuitive to time out or cancel a long-running task
  • Debugging friendly: automatic input tracing and remote monitoring of the headless browser
  • Automatically finds or downloads the browser
  • No external dependencies, CI tested on Linux, Mac, and Windows
  • High-level helpers like WaitStable, WaitRequestIdle, GetDownloadFile, Resource
  • Two-step WaitEvent design, never miss an event
  • Correctly handles nested iframes
  • No zombie Chrome process is left behind after a crash (how it works)
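
To illustrate the two-step WaitEvent idea from the list above, here is a minimal sketch that uses only the standard library. It is not Rod's implementation: the Bus type and its Emit method are hypothetical stand-ins for the browser's event stream. The point is only the ordering: the listener is registered first, the action is triggered second, and the blocking wait comes last, so an event that fires immediately is not lost.

```go
package main

import (
	"fmt"
	"sync"
)

// Bus is a hypothetical, minimal event broadcaster used only for this sketch.
type Bus struct {
	mu   sync.Mutex
	subs []chan string
}

// Emit delivers an event to every listener registered so far.
// Events emitted before a listener exists are lost.
func (b *Bus) Emit(name string) {
	b.mu.Lock()
	defer b.mu.Unlock()
	for _, ch := range b.subs {
		select {
		case ch <- name:
		default: // listener buffer is full, drop the event
		}
	}
}

// WaitEvent is step one: it registers the listener immediately and returns
// a function. Calling that function is step two: it blocks until the named
// event arrives. Unsubscribing is omitted to keep the sketch short.
func (b *Bus) WaitEvent(name string) (wait func()) {
	ch := make(chan string, 16)
	b.mu.Lock()
	b.subs = append(b.subs, ch)
	b.mu.Unlock()
	return func() {
		for got := range ch {
			if got == name {
				return
			}
		}
	}
}

// navigateAndWait shows the ordering that avoids a missed event.
func navigateAndWait(b *Bus) string {
	wait := b.WaitEvent("load") // step 1: subscribe before acting
	b.Emit("load")              // the action fires the event right away
	wait()                      // step 2: the event is already buffered
	return "loaded"
}

func main() {
	fmt.Println(navigateAndWait(&Bus{}))
}
```

If the subscription happened inside a single blocking call made after the action, the event emitted in between would have no listener and the wait would hang.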

Examples

You can find examples here or here.

For more detailed examples, please search the unit tests.
For example, to see how the HandleAuth method is used, search all the *_test.go files for HandleAuth or HandleAuthE.
You can also search the GitHub issues, which contain many usage examples too.

If you have questions, please raise an issue or join the gitter room.

How it works

Here is Rod's typical startup process:

  1. Try to connect to a Chrome Devtools endpoint. If none is found, try to launch a local browser. If no browser is found either, try to download one, then connect again. The lib that handles this is here.

  2. Use JSON-RPC to talk to the browser endpoint and control it. The client that handles this is here.

  3. The type definitions of the data transmitted via JSON-RPC are handled by this lib.

  4. To control a specific page, Rod first injects a js helper script into it. Rod uses this script to query and manipulate the page content. The js lib is here.
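
As a rough illustration of step 2, the sketch below builds and parses the kind of JSON message the Chrome DevTools Protocol exchanges: a request carries an id, a method name, and params, and the response echoes the id with a result or an error. The Request and Response types and the helper functions are written for this example and are not Rod's client code. The frameId value "F1" is a made-up sample.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Request is the shape of a command sent to the browser: a caller-chosen id,
// a "Domain.method" name, and optional parameters.
type Request struct {
	ID     int         `json:"id"`
	Method string      `json:"method"`
	Params interface{} `json:"params,omitempty"`
}

// Response carries the same id back with either a result or an error.
type Response struct {
	ID     int             `json:"id"`
	Result json.RawMessage `json:"result,omitempty"`
	Error  *struct {
		Code    int    `json:"code"`
		Message string `json:"message"`
	} `json:"error,omitempty"`
}

// encodeNavigate serializes a Page.navigate command for the given URL.
func encodeNavigate(id int, url string) (string, error) {
	b, err := json.Marshal(Request{
		ID:     id,
		Method: "Page.navigate",
		Params: map[string]string{"url": url},
	})
	return string(b), err
}

// decodeResponse parses a raw message received from the browser.
func decodeResponse(raw string) (Response, error) {
	var r Response
	err := json.Unmarshal([]byte(raw), &r)
	return r, err
}

func main() {
	msg, err := encodeNavigate(1, "https://example.com")
	if err != nil {
		panic(err)
	}
	fmt.Println(msg)

	// A sample response, hand-written for this sketch.
	res, err := decodeResponse(`{"id":1,"result":{"frameId":"F1"}}`)
	if err != nil {
		panic(err)
	}
	fmt.Println(res.ID, string(res.Result))
}
```

In a real session these messages travel over a WebSocket connection to the endpoint, and the id is what lets the client match each response to the request that caused it.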

FAQ

Q: How to use Rod with docker

Getting Rod to work with Docker is straightforward:

  1. Run the Rod image: docker run -p 9222:9222 ysmood/rod

  2. Open another terminal and run a Go program like this example

The Rod image can dynamically launch a Chrome instance for each remote driver, with customizable Chrome flags.
It's tuned for screenshots and for fonts of popular natural languages.
You can easily load balance requests across a cluster of this image, and each container can create multiple browser instances at the same time.

Q: Does it support other browsers like Firefox or Edge

Rod should work with any browser that supports Chrome DevTools Protocol.
Firefox currently supports this protocol, and Edge will adopt Chromium as its backend, so it seems that most major browsers except Safari will support it in the future.

Q: Why is it called Rod

Rod is related to puppetry, see Rod Puppet.
We are the puppeteer, Chrome is the puppet, and we use the rod to control the puppet.
In this sense, puppeteer.js sounds strange: are we controlling a puppeteer?

Q: How to contribute

Please check this doc.

Q: How versioning is handled

Semver is used.

Before v1.0.0, a change in the second section, such as v0.1.0 to v0.2.0, means the public API has changed in some way, such as function names or parameter types. If only the last section changes, the public API stays the same.
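
The rule above can be expressed as a small check. This is only a sketch: the function names parse and mayChangeAPI are invented for this example, and it assumes plain three-section versions with an optional leading "v". A change in the first section is treated as a possible API change as well, which is an assumption the text above does not spell out.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parse splits a version such as "v0.2.1" into its three numeric sections.
func parse(v string) ([3]int, error) {
	var out [3]int
	parts := strings.Split(strings.TrimPrefix(v, "v"), ".")
	if len(parts) != 3 {
		return out, fmt.Errorf("invalid version %q", v)
	}
	for i, p := range parts {
		n, err := strconv.Atoi(p)
		if err != nil {
			return out, fmt.Errorf("invalid version %q: %w", v, err)
		}
		out[i] = n
	}
	return out, nil
}

// mayChangeAPI reports whether moving between two versions can change the
// public API: true when the first or second section differs, false when
// only the last section differs.
func mayChangeAPI(from, to string) (bool, error) {
	a, err := parse(from)
	if err != nil {
		return false, err
	}
	b, err := parse(to)
	if err != nil {
		return false, err
	}
	return a[0] != b[0] || a[1] != b[1], nil
}

func main() {
	for _, pair := range [][2]string{{"v0.1.0", "v0.2.0"}, {"v0.2.0", "v0.2.5"}} {
		changed, err := mayChangeAPI(pair[0], pair[1])
		if err != nil {
			panic(err)
		}
		fmt.Printf("%s -> %s: public API may change: %v\n", pair[0], pair[1], changed)
	}
}
```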

Q: Why another puppeteer-like lib

There are a lot of great projects, but none of them is perfect. What matters is choosing the one that best fits your needs.

  • selenium

    It's slower by design because it encourages the use of hard-coded sleep. When working with Rod, you generally don't use sleep at all.
    As a result, selenium tests are more error-prone when the network is unstable.
    It's harder to set up and maintain because of extra dependencies like a browser driver.

  • puppeteer

    With Puppeteer, you have to handle promise/async/await a lot. It requires a deep understanding of how promises work, which is usually painful for QA engineers writing automation tests. End-to-end tests usually require a lot of sync operations to simulate human input. Because Puppeteer is based on Nodejs, all the control signals it sends to Chrome are async calls, so it's unfriendly for QA from the beginning.

  • chromedp

    With Chromedp, you have to use its verbose DSL-like tasks to handle the main logic. Chromedp uses several wrappers to handle execution with context and options, which makes its code very hard to understand when bugs happen. The DSL-like wrappers also make Go's types useless when tracking issues.

    It's painful to use Chromedp to deal with iframes; this ticket is still open after years.

    When a crash happens, Chromedp leaves a zombie Chrome process behind on Windows and Mac.

  • cypress

    Cypress is very limited. For closed shadow DOM or cross-domain iframes it's almost unusable. Read their limitation doc for more details.
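
To make the contrast with hard-coded sleep in the selenium item concrete, here is a generic sketch of waiting on a condition under a context deadline. It uses only the standard library and is not Rod's code: waitFor and demo are names made up for this example. The wait ends as soon as the condition holds, and the context bounds how long it can take.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// waitFor checks ready at the given interval until it returns true or the
// context is cancelled or times out.
func waitFor(ctx context.Context, interval time.Duration, ready func() bool) error {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		if ready() {
			return nil
		}
		select {
		case <-ctx.Done():
			return ctx.Err()
		case <-ticker.C:
		}
	}
}

// demo runs waitFor against a condition that becomes true on the third check
// and against one that never becomes true.
func demo() (readyErr, timeoutErr error) {
	ctx, cancel := context.WithTimeout(context.Background(), time.Second)
	defer cancel()
	calls := 0
	readyErr = waitFor(ctx, 5*time.Millisecond, func() bool {
		calls++
		return calls >= 3
	})

	short, cancelShort := context.WithTimeout(context.Background(), 30*time.Millisecond)
	defer cancelShort()
	timeoutErr = waitFor(short, 5*time.Millisecond, func() bool { return false })
	return readyErr, timeoutErr
}

func main() {
	readyErr, timeoutErr := demo()
	fmt.Println("ready case error:", readyErr)
	fmt.Println("timed out:", errors.Is(timeoutErr, context.DeadlineExceeded))
}
```

A fixed sleep has to be long enough for the slowest run, so every run pays for the worst case, and a run that is slower still fails anyway.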
