scraper

HTML parsing and querying with CSS selectors

  • Owner: rust-scraper/scraper
  • License: ISC License


This project is looking for maintainer(s): #36


scraper is on Crates.io and GitHub.

Scraper provides an interface to Servo's html5ever and selectors crates, for browser-grade
parsing and querying.

Examples

Parsing a document

use scraper::Html;

let html = r#"
    <!DOCTYPE html>
    <meta charset="utf-8">
    <title>Hello, world!</title>
    <h1 class="foo">Hello, <i>world!</i></h1>
"#;

let document = Html::parse_document(html);

Parsing a fragment

use scraper::Html;
let fragment = Html::parse_fragment("<h1>Hello, <i>world!</i></h1>");

Parsing a selector

use scraper::Selector;
let selector = Selector::parse("h1.foo").unwrap();

Selecting elements

use scraper::{Html, Selector};

let html = r#"
    <ul>
        <li>Foo</li>
        <li>Bar</li>
        <li>Baz</li>
    </ul>
"#;

let fragment = Html::parse_fragment(html);
let selector = Selector::parse("li").unwrap();

for element in fragment.select(&selector) {
    assert_eq!("li", element.value().name());
}

Selecting descendant elements

use scraper::{Html, Selector};

let html = r#"
    <ul>
        <li>Foo</li>
        <li>Bar</li>
        <li>Baz</li>
    </ul>
"#;

let fragment = Html::parse_fragment(html);
let ul_selector = Selector::parse("ul").unwrap();
let li_selector = Selector::parse("li").unwrap();

let ul = fragment.select(&ul_selector).next().unwrap();
for element in ul.select(&li_selector) {
    assert_eq!("li", element.value().name());
}

Accessing element attributes

use scraper::{Html, Selector};

let fragment = Html::parse_fragment(r#"<input name="foo" value="bar">"#);
let selector = Selector::parse(r#"input[name="foo"]"#).unwrap();

let input = fragment.select(&selector).next().unwrap();
assert_eq!(Some("bar"), input.value().attr("value"));

Serializing HTML and inner HTML

use scraper::{Html, Selector};

let fragment = Html::parse_fragment("<h1>Hello, <i>world!</i></h1>");
let selector = Selector::parse("h1").unwrap();

let h1 = fragment.select(&selector).next().unwrap();

assert_eq!("<h1>Hello, <i>world!</i></h1>", h1.html());
assert_eq!("Hello, <i>world!</i>", h1.inner_html());

Accessing descendant text

use scraper::{Html, Selector};

let fragment = Html::parse_fragment("<h1>Hello, <i>world!</i></h1>");
let selector = Selector::parse("h1").unwrap();

let h1 = fragment.select(&selector).next().unwrap();
let text = h1.text().collect::<Vec<_>>();

assert_eq!(vec!["Hello, ", "world!"], text);

License: ISC

Main metrics

Overview
Name With Owner: rust-scraper/scraper
Primary Language: Rust
Program language: Rust (Language Count: 2)
License: ISC License
Owner Activity
Created At: 2016-01-01 21:45:09
Pushed At: 2025-06-11 20:27:00
Last Commit At: 2025-06-11 22:27:00
Release Count: 34
Last Release Name: v0.23.1
First Release Name: v0.1.0
User Engagement
Stargazers Count: 2.1k
Watchers Count: 20
Fork Count: 113
Commits Count: 412
Has Issues Enabled
Issues Count: 116
Issue Open Count: 10
Pull Requests Count: 96
Pull Requests Open Count: 0
Pull Requests Close Count: 33
Project Settings
Has Wiki Enabled
Is Archived
Is Fork
Is Locked
Is Mirror
Is Private