html5ever

High-performance browser-grade HTML5 parser

  • Owner: servo/html5ever
  • License: Other

API Documentation

html5ever is an HTML parser developed as part of the Servo project.

It can parse and serialize HTML according to the WHATWG specs (aka "HTML5"). There are some omissions at present, most of which are documented in the bug tracker. html5ever passes all tokenizer tests from html5lib-tests, and most tree builder tests outside of the unimplemented features. The goal is to pass all html5lib tests, and also provide all hooks needed by a production web browser, e.g. document.write.

Note that the HTML syntax is a language almost, but not quite, entirely unlike XML. For correct parsing of XHTML, use an XML parser. (That said, many XHTML documents in the wild are serialized in an HTML-compatible form.)

html5ever is written in Rust, so it avoids the most notorious security problems from C, but has performance similar to a parser written in C. You can call html5ever as if it were a C library, without pulling in a garbage collector or other heavy runtime requirements.

Getting started in Rust

Add html5ever as a dependency in your Cargo.toml file:

[dependencies]
html5ever = "*"

Then take a look at examples/html2html.rs and examples/print-rcdom.rs and the API documentation.

Getting started in other languages

Bindings for Python and other languages are much desired.

Working on html5ever

To fetch the test suite, run:

git submodule update --init

Run cargo doc in the repository root to build local documentation under target/doc/.

Details

html5ever uses callbacks to manipulate the DOM, and does not provide any DOM tree representation.
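The callback-driven design described above can be illustrated with a toy sketch. The names `EventSink`, `drive`, and `Recorder` are hypothetical and are not html5ever's API (its real sink interface is considerably richer); the scanner is deliberately naive and does not follow the HTML parsing rules. The point is only the shape of the design: the parser reports events, and the caller decides what tree, if any, to build.

```rust
// Toy illustration of a callback-driven parser. The driver never builds a
// tree itself; it only reports events to the sink supplied by the caller.
// `EventSink`, `drive`, and `Recorder` are hypothetical names for this
// sketch and are NOT part of html5ever.

/// Callbacks the caller implements to build (or skip building) its own DOM.
pub trait EventSink {
    fn start_tag(&mut self, name: &str);
    fn end_tag(&mut self, name: &str);
    fn text(&mut self, data: &str);
}

/// Naive scanner: splits input on `<...>` and forwards events to the sink.
/// It ignores attributes, comments, entities, and every real HTML rule.
pub fn drive<S: EventSink>(input: &str, sink: &mut S) {
    let mut rest = input;
    while let Some(open) = rest.find('<') {
        if open > 0 {
            sink.text(&rest[..open]);
        }
        let after = &rest[open + 1..];
        match after.find('>') {
            Some(close) => {
                let tag = &after[..close];
                if let Some(name) = tag.strip_prefix('/') {
                    sink.end_tag(name);
                } else {
                    sink.start_tag(tag);
                }
                rest = &after[close + 1..];
            }
            None => {
                // Unterminated '<': report the remainder as text and stop.
                sink.text(&rest[open..]);
                rest = "";
            }
        }
    }
    if !rest.is_empty() {
        sink.text(rest);
    }
}

/// Example sink that records events as strings instead of building a tree.
#[derive(Default)]
pub struct Recorder {
    pub events: Vec<String>,
}

impl EventSink for Recorder {
    fn start_tag(&mut self, name: &str) {
        self.events.push(format!("start:{}", name));
    }
    fn end_tag(&mut self, name: &str) {
        self.events.push(format!("end:{}", name));
    }
    fn text(&mut self, data: &str) {
        self.events.push(format!("text:{}", data));
    }
}
```

A caller would create a `Recorder::default()`, pass it to `drive` together with the input, and then inspect `events`. A different sink could build a reference-counted tree, write into an arena, or discard everything except the links it cares about, all without changes to the driver.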

html5ever exclusively uses UTF-8 to represent strings. In the future it will support other document encodings (and UCS-2 document.write) by converting input.
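Until such conversion is built in, a caller holding UTF-16 or UCS-2 code units (for example, text passed to a `document.write`-style hook) would need to transcode them to UTF-8 before handing them to a UTF-8-only parser. The following is a minimal sketch using only the Rust standard library; the function name `utf16_units_to_utf8` is hypothetical, and replacing unpaired surrogates with U+FFFD is an assumption about the desired policy, not something the project specifies.

```rust
use std::char::{decode_utf16, REPLACEMENT_CHARACTER};

/// Convert UTF-16 code units into a UTF-8 `String`.
/// Unpaired surrogates (which UCS-2 data may contain) are replaced with
/// U+FFFD rather than causing an error. This policy is an assumption made
/// for the sketch.
pub fn utf16_units_to_utf8(units: &[u16]) -> String {
    decode_utf16(units.iter().copied())
        .map(|unit| unit.unwrap_or(REPLACEMENT_CHARACTER))
        .collect()
}
```

The standard library's `String::from_utf16_lossy` applies the same replacement policy in one call; the explicit form is shown here because it makes the handling of invalid input visible.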

The code is cross-referenced with the WHATWG syntax spec, and eventually we will have a way to present code and spec side-by-side.

html5ever builds against the official stable releases of Rust, though some optimizations are only supported on nightly releases.

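Key metrics

Overview
  • Name and owner: servo/html5ever
  • Primary language: Rust
  • Languages: HTML (language count: 3)
  • License: Other

Owner activity
  • Created: 2014-03-13 02:04:18
  • Last push: 2025-06-20 17:32:46
  • Releases: 73
  • Latest release: web_atoms-v0.1.3
  • First release: v0.1.0

User engagement
  • Stars: 2.3k
  • Watchers: 42
  • Forks: 237
  • Commits: 1.4k
  • Issues: 201
  • Open issues: 53
  • Pull requests: 357
  • Open pull requests: 8
  • Closed pull requests: 54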