Microdata
Microdata is a package for the Go programming language to extract HTML Microdata from HTML5 documents. It depends on the golang.org/x/net/html HTML5-compliant parser.
HTML Microdata is a markup specification often used in combination with the schema.org collection to make it easier for search engines to identify and understand content on web pages. One of the most common schemas is the rating you see in search results. Other schemas describe persons, places, events, products, etc.
Installation
Single binaries for Linux, macOS and Windows are available on the release page.
Or build from source:
$ go get -u github.com/namsral/microdata/cmd/microdata
Usage
Parse a URL:
$ microdata https://www.gog.com/game/...
{
  "items": [
    {
      "type": [
        "http://schema.org/Product"
      ],
      "properties": {
        "additionalProperty": [
          {
            "type": [
              "http://schema.org/PropertyValue"
            ],
          {
...
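The JSON above can be consumed from another Go program with the standard library alone. The following is a minimal sketch, not part of this package: the `Item` and `Result` types are local stand-ins that only mirror the keys visible in the output ("items", "type", "properties"), `topLevelTypes` is a hypothetical helper, and `sample` is made-up input in the same shape rather than real output.

```go
package main

import (
	"encoding/json"
	"fmt"
	"log"
)

// Item and Result are assumptions that mirror the JSON keys shown above;
// they are not the package's own types.
type Item struct {
	Type       []string                 `json:"type"`
	Properties map[string][]interface{} `json:"properties"`
}

type Result struct {
	Items []Item `json:"items"`
}

// topLevelTypes decodes CLI-style JSON and collects the types of every
// top-level item.
func topLevelTypes(raw []byte) ([]string, error) {
	var res Result
	if err := json.Unmarshal(raw, &res); err != nil {
		return nil, err
	}
	var types []string
	for _, it := range res.Items {
		types = append(types, it.Type...)
	}
	return types, nil
}

// sample is illustrative input, not output captured from the tool.
const sample = `{"items":[{"type":["http://schema.org/Product"],"properties":{"name":["Example"]}}]}`

func main() {
	types, err := topLevelTypes([]byte(sample))
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(types)
}
```

Property values are kept as `[]interface{}` because a value may be a plain string or a nested item.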
Parse HTML from stdin:
$ cat saved.html | microdata
Format the output with a Go template to return the "price" property:
$ microdata -format '{{with index .Items 0}}{{with index .Properties "offers" 0}}{{with index .Properties "price" 0 }}{{ . }}{{end}}{{end}}{{end}}' https://www.gog.com/game/...
8.99
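To see how the nested `with` and `index` calls in that template walk the data, here is a small sketch that runs the same template with the standard `text/template` package. The `Item` and `Result` types are hypothetical stand-ins chosen to match the field names the template uses (`.Items`, `.Properties`), and the product/offer data is made up for illustration.

```go
package main

import (
	"bytes"
	"fmt"
	"log"
	"text/template"
)

// Item and Result are assumptions matching the field names used by the
// template; they are not the package's own types.
type Item struct {
	Type       []string
	Properties map[string][]interface{}
}

type Result struct {
	Items []*Item
}

// priceFormat is the template from the command above.
const priceFormat = `{{with index .Items 0}}{{with index .Properties "offers" 0}}{{with index .Properties "price" 0 }}{{ . }}{{end}}{{end}}{{end}}`

// render executes a Go template against the data and returns the output.
func render(format string, data *Result) (string, error) {
	tmpl, err := template.New("format").Parse(format)
	if err != nil {
		return "", err
	}
	var buf bytes.Buffer
	if err := tmpl.Execute(&buf, data); err != nil {
		return "", err
	}
	return buf.String(), nil
}

// sampleResult builds made-up data: one Product with one Offer priced "8.99".
func sampleResult() *Result {
	offer := &Item{
		Type:       []string{"http://schema.org/Offer"},
		Properties: map[string][]interface{}{"price": {"8.99"}},
	}
	product := &Item{
		Type:       []string{"http://schema.org/Product"},
		Properties: map[string][]interface{}{"offers": {offer}},
	}
	return &Result{Items: []*Item{product}}
}

func main() {
	out, err := render(priceFormat, sampleResult())
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(out) // prints 8.99
}
```

Each `with` narrows the current value: first to the first item, then to its first "offers" value, then to that offer's first "price" value. Note that `index` fails with an error if a slice is empty, so a template like this assumes the properties exist.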
Features
- Windows/BSD/Linux supported
- Format output with Go templates
- Parse from Stdin
Contribution
Bug reports and feature requests are welcome. Follow GitHub's guide to using pull requests.
Go Package
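package main

import (
	"encoding/json"
	"log"
	"os"

	"github.com/namsral/microdata"
)

func main() {
	// Fetch the page and extract its Microdata items.
	data, err := microdata.ParseURL("http://example.com/blogposting")
	if err != nil {
		log.Fatal(err)
	}
	// Print the result as indented JSON.
	b, err := json.MarshalIndent(data, "", "  ")
	if err != nil {
		log.Fatal(err)
	}
	os.Stdout.Write(b)
}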
For documentation see godoc.org/github.com/namsral/microdata.