wikiracer

Finds the shortest path between two Wikipedia articles, using only Wikipedia links.

Approach

Wikiracer runs a one-way parallel breadth-first search (BFS) from the given start URL, crawling the graph of Wikipedia articles until it reaches the target URL.

At each level of the BFS, the work is shared across a number of goroutines. These goroutines fetch work from a common input channel, which streams the links found at the previous level. Each goroutine crawls the articles it receives and sends the links found in them to a common output channel. The main function collects the output into an array, removes duplicates and links that have already been crawled, and starts the next batch of goroutines to crawl the new links.

For simplicity, Wikiracer only uses English articles (URL prefix: en.wikipedia.org/wiki/).
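One way to enforce this restriction is to check each link before it is queued. The helper below is a hedged sketch: isEnglishArticle is a hypothetical name, and it assumes links are absolute URLs that include a scheme. Whether the real crawler applies further filtering is not stated here.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

const (
	wikiHost   = "en.wikipedia.org"
	wikiPrefix = "/wiki/"
)

// isEnglishArticle reports whether raw points at en.wikipedia.org/wiki/<title>.
// Hypothetical helper: it assumes raw is an absolute URL with a scheme.
func isEnglishArticle(raw string) bool {
	u, err := url.Parse(raw)
	if err != nil {
		return false
	}
	return u.Host == wikiHost &&
		strings.HasPrefix(u.Path, wikiPrefix) &&
		len(u.Path) > len(wikiPrefix) // require a non-empty title
}

func main() {
	fmt.Println(isEnglishArticle("https://en.wikipedia.org/wiki/Blade_Runner")) // prints true
	fmt.Println(isEnglishArticle("https://de.wikipedia.org/wiki/Blade_Runner")) // prints false
}
```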

Installation

$ go get github.com/meditativeape/wikiracer
$ cd $GOPATH/src/github.com/meditativeape/wikiracer
$ make install

Usage

Start Wikiracer by running the wikiracer binary. It spins up an HTTP server that listens on port 8080.

Wikiracer offers one REST endpoint, POST /race, that expects two keys in the POST form: startUrl and endUrl. It returns the path it found as JSON. You can use any HTTP client, such as cURL or Postman, to query this endpoint.

Example request as a cURL command:

$ curl localhost:8080/race -F startUrl=https://en.wikipedia.org/wiki/Computer_programming -F endUrl=https://en.wikipedia.org/wiki/Blade_Runner

Logging

Wikiracer keeps a lightweight log at /tmp/wikiracer/service.log.

License

The MIT License

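Key metrics

Overview
Name and owner: meditativeape/wikiracer
Primary language: Go
Languages: Go (2 languages in total)
License: MIT License

Owner activity
Created: 2017-10-23 00:39:46
Pushed: 2017-10-27 18:02:26
Last commit: 2017-10-27 11:02:19
Releases: 0

User engagement
Stars: 33
Watchers: 1
Forks: 1
Commits: 11
Issues: 0
Open issues: 0
Pull requests: 0
Open pull requests: 0
Closed pull requests: 0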