Hawk: Advanced ETL and Crawler Stream Engine

A visual crawler and ETL integrated development environment written in C#/WPF.




Welcome to Hawk! Hawk is a graphical tool, written in C#/WPF, that crawls web pages and cleans, processes, and saves data without programming. It is open source under the GPL license.

Introduction

The name Hawk refers to the bird of prey, which kills its prey efficiently and accurately. Its design is inspired by the Lisp language, and its functionality mimics the Unix tool awk.

The key features are as follows:

  • Intelligent analysis of web content without programming.
  • WYSIWYG: conversion, filtering, and storage through visual drag and drop.
  • Parallel processing and high speed.
  • Supports multiple file formats and databases: XML, CSV, SQLite, MongoDB...
  • Tasks can be saved, paused, restarted, and reused.
  • Focused on crawling, but its power goes far beyond that.


[Animation: fast and smart webpage crawling]

[Animation: WYSIWYG ETL]


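Welcome to Hawk! Hawk is a WYSIWYG graphical tool for data collection and cleaning that requires no programming, open source under the GPL license.

Introduction

The name Hawk refers to the bird of prey, which hunts efficiently and accurately. Its ideas come from the Lisp language, and its functionality mimics the Linux tool awk.

Its features are as follows:

  • Intelligent analysis of web content, with no programming required.
  • WYSIWYG visual drag and drop for quickly performing data-cleaning operations such as conversion and filtering.
  • Import from and export to a wide range of databases and files.
  • Tasks can be saved and reused.
  • Best suited to crawling and data cleaning, but its power goes far beyond that.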

Main metrics

Overview
Name with owner: ferventdesert/Hawk
Primary language: C#
Programming languages: C# (Language Count: 1)
Platform
License: Apache License 2.0
Owner activity
Created at: 2016-04-02 07:54:41
Pushed at: 2019-12-21 10:26:40
Last commit at: 2019-12-21 16:25:29
Release count: 7
Last release name: v5.2 (Posted on )
First release name: 2.0 (Posted on )
User engagement
Stargazers count: 3.2k
Watchers count: 284
Fork count: 1k
Commits count: 286
Has Issues Enabled
Issues count: 123
Open issues count: 66
Pull requests count: 4
Open pull requests count: 1
Closed pull requests count: 1
Project settings
Has Wiki Enabled
Is Archived
Is Fork
Is Locked
Is Mirror
Is Private