pywb

Core Python Web Archiving Toolkit for replay and recording of web archives

Webrecorder pywb 2.7
====================

.. image:: https://raw.githubusercontent.com/webrecorder/pywb/main/pywb/static/pywb-logo.png

.. image:: https://github.com/webrecorder/pywb/workflows/CI/badge.svg
      :target: https://github.com/webrecorder/pywb/actions
.. image:: https://codecov.io/gh/webrecorder/pywb/branch/main/graph/badge.svg
      :target: https://codecov.io/gh/webrecorder/pywb

Web Archiving Tools for All
---------------------------

`View the full pywb documentation <https://pywb.readthedocs.org>`_

pywb is a Python (2 and 3) web archiving toolkit for replaying web archives large and small as accurately as possible.
The toolkit now also includes new features for creating high-fidelity web archives.

This toolset forms the foundation of the Webrecorder project, and also serves as a generic web archiving toolkit
used by other web archives, including for traditional "Wayback Machine" functionality.

New Features
^^^^^^^^^^^^

The 2.x release was a major overhaul of pywb and introduced many new features, including the following:

  • Dynamic multi-collection configuration system with no-restart updates.

  • New recording capability to create new web archives from the live web or other archives.

  • Componentized architecture with standalone Warcserver, Recorder and Rewriter components.

  • Support for Memento API aggregation and fallback chains for querying multiple remote and local archival sources.

  • HTTP/S Proxy Mode with customizable certificate authority for proxy mode recording and replay.

  • Flexible rewriting system with pluggable rewriters for different content-types.

  • Standalone, modular `client-side rewriting system (wombat.js) <https://github.com/webrecorder/wombat>`_ to handle most modern web sites.

  • Improved 'calendar' query UI with incremental loading, grouping results by year and month, and updated replay banner.

  • Extensible UI customizations system for modifying all aspects of the UI.

  • Robust access control system for blocking or excluding URLs, by prefix or by exact match.

  • New in 2.6: Access control embargo and HTTP header-based access control settings.

  • New in 2.6: Support for localization and multi-language deployment.

  • New in 2.7: New banner/calendar UI written in `Vue <https://vuejs.org/>`_, with interactive timeline and easier theming of colors and logo via ``config.yaml``.

Please see the `full documentation <https://pywb.readthedocs.org>`_ for more detailed info on all these features.
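
To make the "by prefix or by exact match" distinction in the access control feature concrete, here is a minimal Python sketch. It is illustrative only: the helper ``is_blocked`` and its parameters are hypothetical, and it does not reflect pywb's actual rule format or matching code, which the full documentation describes.

```python
def is_blocked(url, exact_urls=frozenset(), prefixes=()):
    """Return True if `url` matches a blocked URL exactly or by prefix.

    Hypothetical helper for illustration only; not part of pywb.
    """
    # Exact match: the whole URL must be listed.
    if url in exact_urls:
        return True
    # Prefix match: any URL that starts with a listed prefix is blocked.
    return any(url.startswith(prefix) for prefix in prefixes)


if __name__ == "__main__":
    exact = {"http://example.com/private.html"}
    prefixes = ("http://example.com/admin/",)
    for candidate in (
        "http://example.com/private.html",
        "http://example.com/admin/users",
        "http://example.com/index.html",
    ):
        print(candidate, is_blocked(candidate, exact, prefixes))
```

An exact rule affects a single URL, while a prefix rule covers everything beneath a path, which is why the two are kept as separate inputs here.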

Installation for Deployment
---------------------------

To install pywb for usage, run:

``pip install pywb``

Note: depending on your Python installation, you may have to use ``pip3`` instead of ``pip``.
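
One way to confirm that the package landed in the Python environment you expect is to query its installed version from Python. The sketch below uses only the standard library; the helper name ``installed_version`` is an assumption made for this example and is not part of pywb.

```python
from importlib import metadata


def installed_version(package="pywb"):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version()
    print(f"pywb {version}" if version else "pywb is not installed")
```

If this reports that pywb is not installed even though the install succeeded, ``pip`` and the interpreter you ran probably belong to different environments.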

Installation from local copy
----------------------------

``git clone https://github.com/webrecorder/pywb``

To install from a locally cloned copy, run ``pip install -e .`` or ``python setup.py install``.

To run tests, we recommend installing the test tools with ``pip install tox tox-current-env`` and then running ``tox --current-env`` to test in your current Python environment.

To build the docs locally, run ``cd docs; make html``. The docs will be built in ``./_build/html/index.html``.

Running
-------

After installation, you can run ``pywb`` or ``wayback``.

Consult the local or `online docs <https://pywb.readthedocs.org>`_ for the latest usage and configuration details.
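
If neither command is found after installation, the scripts directory is probably not on your ``PATH``. The following is a small standard-library sketch for checking this; the helper name ``find_commands`` is an assumption made for this example, and only the command names ``pywb`` and ``wayback`` come from the text above.

```python
import shutil


def find_commands(names=("pywb", "wayback")):
    """Map each command name to its resolved path on PATH, or None if not found."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in find_commands().items():
        print(f"{name}: {path or 'not found on PATH'}")
```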

Documentation
-------------

The pywb documentation is extensive. Links to a few key guides:

  • `Getting Started Guide <https://pywb.readthedocs.io/en/latest/manual/usage.html#getting-started>`_

  • `Embargo and Access Control Guide <https://pywb.readthedocs.io/en/latest/manual/access-control.html>`_

  • `Localization and Multi-Language Guide <https://pywb.readthedocs.io/en/latest/manual/localization.html>`_

  • `Deployment Guide <https://pywb.readthedocs.io/en/latest/manual/usage.html#deployment>`_

  • `OpenWayback Transition Guide <https://pywb.readthedocs.io/en/latest/manual/owb-transition.html>`_

Contributions & Bug Reports
---------------------------

Users are encouraged to fork and contribute to this project to keep improving web archiving tools. Please consult the `contributing guide <CONTRIBUTING.md>`_ for information on how to contribute to pywb.

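Key Metrics
-----------

Overview
^^^^^^^^

:Name and owner: webrecorder/pywb
:Primary language: JavaScript
:Languages: Python (8 languages in total)
:License: GNU General Public License v3.0

Owner Activity
^^^^^^^^^^^^^^

:Created: 2013-12-09 03:30:31
:Last pushed: 2025-04-18 11:20:46
:Releases: 64
:Latest release: v-2.8.4
:First release: 0.2.2

User Engagement
^^^^^^^^^^^^^^^

:Stars: 1.5k
:Watchers: 60
:Forks: 228
:Commits: 2.3k
:Issues: 487
:Open issues: 156
:Pull requests: 378
:Open pull requests: 15
:Closed pull requests: 46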