Freeze-dry: web page conservation

Freeze-dry stores a web page as it is shown in the browser. It takes the DOM and returns it as an
HTML string, after inlining external resources such as images and stylesheets (as data:
URLs).

It also ensures the snapshot is static and completely offline: all scripts are removed, and any
attempt at internet connectivity is blocked by adding a content security policy. The resulting HTML
document is a static, self-contained snapshot of the page.
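The two ideas above, inlining resources as data: URLs and blocking connectivity with a content security policy, can be sketched as follows. This is a minimal illustration, not freeze-dry's actual implementation: the helper names `toDataUrl` and `cspMetaTag` and the exact policy string are assumptions made for the example.

```javascript
// Illustrative sketch only; freeze-dry's real code lives in src/ and may differ.

// Encode raw bytes as a base64 data: URL, so the resource travels inside the HTML.
function toDataUrl(bytes, mimeType) {
  let binary = ''
  for (const byte of bytes) {
    binary += String.fromCharCode(byte)
  }
  // btoa is available in browsers and in Node.js 16+.
  return `data:${mimeType};base64,${btoa(binary)}`
}

// Build a <meta> tag carrying a content security policy that blocks network
// access while still allowing inlined data: resources.
// The policy below is a hypothetical example, not necessarily the one freeze-dry emits.
function cspMetaTag() {
  const policy = [
    "default-src 'none'",
    'img-src data:',
    'media-src data:',
    "style-src data: 'unsafe-inline'",
    'font-src data:',
  ].join('; ')
  return `<meta http-equiv="Content-Security-Policy" content="${policy}">`
}

const bytes = new Uint8Array([104, 105]) // the two bytes of the text "hi"
console.log(toDataUrl(bytes, 'text/plain')) // data:text/plain;base64,aGk=
console.log(cspMetaTag())
```

With `default-src 'none'` as the baseline, anything not explicitly allowed (scripts, remote images, remote fonts) is refused by the browser, which is what keeps such a snapshot offline.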

For more details on exactly how this works, see src/Readme.md.

Usage

const html = await freezeDry(document, options)

The options object is optional, and even document can be omitted, in which case it will default
to window.document. Possible options are:

timeout (number): Maximum time (in milliseconds) spent on fetching the page's subresources. The
resulting HTML will have only successfully fetched subresources inlined.
docUrl (string): Overrides the document's URL. This will influence the expansion of relative
URLs, and is useful for cases where the document was constructed dynamically (e.g. using
DOMParser).
addMetadata (boolean): If true (the default), a meta and a link tag will be added to the
returned HTML, noting the document's URL and the time of snapshotting (that is, the current time).

The metadata mimics the HTTP headers defined for the Memento protocol. The added tags look
like this:
```
<meta http-equiv="Memento-Datetime" content="Sat, 18 Aug 2018 18:02:20 GMT">
<link rel="original" href="https://example.com/main/page.html">
```
keepOriginalAttributes (boolean): If true (the default), preserves the original value of an
element attribute if its URLs are inlined, by noting it as a new data-original-... attribute.
For example, <img src="bg.png"> would become <img src="data:..." data-original-src="bg.png">.
Note this is an unstandardised workaround to keep URLs of subresources available; unfortunately
URLs inside stylesheets are still lost.
now (Date): Overrides the snapshot time (only relevant when addMetadata is true). Mainly
intended for testing purposes.
fetchResource: custom function for fetching resources; should be API-compatible with the global
fetch(), but may also return an object { blob, url } instead of a Response.
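The options above can be combined as in the following sketch. It is an illustration under stated assumptions rather than documented usage: `makeCachingFetcher` is a hypothetical helper, and the values are arbitrary. It shows a custom fetchResource that resolves to the { blob, url } shape, and a fixed `now` whose `toUTCString()` output has the same format as the Memento-Datetime example.

```javascript
// Illustrative sketch; makeCachingFetcher is a hypothetical helper, not part of freeze-dry.

// Wrap a fetch-like function so that each URL is downloaded at most once.
// The wrapper resolves to { blob, url }, the alternative return shape that
// the fetchResource option accepts instead of a Response.
function makeCachingFetcher(fetchImpl) {
  const cache = new Map()
  return async function fetchResource(url, init) {
    const key = String(url)
    if (!cache.has(key)) {
      // Store the promise itself, so concurrent requests share one download.
      const pending = fetchImpl(url, init).then(async response => ({
        blob: await response.blob(),
        url: response.url || key,
      }))
      // Forget failed downloads, so a later call can retry.
      pending.catch(() => cache.delete(key))
      cache.set(key, pending)
    }
    return cache.get(key)
  }
}

const options = {
  timeout: 10000, // stop waiting for subresources after ten seconds
  docUrl: 'https://example.com/main/page.html',
  addMetadata: true,
  keepOriginalAttributes: true,
  now: new Date(Date.UTC(2018, 7, 18, 18, 2, 20)), // months are zero-based: 7 is August
  fetchResource: makeCachingFetcher((url, init) => fetch(url, init)),
}

// Date.prototype.toUTCString() yields the date format shown in the Memento-Datetime example.
console.log(options.now.toUTCString()) // Sat, 18 Aug 2018 18:02:20 GMT
```

In a browser with freeze-dry loaded, this object would then be passed as the second argument: `await freezeDry(document, options)`.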

Note that the resulting string can easily be several megabytes when pages contain images, videos,
fonts, etcetera.
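Since the snapshot can grow large, it may be worth measuring it before storing or transmitting it. A minimal sketch, assuming the hypothetical helper names `byteSize` and `formatMiB`; the sample string merely stands in for the value returned by freezeDry.

```javascript
// Illustrative helpers (hypothetical names), not part of freeze-dry.

// Size in bytes of a string once encoded as UTF-8.
function byteSize(text) {
  return new TextEncoder().encode(text).length
}

// Human-readable size in mebibytes, with one decimal.
function formatMiB(bytes) {
  return `${(bytes / (1024 * 1024)).toFixed(1)} MiB`
}

const sample = '<p>h\u00e9llo</p>' // stand-in for a snapshot; \u00e9 is "é"
console.log(byteSize(sample)) // 13, because "é" takes two bytes in UTF-8
console.log(formatMiB(3 * 1024 * 1024)) // 3.0 MiB
```

Note that base64 encoding, as used in data: URLs, makes binary resources roughly a third larger than their original size.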

Name With Owner	WebMemex/freeze-dry
Primary Language	TypeScript
Program language	HTML (Language Count: 4)
License:	The Unlicense

Created At	2017-07-13 23:31:40
Pushed At	2022-09-18 15:22:13
Release Count	12
Last Release Name	v1.0.0
First Release Name	v0.1.0

Stargazers Count	294
Watchers Count	10
Fork Count	20
Commits Count	269
Issues Count	48
Issue Open Count	21
Pull Requests Count	8
Pull Requests Open Count	0
Pull Requests Close Count	5
