Kraken

P2P Docker registry capable of distributing TBs of data in seconds.



Kraken is a P2P-powered Docker registry that focuses on scalability and availability. It is designed for Docker image management, replication, and distribution in a hybrid cloud environment. With pluggable backend support, Kraken can easily integrate into existing Docker registry setups as the distribution layer.

Kraken has been in production at Uber since early 2018. In our busiest cluster, Kraken distributes more than 1 million blobs per day, including 100k blobs of 1G+. At peak production load, Kraken distributes 20K 100MB-1G blobs in under 30 seconds.

Below is a visualization of a small Kraken cluster at work:

Features

Following are some highlights of Kraken:

  • Highly scalable. Kraken is capable of distributing Docker images at more than 50% of the max
    download speed limit on every host. Cluster size and image size do not have a significant impact
    on download speed.
    • Supports at least 15k hosts per cluster.
    • Supports arbitrarily large blobs/layers. We normally limit max size to 20G for best performance.
  • Highly available. No component is a single point of failure.
  • Secure. Supports uploader authentication and data integrity protection through TLS.
  • Pluggable storage options. Instead of managing data, Kraken plugs into reliable blob storage
    options, like S3, GCS, ECR, HDFS, or another registry. The storage interface is simple, and new
    options are easy to add.
  • Lossless cross-cluster replication. Kraken supports rule-based async replication between clusters.
  • Minimal dependencies. Other than pluggable storage, Kraken only has an optional dependency on DNS.

Design

The high-level idea of Kraken is to have a small number of dedicated hosts seed content to a network of agents running on every host in the cluster.

A central component, the tracker, orchestrates all participants in the network to form a pseudo-random regular graph.

Such a graph has high connectivity and a small diameter. As a result, even with only one seeder and thousands of peers joining in the same second, all participants can in theory reach a minimum of 80% of max upload/download speed (60% with the current implementation), and performance doesn't degrade much as blob size and cluster size increase. For more details, see the team's tech talk at KubeCon + CloudNativeCon.
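To make the intuition concrete, here is a toy sketch (not Kraken's actual tracker logic; the attachment rule and parameters are illustrative) of why connecting each joining peer to a few randomly chosen existing peers yields a connected graph whose diameter stays small even for thousands of peers:

```python
import random
from collections import deque

def build_peer_graph(num_peers, degree, seed=42):
    """Connect each joining peer to up to `degree` random existing peers."""
    rng = random.Random(seed)
    adj = {0: set()}  # peer 0 is the seeder
    for peer in range(1, num_peers):
        neighbors = rng.sample(sorted(adj), min(degree, len(adj)))
        adj[peer] = set(neighbors)
        for n in neighbors:
            adj[n].add(peer)
    return adj

def eccentricity(adj, start=0):
    """BFS distance from `start` to the farthest peer; the graph's
    diameter is at most twice this value."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    assert len(dist) == len(adj), "graph is connected"
    return max(dist.values())

adj = build_peer_graph(num_peers=2000, degree=4)
print("hops from seeder to farthest peer:", eccentricity(adj))
```

With 2000 peers and only 4 connections initiated per joining peer, every peer stays within a handful of hops of the seeder, which is the property the tracker's pseudo-random regular graph relies on.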

Architecture

  • Agent
    • Deployed on every host
    • Implements the Docker registry interface
    • Announces available content to the tracker
    • Connects to peers returned by the tracker to download content
  • Origin
    • Dedicated seeders
    • Stores blobs as files on disk, backed by pluggable storage (e.g. S3, GCS, ECR)
    • Forms a self-healing hash ring to distribute load
  • Tracker
    • Tracks which peers have what content (both in-progress and completed)
    • Provides ordered lists of peers to connect to for any given blob
  • Proxy
    • Implements the Docker registry interface
    • Uploads each image layer to the responsible origin (remember, origins form a hash ring)
    • Uploads tags to the build-index
  • Build-Index
    • Maps human-readable tags to blob digests
    • No consistency guarantees: clients should use unique tags
    • Powers image replication between clusters (simple duplicated queues with retry)
    • Stores tags as files on disk, backed by pluggable storage (e.g. S3, GCS, ECR)
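The origin hash ring mentioned above can be illustrated with a minimal consistent-hashing sketch (illustrative only, not Kraken's actual implementation; the hash function, virtual-replica count, and node names are assumptions):

```python
import bisect
import hashlib

class HashRing:
    """Minimal consistent-hash ring: a blob is owned by the first origin
    point at or after the blob's hash, wrapping around the ring."""

    def __init__(self, nodes, replicas=100):
        self.replicas = replicas
        self.ring = []  # sorted list of (hash, node)
        for node in nodes:
            self.add(node)

    def _hash(self, key):
        return int(hashlib.sha256(key.encode()).hexdigest(), 16)

    def add(self, node):
        # Virtual replicas spread each origin's load around the ring.
        for i in range(self.replicas):
            bisect.insort(self.ring, (self._hash(f"{node}:{i}"), node))

    def remove(self, node):
        # "Self-healing" flavor: dropping an origin reassigns only the
        # arcs it owned to the next origins clockwise.
        self.ring = [(h, n) for (h, n) in self.ring if n != node]

    def owner(self, blob_digest):
        h = self._hash(blob_digest)
        idx = bisect.bisect(self.ring, (h, "")) % len(self.ring)
        return self.ring[idx][1]

ring = HashRing(["origin1", "origin2", "origin3"])
print(ring.owner("sha256:abc123"))  # same digest always maps to the same origin
```

This is why the proxy can deterministically pick the "responsible origin" for each layer: ownership follows only from the blob digest and the set of live origins.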

Benchmark

The following data is from a test where a 3G Docker image with 2 layers was downloaded by 2600 hosts concurrently (5200 blob downloads), with a 300MB/s speed limit on all agents (using 5 trackers and 5 origins):

  • p50 = 10s (at speed limit)
  • p99 = 18s
  • p99.9 = 22s


Usage

All Kraken components can be deployed as Docker containers. To build the Docker images:

$ make images

For information about how to configure and use Kraken, please refer to the documentation.

Kraken on Kubernetes

You can use our example Helm chart to deploy Kraken (with an example HTTP fileserver backend) on your k8s cluster:

$ helm install --name=kraken-demo ./helm

Once deployed, every node will have a Docker registry API exposed on localhost:30081. For an example pod spec that pulls images from the Kraken agent, see the example.
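As a rough illustration (the repository name, image name, and tag here are hypothetical; the real example lives in the linked file), a pod spec that pulls through the node-local Kraken agent might look like:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: kraken-demo-pod
spec:
  containers:
    - name: demo
      # Prefixing the image with localhost:30081 routes the pull through
      # the Kraken agent's registry API exposed on every node.
      image: localhost:30081/myrepo/myimage:mytag
```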

For more information on the k8s setup, see the README.

Devcluster

To start a herd container (which contains the origin, tracker, build-index, and proxy) and two agent containers with development configuration:

$ make devcluster

Docker-for-Mac is required for running the devcluster on your laptop. For more information on the devcluster, please check out the devcluster README.

Comparison With Other Projects

Dragonfly from Alibaba

A Dragonfly cluster has one or a few "supernodes" that coordinate the transfer of every 4MB chunk of data in the cluster.

While the supernode would be able to make optimal decisions, the throughput of the whole cluster is limited by the processing power of one or a few hosts, and performance degrades linearly as either blob size or cluster size increases.

Kraken's tracker only helps orchestrate the connection graph and leaves negotiation of actual data transfer to individual peers, so Kraken scales better with large blobs. On top of that, Kraken is highly available and supports cross-cluster replication, both of which are required for a reliable hybrid cloud setup.

BitTorrent

Kraken was initially built with a BitTorrent driver, but we ended up implementing our own P2P driver based on the BitTorrent protocol, to allow tighter integration with storage solutions and more control over performance optimizations.

Kraken's problem space is slightly different from what BitTorrent was designed for. Kraken's goal is to reduce global max download time and communication overhead in a stable environment, while BitTorrent was designed for an unpredictable and adversarial environment, so it needs to preserve more copies of scarce data and defend against malicious or misbehaving peers.

Despite the differences, we re-examine Kraken's protocol from time to time, and if it's feasible, we hope to make it compatible with BitTorrent again.

Limitations

  • If Docker registry throughput is not the bottleneck in your deployment workflow, switching to
    Kraken will not magically speed up your docker pull. To actually speed up docker pull, consider
    switching to Makisu to improve layer reusability at build time, or tweak compression ratios, as
    docker pull spends most of its time on data decompression.
  • Mutating tags (e.g. updating a latest tag) is allowed; however, a few things will not work: tag
    lookups immediately afterwards will still return the old value due to Nginx caching, and
    replication probably won't trigger. We are working on supporting this functionality better. If
    you need tag mutation support right now, please reduce the cache interval of the build-index
    component. If you also need replication in a multi-cluster setup, please consider setting up
    another Docker registry as Kraken's backend.
  • Theoretically, Kraken should distribute blobs of any size without significant performance
    degradation, but at Uber we enforce a 20G limit and cannot endorse production use of ultra-large
    blobs (i.e. 100G+). Peers enforce connection limits on a per-blob basis, and new peers might be
    starved for connections if no peers become seeders relatively soon. If you have ultra-large
    blobs you'd like to distribute, we recommend breaking them into <10G chunks first.
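Splitting an ultra-large blob into fixed-size chunks, as recommended above, can be sketched like this (a minimal illustration; the chunk size, file naming, and helper name are assumptions, and Kraken does not do this for you):

```python
import os

def split_blob(path, chunk_size=10 * 1024**3):
    """Split the file at `path` into `<path>.partN` files of at most
    `chunk_size` bytes each, returning the list of part paths."""
    parts = []
    with open(path, "rb") as src:
        index = 0
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break
            part_path = f"{path}.part{index}"
            with open(part_path, "wb") as dst:
                dst.write(chunk)
            parts.append(part_path)
            index += 1
    return parts
```

Each part can then be distributed as its own blob and reassembled on the consumer side (e.g. by concatenating the parts in order).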

Contributing

Please check out our guide.

Contact

To contact us, please join our Slack channel.


(The first version translated by vz on 2020.08.01)

Main metrics

Overview
  • Owner: uber/kraken
  • Primary language: Go
  • Build tooling: Makefile (9 languages in repo)
  • Platform: Kubernetes, Linux, Mac
  • License: Apache License 2.0

Owner activity
  • Created at: 2018-12-06 06:04:35
  • Pushed at: 2025-04-22 10:46:52
  • Last commit at: 2025-04-14 13:32:53
  • Releases: 5 (first: v0.1.0, latest: v0.1.4)

User engagement
  • Stargazers: 6.3k
  • Watchers: 83
  • Forks: 431
  • Commits: 0.9k
  • Issues: 109 (62 open)
  • Pull requests: 227 (25 open, 38 closed)

Project settings
  • Issues: enabled
  • Wiki: enabled
