Apache Pegasus

Apache Pegasus - A horizontally scalable, strongly consistent and high-performance key-value store

Note: The master branch may be unstable or even broken during development.
Please use GitHub Releases instead of the master branch to get stable binaries.

Apache Pegasus is a distributed key-value storage system which is designed to be:

  • horizontally scalable: distributed using hash-based partitioning
  • strongly consistent: ensured by the PacificA consensus protocol
  • high-performance: using RocksDB as the underlying storage engine
  • simple: well-defined, easy-to-use APIs
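
As a rough illustration of hash-based partitioning, the sketch below maps a key to one of a fixed number of partitions. The function name `partition_index`, the sample keys, and the use of CRC32 are assumptions made for this example; the hash function Pegasus actually uses is not stated here.

```python
import zlib


def partition_index(hash_key: bytes, partition_count: int) -> int:
    """Map a key to a partition by hashing it and taking the remainder."""
    if partition_count <= 0:
        raise ValueError("partition_count must be positive")
    # CRC32 is an illustrative stand-in, not necessarily the hash Pegasus uses.
    return zlib.crc32(hash_key) % partition_count


# Hypothetical keys spread over a hypothetical table with 8 partitions.
keys = [b"user:1", b"user:2", b"user:3", b"user:4"]
placement = {key: partition_index(key, 8) for key in keys}
```

The same key always lands on the same partition, so a client can locate the data without consulting every server.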

Background

Pegasus aims to fill the gap between Redis and HBase. The former is
in-memory and low-latency, but does not provide a strong-consistency guarantee.
Unlike the latter, Pegasus is written entirely in C++, and its write path
relies only on the local filesystem.

Apart from the performance requirements, we also need a storage system
to ensure multiple-level data safety and support fast data migration
between data centers, automatic load balancing, and online partition split.

Features

  • Persistence of data: Each write is replicated three-way to different ReplicaServers before responding to the client. Using the PacificA protocol, Pegasus supports strongly consistent replication and membership changes.

  • Automatic load balancing over ReplicaServers: Load balancing is a built-in function of MetaServer, which manages the distribution of replicas. When the cluster is imbalanced, the administrator can invoke a simple rebalance command that automatically schedules the replica migration.

  • Cold Backup: Pegasus supports an extensible backup and restore mechanism to ensure data safety. Snapshots can be stored on a distributed filesystem like HDFS or on the local filesystem. The stored snapshots can be further used for analysis based on pegasus-spark.

  • Eventually-consistent inter-datacenter replication: This is a feature we call duplication. It makes a change made in the local cluster accessible to the remote cluster after a short time period. It helps achieve higher availability of your service and better performance by accessing only the local cluster.
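
The first feature above, acknowledging a write only after all three replicas hold it, can be sketched as a simplified prepare-then-commit loop. This is a toy model under stated assumptions, not the PacificA implementation: the `Replica` class and `replicated_put` function are hypothetical names, and the sketch ignores logging, ordering, and the membership changes that would handle a failed replica.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """Hypothetical in-memory stand-in for one ReplicaServer."""
    name: str
    alive: bool = True
    prepared: dict = field(default_factory=dict)
    committed: dict = field(default_factory=dict)

    def prepare(self, key, value) -> bool:
        # A dead replica cannot accept the write.
        if not self.alive:
            return False
        self.prepared[key] = value
        return True

    def commit(self, key) -> None:
        # Move the prepared value into the committed state.
        self.committed[key] = self.prepared.pop(key)


def replicated_put(replicas, key, value) -> bool:
    """Acknowledge the write only if every replica prepared it."""
    acks = [replica.prepare(key, value) for replica in replicas]
    if not all(acks):
        # Discard the partial write; the client gets no acknowledgment.
        for replica in replicas:
            replica.prepared.pop(key, None)
        return False
    for replica in replicas:
        replica.commit(key)
    return True


group = [Replica("primary"), Replica("secondary-1"), Replica("secondary-2")]
acknowledged = replicated_put(group, "k1", "v1")
```

In this sketch a single unavailable replica blocks the write; a real system would reconfigure the replica group instead of failing indefinitely.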

To start using Pegasus

See our documentation on the Pegasus Website.

Client drivers

Pegasus has support for several languages:

Contact us

  • Send emails to the Apache Pegasus developer mailing list: dev@pegasus.apache.org. This is the place where topics around development, community, and problems are officially discussed. Please remember to subscribe to the mailing list via dev-subscribe@pegasus.apache.org.

  • GitHub Issues: submit an issue if you have an idea for improving Pegasus or if you encounter a bug or problem.

Test tools:

Data import/export tools:

License

Copyright 2022 The Apache Software Foundation. Licensed under the Apache License, Version 2.0:
http://www.apache.org/licenses/LICENSE-2.0

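Key metrics

Overview
Name and owner: apache/incubator-pegasus
Primary language: C++
Languages: Shell (number of languages: 13)
Platform:
License: Apache License 2.0
Owner activity
Created: 2015-09-01 02:29:37
Last pushed: 2025-05-06 04:19:04
Last commit: 2025-04-28 20:52:49
Releases: 57
Latest release: v2.5.0 (published 2023-12-11 12:27:34)
First release: v1.7.0 (published 2018-03-05 11:01:18)
User engagement
Stars: 2k
Watchers: 95
Forks: 311
Commits: 5.9k
Issues enabled?
Issues: 780
Open issues: 198
Pull requests: 1289
Open pull requests: 45
Closed pull requests: 123
Project settings
Wiki enabled?
Archived?
Is a fork?
Locked?
Is a mirror?
Is private?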