StarRocks

StarRocks, a Linux Foundation project, is a next-generation sub-second MPP OLAP database for full analytics scenarios, including multi-dimensional analytics, real-time analytics, and ad-hoc queries. It received InfoWorld’s 2023 BOSSIE Award for best open source software.


StarRocks

StarRocks is a new-generation, high-speed MPP database for nearly all data analytics scenarios. Our goal is to make data analytics easy and fast: users can run high-speed analytics directly in a wide range of scenarios without complicated data preprocessing. Query speed, especially for multi-table JOIN queries, far exceeds that of similar products thanks to a streamlined architecture, a fully vectorized engine, a newly designed Cost-Based Optimizer (CBO), and modern materialized views. StarRocks also supports efficient real-time data analytics.

Moreover, StarRocks supports flexible and diverse data modeling, such as flat tables, star schemas, and snowflake schemas. Because it is compatible with the MySQL protocol and standard SQL syntax, StarRocks works smoothly across the MySQL ecosystem, for example with MySQL clients and common BI tools. It is an integrated data analytics platform that offers high availability and simple maintenance, and it doesn’t rely on any external components.
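To make the star-schema modeling mentioned above concrete, here is a minimal sketch of one fact table joined to one dimension table with standard SQL. It runs against Python's built-in sqlite3 module purely as a stand-in so that the example is self-contained; it does not connect to StarRocks. The table and column names (fact_orders, dim_customer, region, amount) are hypothetical. Against a real cluster, a similar SELECT would presumably be sent through any MySQL-compatible client.

```python
import sqlite3

# In-memory SQLite database, used only as a stand-in for illustration.
# All table and column names below are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE dim_customer (customer_id INTEGER PRIMARY KEY, region TEXT);
    CREATE TABLE fact_orders (
        order_id INTEGER PRIMARY KEY,
        customer_id INTEGER,
        amount REAL
    );
    INSERT INTO dim_customer VALUES (1, 'EU'), (2, 'US');
    INSERT INTO fact_orders VALUES (10, 1, 20.0), (11, 1, 5.0), (12, 2, 7.5);
    """
)

# Star-schema query: the fact table joins to its dimension on the key,
# then the measure is aggregated by a dimension attribute.
STAR_JOIN = """
    SELECT d.region, SUM(f.amount) AS revenue
    FROM fact_orders AS f
    JOIN dim_customer AS d ON d.customer_id = f.customer_id
    GROUP BY d.region
    ORDER BY d.region
"""

rows = conn.execute(STAR_JOIN).fetchall()
print(rows)
```

A flat-table model would instead store region directly on each order row, trading the join for wider rows.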

We recommend you read the Introduction to StarRocks first.

Architecture

StarRocks’s streamlined architecture is mainly composed of two modules, Frontend (FE for short) and Backend (BE for short). It doesn’t depend on any external components, which makes it easy to deploy and maintain. The system also eliminates single points of failure through seamless horizontal scaling of FE and BE, as well as replication of metadata and data.

Architecture of StarRocks

Technology

  • Native vectorized SQL engine: StarRocks adopts vectorization technology to make full use of the parallel computing power of the CPU, achieving sub-second query returns in multi-dimensional analyses, which is 5 to 10 times faster than previous systems.
  • Simple architecture: StarRocks does not rely on any external systems. The simple architecture makes it easy to deploy, maintain, and scale out. StarRocks also provides high availability, reliability, scalability, and fault tolerance.
  • Standard SQL: StarRocks supports ANSI SQL syntax (TPC-H and TPC-DS are fully supported). It is also compatible with the MySQL protocol, so various clients and BI software can be used to access StarRocks.
  • Smart query optimization: StarRocks can optimize complex queries through its CBO (Cost-Based Optimizer). A better execution plan greatly improves data analysis efficiency.
  • Real-time update: The update model of StarRocks can perform upsert/delete operations by primary key and keeps queries efficient while updates run concurrently.
  • Intelligent materialized view: Materialized views in StarRocks can be updated automatically during data import and selected automatically when a query is executed.
  • Convenient query federation: StarRocks allows direct access to data from Hive, MySQL, and Elasticsearch without importing it.
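The materialized-view behavior described in the list above can be pictured with a toy model. The sketch below is not StarRocks code and says nothing about its internals. It only illustrates the idea of a pre-aggregated result that is refreshed as part of each load and then used to answer a matching query instead of scanning the base rows. The class and method names (TableWithRollup, load, query_sum, scan_sum) are hypothetical.

```python
from collections import defaultdict


class TableWithRollup:
    """Toy table that keeps a pre-aggregated view in step with every load."""

    def __init__(self):
        self.rows = []                       # base data: (city, amount) tuples
        self.sum_by_city = defaultdict(int)  # the pre-aggregated "view"

    def load(self, batch):
        # The view is updated in the same step as the import,
        # so it never lags behind the base rows.
        for city, amount in batch:
            self.rows.append((city, amount))
            self.sum_by_city[city] += amount

    def scan_sum(self, city):
        # Baseline: answer the query by scanning every base row.
        return sum(amount for c, amount in self.rows if c == city)

    def query_sum(self, city):
        # "Automatic selection": this query shape matches the view,
        # so it is answered with a single lookup instead of a scan.
        return self.sum_by_city.get(city, 0)


table = TableWithRollup()
table.load([("Berlin", 3), ("Paris", 4)])
table.load([("Berlin", 5)])
print(table.query_sum("Berlin"), table.scan_sum("Berlin"))
```

Both paths return the same total. The point of the view is that the lookup cost does not grow with the number of base rows.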

Use cases

StarRocks delivers strong performance in various data analytics scenarios, including multi-dimensional screening and analysis, real-time data analytics, and ad hoc analysis. StarRocks also supports thousands of concurrent users. As a result, companies widely use StarRocks for business intelligence, real-time data warehousing, user profiling, dashboards, order analysis, operations and monitoring analysis, anti-fraud, and risk control. At present, over 100 medium-sized and large enterprises in various industries have used StarRocks in their online production environments, including Airbnb, JD.com, Tencent, Trip.com, and other well-known companies. Thousands of StarRocks servers are running stably in production.

Fork

StarRocks was forked from Apache Doris (incubating) 0.13 in early 2020. Since then, we have rebuilt many important parts of the database, including a fully vectorized execution engine, a brand-new cost-based optimizer (CBO), a novel real-time update engine, and query federation for data lakes.

Today, only about 30% of the code in StarRocks is identical to Apache Doris (incubating).

Build

Because of the third-party dependencies, we recommend building StarRocks with the development Docker image we provide.

For detailed instructions, please refer to build.

Install

Download the current release here.
For detailed instructions, please refer to deploy.

Community

LICENSE

Code in this repository is provided under the Elastic License 2.0. Some portions are available under Apache License 2.0. Please see our FAQ.

Contributing to StarRocks

Thank you for your interest in StarRocks!
For your pull request to be accepted, please follow CONTRIBUTING.md.

Overview

Name and owner: StarRocks/starrocks
Primary language: Java
Languages: CMake (16 languages in total)
License: Apache License 2.0
Releases: 204
Latest release: 3.1.11
First release: 1.19.0
Created: 2021-09-04 02:29:35
Last pushed: 2024-05-01 05:02:34
Stars: 7.8k
Watchers: 206
Forks: 1.6k
Commits: 16.8k
Issues: 7052
Open issues: 609
Pull requests: 31007
Open pull requests: 709
Closed pull requests: 6130