StarRocks analytical DB heads to Linux Foundation

 1 year ago
source link: https://venturebeat.com/data-infrastructure/starrocks-analytical-database-heads-to-the-linux-foundation-to-expand-open-source-project-contributions/

The StarRocks online analytical processing (OLAP) database is getting a new home today, at the Linux Foundation.

StarRocks was created and developed by a commercial entity that was also named StarRocks until it changed its name to CelerData in August 2022. The project got its start in 2020, originally as a fork of the open-source Apache Doris analytics database.

Over the last two years, StarRocks has diverged significantly from Doris and taken a different path, with 80% of the code being entirely new. In particular, StarRocks has developed into an MPP (massively parallel processing) OLAP database enabling rapid real-time query support for analytics workloads. The company and the technology have also increasingly focused on supporting data analytics for data lakes.

To date, the StarRocks database has been managed as an open-source project, governed and maintained by StarRocks Inc. (now CelerData), which has also built a commercial cloud service announced in July 2022. A challenge facing any open-source project is contribution: ensuring that organizations and developers are able to contribute code. That’s why CelerData has decided it’s time for a new home at the Linux Foundation.

“StarRocks was originally the open-source project and the company name, and we found that our contributors from other companies had some concerns,” Li Kang, VP of strategy at CelerData, told VentureBeat. “We are committed to building an open-source project as well as a community around that project.”

Data lakes as competitive space

The market for open source–based query engines for data analytics is an increasingly competitive space.

StarRocks competes against multiple open-source efforts that Kang said are often present in competitive evaluations. Among them is the Apache Druid project, which is also an open-source, real-time analytics database. Druid benefits from the commercial backing of database startup Imply, which raised $100 million in May 2022 to advance the technology.

There is also the Apache Pinot analytics database project, which is backed by commercial vendor StarTree, a company that raised $47 million in August 2022.

Kang said StarRocks aims to be differentiated from its rivals through its optimized data pipeline architecture and query acceleration approach. The move to the Linux Foundation, as opposed to having the project at the Apache Software Foundation (ASF), will also help to differentiate StarRocks.

Unlike the ASF, the Linux Foundation is not particularly well known for its open-source database efforts. The ASF is, of course, home to the Hadoop big data ecosystem of projects, as well as a long list of foundational data technologies including Kafka, Spark and Parquet.

The Linux Foundation has a division known as the LF AI and Data collaborative project that hosts database projects, but that’s not where StarRocks is likely headed. Rather, Kang said that the intention is for the platform to eventually become part of the Cloud Native Computing Foundation (CNCF), which is also home to the Kubernetes container orchestration project.

For StarRocks, the goal is to be recognized as a cloud-native database platform. Currently, StarRocks can be deployed using containers in a cloud-native architectural approach.

It’s important to note that as of today, StarRocks is not part of the CNCF. Rather it is being contributed as a standalone project that could be considered for inclusion into the CNCF at a future point. StarRocks won’t be the only standalone data project at the Linux Foundation, either. Databricks’ Delta Lake open-source data lakehouse technology is also currently hosted as a standalone project at the Linux Foundation.

Why StarRocks is going to the Linux Foundation

Simply saying a technology like a database is open source is not enough to actually build an open-source community. But the move to the Linux Foundation does improve the project's chances of success, as described by a Foundation official.

“The Linux Foundation will support StarRocks by implementing best practices in the establishment of clear and transparent governance processes,” Hilary Carter, SVP of research at the Linux Foundation, told VentureBeat.

Carter added that the Linux Foundation will also be able to assist with community building for StarRocks. That includes opening up decision-making to a wide range of stakeholders, offering enhanced cloud-based collaboration tools for meeting and community management, and leaning on the expertise of other project leaders to provide guidance as needed.

“While each open-source project is unique and has its own set of challenges and requirements, we draw on our experience with other open-source projects to ensure that newly contributed projects like StarRocks have every opportunity to succeed,” Carter said.

In terms of getting StarRocks in the CNCF, Carter said that ultimately, the decision to accept the project is made by the CNCF Technical Oversight Committee (TOC), per their governance model. That said, she noted that the Linux Foundation can help StarRocks prepare submissions and provide guidance on the requirements and processes of the CNCF.

The future of StarRocks is …

As an open-source effort, hosted at the Linux Foundation, the StarRocks database will continue to be developed and expanded.

Kang said the CelerData cloud service for StarRocks will continue to rely on the open-source code as its base. He added that in recent months there has been an increasing effort to further optimize StarRocks for cloud-native deployments and that effort will continue. He also hinted at new development efforts to enhance the separation of compute and data lake storage options for even faster queries that will come in future updates to StarRocks.

“Being part of the Linux Foundation opens doors to more contributors,” Kang said. “We already have committers from other companies right now, but we expect that we will see more committers and contributors.”
