source link: https://siliconangle.com/2022/08/09/supercloud-debate-open-source-standardization-way-forward-databricks-ceo-weighs-supercloud22/

Supercloud debate: Is open-source standardization the way forward? Databricks CEO weighs in

Photo: Ali Ghodsi, co-founder and CEO of Databricks Inc., at Supercloud 22

Supercloud is an emerging trend in enterprise computing that is predicted to bring major changes to how companies build out their cloud architecture.

Over the past six months, SiliconANGLE Media has been following the growing number of companies considering supercloud as a way to eliminate multicloud complexity and help their customers monetize data assets.

Building a supercloud isn’t a one-size-fits-all project. There are as many flavors of supercloud as there are choices for cloud. Some, like Snowflake Inc., are opting for the proprietary variety. Taking the opposite side of the debate is Databricks Inc., which advocates building on open-source standardization.

“Open source can pretty much do anything,” said Ali Ghodsi (pictured), co-founder and chief executive officer of Databricks Inc. “We think that open source is a force in software that’s going to continue for decades, hundreds of years, and it’s going to slowly replace all proprietary code in its way.”

Ghodsi spoke with theCUBE industry analyst John Furrier at Supercloud 22, during an exclusive broadcast on theCUBE, SiliconANGLE Media’s livestreaming studio. During “The Open Supercloud” session, they discussed the advantages and disadvantages of taking an open approach to supercloud.

Data lakehouse sets an open standard for simplifying the data stack

When it comes to building an abstraction layer that leverages hyperscaler power to deliver a consistent experience to users and developers, can open standards match de facto proprietary approaches on control, governance, performance and security? Databricks has bet its future on the conviction that they can.

The company’s data lakehouse platform provides an example of an open-source supercloud in action. Built on a cloud data lake of structured and unstructured data hosted by the hyperscalers and made reliable and performant by Delta Lake, the platform provides a common approach to data management, security and governance through its Unity Catalog layer.

“We’re big believers in this data lakehouse concept, which is an open standard to simplifying the data stack and help people to just get value out of their data in any environment,” Ghodsi said.
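To make the open building blocks concrete, here is a minimal local sketch of the Delta Lake piece of that stack, assuming PySpark and the open-source delta-spark package are installed; the table path and sample data are illustrative only and not tied to Databricks’ hosted platform.

```python
# Minimal sketch: writing and reading a Delta table with open-source Spark.
# Assumes `pip install pyspark delta-spark`; path and data are illustrative.
from delta import configure_spark_with_delta_pip
from pyspark.sql import SparkSession

builder = (
    SparkSession.builder.appName("lakehouse-sketch")
    # Enable Delta Lake's transaction log and catalog support in Spark.
    .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
    .config(
        "spark.sql.catalog.spark_catalog",
        "org.apache.spark.sql.delta.catalog.DeltaCatalog",
    )
)
spark = configure_spark_with_delta_pip(builder).getOrCreate()

# Write raw events to the data lake as a Delta table; the transaction log
# adds reliability (ACID commits, schema enforcement) on top of plain
# object storage.
events = spark.createDataFrame(
    [(1, "click"), (2, "purchase")], ["user_id", "action"]
)
events.write.format("delta").mode("overwrite").save("/tmp/lakehouse/events")

# Read it back like any other table.
spark.read.format("delta").load("/tmp/lakehouse/events").show()
```

The same code runs against a lake on any of the hyperscalers by swapping the local path for a cloud storage URI, which is the portability argument Ghodsi is making.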

Around 80% of Databricks’ customer base is on more than one cloud, and those customers are struggling with the complexity, according to Ghodsi. Reconfiguring data management models over and over to integrate with each cloud provider’s proprietary technologies is a time-consuming and difficult task, the consequence of multiclouds created “by default rather than by design,” as Dell Technologies Inc. Co-Chief Operating Officer Chuck Whitten has described them.

It’s the operations teams that bear the brunt of integrating new technology and making sure it works, according to Ghodsi. And doing it in multiple environments, each with a different proprietary stack, is a tough challenge.

“So, they just want standardization,” he said. “They want open-source technologies. They believe in the communities around it. They know that source code is open so you can see if there are issues with it, if there are security breaches, those kinds of things.”

Completing the journey to automated, AI-infused predictive analytics

Databricks didn’t set out to build a supercloud. The company’s mission is to help organizations advance along the data and artificial intelligence maturity model, bringing them to the point where they can leverage prescriptive, automated AI and machine learning in the same way that enabled the tech giants to reach where they are today, according to Ghodsi.

“Google wouldn’t be here today if it wasn’t for AI,” he said. “You know, we’d be using AltaVista or something.”

The continuum starts when a company goes digital and begins to collect data, Ghodsi pointed out. They want to clean it and get insights out of it. Then they move on to using the “crystal ball” of predictive technology. The end comes when a company can finally automate the process completely and act on the predictions.

“So this credit card that got swiped, the AI thinks it is fraud, we’re going to deny it,” he said. “That’s when you get real value.”

Databricks’ data lakehouse, which falls under theCUBE’s definition of supercloud, has its basis in the Delta Lake framework, which the company developed in 2019 as a way to help companies organize their messy data lakes. The same year, the project was donated to the Linux Foundation in order to encourage innovation. Then, at the start of Databricks’ Data + AI Summit this past June, Databricks removed any remaining differences between its branded Delta Lake and the open-source version by handing full control of the storage framework to the Linux Foundation.

“What we’re seeing with the data lakehouse is that slowly the open-source community is building a replacement for the proprietary data warehouse, Delta Lake, machine learning, real-time stack in open source, and we’re excited to be part of it,” Ghodsi said.

Potentially the most important protocol in the data lakehouse is Delta Sharing, according to Ghodsi. The open standard enables organizations to efficiently share large data sets without duplicating them. And because it is open source, any organization can use it to help build a supercloud in whatever design works best for it.

“You don’t need to be a Databricks customer. You don’t need to even like Databricks,” Ghodsi said. “You just need to use this open-source project and you can now securely share data sets between organizations across clouds.”
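On the consuming side, this is roughly what that looks like with the open-source Delta Sharing Python connector; a minimal sketch in which the profile file and the share, schema and table names are hypothetical placeholders for whatever the data provider issues.

```python
# Minimal sketch: reading a shared data set with the open-source
# Delta Sharing connector (`pip install delta-sharing`).
# The profile file and share/schema/table names below are hypothetical.
import delta_sharing

# A "profile" file issued by the data provider contains the sharing
# server endpoint and an access token.
profile = "config.share"

# Discover which tables the provider has shared with us.
client = delta_sharing.SharingClient(profile)
for table in client.list_all_tables():
    print(table)

# Load one shared table into pandas; the data is read directly from the
# provider's cloud storage, so the data set is not duplicated.
url = profile + "#retail_share.sales.transactions"
df = delta_sharing.load_as_pandas(url)
print(df.head())
```

Because the protocol is an open standard, the provider and the consumer can sit on different clouds and neither needs to run Databricks, which is the point Ghodsi is making.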

Open source has already become the software default, and in the next couple of years, it’s going to be a requirement that software works across the different cloud environments, according to Ghodsi.

“Is it based on open source? Is it using this data lakehouse pattern? And if it’s not, I think they’re going to demand it,” he said.

The complete video interview is part of SiliconANGLE’s and theCUBE’s coverage of the Supercloud 22 event.

Photo: SiliconANGLE


