
Databricks targets data pipeline automation with Delta Live Tables

source link: https://www.infoworld.com/article/3656909/databricks-targets-data-pipeline-automation-with-delta-live-tables.html


By Anirban Ghoshal, Senior Writer, InfoWorld | Apr 7, 2022 5:29 am PDT

Databricks has unveiled a new extract, transform, load (ETL) framework, dubbed Delta Live Tables, which is now generally available across the Microsoft Azure, AWS and Google Cloud platforms.

According to the data lake and warehouse provider, Delta Live Tables uses a simple declarative approach to build reliable data pipelines and automatically manage the related infrastructure at scale, reducing the time data engineers and data scientists spend on complex operational tasks.

“Table structures are common in databases and data management. Delta Live Tables are an upgrade for the multicloud Databricks platform that support the authoring, management and scheduling of pipelines in a more automated and less code-intensive way,” said Doug Henschen, principal analyst at Constellation Research.

By making authoring low-code and declarative through SQL-like statements, Databricks is looking to lower the barriers to entry for complex data work such as keeping ETL pipelines healthy.
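To make the declarative idea concrete, here is a toy sketch in plain Python. It is not the Delta Live Tables API: the `table` decorator, the `run_pipeline` helper and the sample rows are all hypothetical, invented for illustration. The point is only that the author declares what each table is and what it depends on, while a small runner works out the execution order.

```python
# Toy illustration only: this is NOT the Delta Live Tables API.
# `table`, `run_pipeline` and the sample data are hypothetical.
from graphlib import TopologicalSorter  # standard library, Python 3.9+

_tables = {}  # table name -> (defining function, upstream table names)


def table(*, depends_on=()):
    """Register a function as the definition of a table."""
    def register(fn):
        _tables[fn.__name__] = (fn, tuple(depends_on))
        return fn
    return register


@table()
def raw_orders():
    return [
        {"id": 1, "amount": 40.0},
        {"id": 2, "amount": -5.0},  # invalid row, filtered out downstream
        {"id": 3, "amount": 60.0},
    ]


@table(depends_on=["raw_orders"])
def clean_orders(raw_orders):
    # Declarative data quality rule: keep only positive amounts.
    return [row for row in raw_orders if row["amount"] > 0]


@table(depends_on=["clean_orders"])
def order_total(clean_orders):
    return [{"total": sum(row["amount"] for row in clean_orders)}]


def run_pipeline():
    """Build every registered table, upstream tables first."""
    graph = {name: set(deps) for name, (_, deps) in _tables.items()}
    results = {}
    for name in TopologicalSorter(graph).static_order():
        fn, deps = _tables[name]
        results[name] = fn(*(results[d] for d in deps))
    return results


results = run_pipeline()
print(results["order_total"])
```

The author never writes the orchestration order by hand; it is derived from the declared dependencies, which is the kind of work a declarative framework is meant to take over.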


“The bigger the company, the more likely it is to be struggling with all the code writing and technical challenges of building, maintaining and running myriad data pipelines,” Henschen said. “Delta Live Tables is aimed at easing and automating much of the coding, administrative and optimization work required to keep data pipelines flowing smoothly.”

Early days for the data lakehouse

However, Henschen warned that it is still early days for combined lake and warehouse platforms in enterprise environments. “We’re seeing more greenfield deployments and experiments for new use cases rather than straight up replacements of existing data lakes and data warehouses,” he said, adding that Delta Live Tables (DLT) faces competition from the open source Apache Iceberg project.

“Within the data management and, specifically, the analytical data pipeline arena, another emerging option that’s getting a lot of attention these days is Apache Iceberg. Tabular, a company created by Iceberg’s founders, is working on delivering the same benefits of low-code development and automation,” Henschen said.

Iceberg got a major endorsement this week, with Google Cloud embracing this open source table format as part of the preview of its new combined data lake and warehouse product, called BigLake.

Databricks claims that DLT is being used by 400 companies globally already, including ADP, Shell, H&R Block, Bread Finance, Jumbo and JLL.

