CDAP

 3 years ago
source link: https://cdap.io/

The Data Analytics Platform

A 100% open source, integrated framework that accelerates application development for data analytics

Get started in the Cloud

Run CDAP on any major public cloud provider, including Amazon Web Services, Microsoft Azure and Google Cloud Platform.

Why CDAP

CDAP lets developers, business analysts and data scientists focus on insights, analytics and business value instead of wrestling with infrastructure and integration.

Reduced complexity

CDAP's easy-to-use abstractions over complex technologies shift the focus from infrastructure and integration to insights.

Increased velocity

With an extensible framework and reusable templates, CDAP accelerates time to value and breaks down silos, so you can build once, run anywhere.

Increased flexibility

CDAP is 100% open source, portable and extensible. It integrates with the latest Big Data and Cloud technologies.

Improved visibility

CDAP gives you greater visibility into your data by letting you search metadata and by providing insight into data lineage.

CDAP features

Rapid development

Developer SDK and APIs with abstractions over common data processing patterns; Sandbox mode, programmatic and UI-driven debugging; In-memory mode and testing framework to simplify testing; Support for cutting-edge Cloud, Apache Hadoop and Apache Spark technologies.

Enterprise ready

Metadata repository with automatic technical and operational metadata capture; Business metadata annotations; Data discovery through metadata-based search; Data governance with dataset- and field-level lineage and auditing; Integration with enterprise security systems.

Seamless operations

REST APIs and CLI for every interaction; Time- and process-based scheduling; Standardized logs and metrics for all execution environments.

Portable runtime environments

Build once, run anywhere through portability across runtime environments such as Apache Hadoop YARN and Docker.

Extensible and reusable

Templates and blueprints for common use cases; Hub for sharing pre-built plugins, applications and solutions; Extensible APIs for security, metadata, runtimes and storage.

Hybrid and multi-cloud

Interoperability across on-premises and Cloud environments; Support for all major public cloud providers such as Amazon Web Services, Microsoft Azure and Google Cloud Platform.

Accelerators

Pipelines

Pipelines provides an easy-to-use graphical data integration interface for bringing together data from many different sources and defining transformations visually.

Wrangler

Wrangler lets you visually and interactively cleanse and prepare raw data so that it is consumable for further processing. It provides a standardized, UI-driven interactive flow that takes the pain out of preprocessing tasks for data engineering, data science and data analysis.

Analytics

Analytics provides a simple, interactive, automated interface for users to easily develop, train, test, evaluate and deploy their machine learning models.

Rules

The Rules Engine gives business analysts a way to create and manage a knowledge base of data transformation rules that are applied automatically to your data.
