
Easy Data Pipeline Creation Drives Data-Driven Services


Key Takeaways

Set up data pipelines from any data source – using a mix of your own connectors or native Smart DIH connectors. This lets you:

  • Build digital services over an increasing number of diverse systems
  • Shorten time to value for new service delivery
  • Lower TCO by saving on connector licensing costs

Open Platform Design Empowers Data Professionals

This blog series so far has focused on two angles of Smart DIH’s Open Platform architecture:

The first blog in the series “How Open Platform Architecture is Reflected in the Data Hub Digitization Layer” discussed how support for the OpenAPI specification lets data professionals easily create and deploy APIs with low code. As a result, developers can share and reuse API definitions and build on each other’s work – all leading to faster and more agile service delivery.  

The second blog in the series covered SQL extensibility: How Smart DIH implements SQL to provide composability and flexibility, empowering data professionals to use familiar skill sets to develop data-driven services quickly. 

Quick reminder: Smart DIH is an operational data hub designed to speed up the delivery of new data-driven digital services and enable rapid development of new business apps by delivering the ‘always fresh – always on’ data that modern applications rely on.

Smart DIH aggregates multiple back-end systems into a low-latency, scalable, high-performance data layer that exposes APIs and events. By decoupling systems of record from digital applications, Smart DIH lets enterprises drastically shorten the development and deployment of new digital services. With Smart DIH, organizations can rapidly scale to serve millions of concurrent users – no matter which IT infrastructure or cloud topology they rely on: cloud, on-prem or hybrid.

This third blog in our Open Platform series focuses on the flexibility of the Smart DIH ‘Pluggable Connector Framework’. The Pluggable Connector Framework allows data professionals to set up data pipelines rapidly, using connectors already deployed in their organization or native Smart DIH connectors. This approach is at the heart of open platform design: many platforms claim to have connectors that cover all data sources, but this is rarely true. The Pluggable Connector Framework, in contrast, connects to multiple data sources with a hybrid set of connectors: third-party, home-grown or native Smart DIH. This approach shortens time to value and lowers TCO by eliminating redundant connector licensing costs.

Data Pipelines in a Nutshell 

Data pipelines are central to the Smart DIH data integration layer. Data pipelines are well-defined flows that manage the data journey from a data source into the Smart DIH hosting layer. Data sources may be relational databases, NoSQL databases, object stores, file systems, or message brokers. Data may be structured or semi-structured, and can be integrated as a stream or in batches.
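To make this concrete, here is a minimal Python sketch of such a flow – illustrative only, not Smart DIH code, and every name in it is invented. Two hypothetical sources, one batch-oriented and one stream-oriented, feed the same downstream sink, which plays the role the hosting layer plays in Smart DIH:

    from typing import Callable, Iterable, Iterator

    Record = dict  # a structured or semi-structured row/document

    def batch_source(rows: list[Record], batch_size: int) -> Iterator[list[Record]]:
        """Hypothetical batch source, e.g. an initial load from a database."""
        for i in range(0, len(rows), batch_size):
            yield rows[i:i + batch_size]

    def stream_source(events: Iterable[Record]) -> Iterator[list[Record]]:
        """Hypothetical stream source, e.g. messages from a broker,
        normalized to one-record 'batches' so both modes look alike."""
        for event in events:
            yield [event]

    def run_pipeline(source: Iterator[list[Record]],
                     sink: Callable[[list[Record]], None]) -> None:
        """A pipeline is a well-defined flow: source -> sink."""
        for batch in source:
            sink(batch)

    def hosting_layer(batch: list[Record]) -> None:
        """Stand-in for the Smart DIH hosting layer."""
        print("ingested:", batch)

    # Both ingestion modes land in the same place, mirroring how the
    # framework standardizes the data journey regardless of source type.
    run_pipeline(batch_source([{"id": 1}, {"id": 2}, {"id": 3}], batch_size=2), hosting_layer)
    run_pipeline(stream_source([{"id": 4}, {"id": 5}]), hosting_layer)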

As noted, Smart DIH relies on its Pluggable Connector Framework to construct data pipelines from diverse underlying data sources using a hybrid approach: you can bring your own connector, or use a native Smart DIH connector. This approach also allows connectors from multiple GigaSpaces partners to be integrated, creating an ecosystem in which third-party, home-grown and native connectors co-exist.

This design, combined with Smart DIH’s delivery as a fully pre-integrated solution, provides distinct benefits. These include:

  • Standardization of the data journey, regardless of source and choice of connector. 
  • Real-time performance for event-driven updates of stream-based data pipelines, using frameworks such as Kafka, Flink and, of course, GigaSpaces’ own Space.
  • Rapid data load, required in initialization and recovery scenarios, using both Kafka and Space partitions.
  • Built-in continuous update mechanisms (CDC, incremental batch), enabling the addition of new tables and sources while keeping the data up to date for operational services.
  • Built-in data cleansing capabilities, including validation and registration, for observability and governance.
  • Built-in reconciliation mechanisms, to support various recovery and schema change scenarios.

Unlike other integration methods, which rely on SDKs and require programming, the Smart DIH Pluggable Connector Framework offers a declarative approach that allows users to set up a data pipeline very quickly. The configuration file is written in a simple syntax and hooks into any connector that supports Kafka. Ease of use does not limit functionality: the configuration file can support complex messaging logic, set CDC (Change Data Capture) rules, and define data modeling rules.
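To illustrate the declarative idea, here is a minimal Python sketch. It does not use Smart DIH’s actual configuration syntax, which this post does not show: the pipeline specification, topic and field names are invented, the change events are assumed to be Debezium-style JSON, and the broker address is a placeholder. The only real dependency is the kafka-python client; the point is simply that a pipeline can be described as data rather than programmed:

    import json
    from typing import Optional

    from kafka import KafkaConsumer  # pip install kafka-python

    # Hypothetical declarative pipeline definition: a field mapping plus a
    # simple CDC rule, standing in for Smart DIH's real configuration file.
    PIPELINE = {
        "source_topic": "orders.cdc",      # topic fed by any Kafka-capable connector
        "target_table": "orders",          # logical table in the data hub
        "fields": {                        # source field -> target field
            "ORDER_ID": "order_id",
            "CUST_ID": "customer_id",
            "TOTAL": "total_amount",
        },
        "cdc_ops": {"c", "u"},             # accept create and update events
    }

    def apply_pipeline(event: dict, spec: dict) -> Optional[dict]:
        """Map one Debezium-style change event onto the target model, or drop it."""
        if event.get("op") not in spec["cdc_ops"]:
            return None
        row = event.get("after") or {}
        return {dst: row.get(src) for src, dst in spec["fields"].items()}

    consumer = KafkaConsumer(
        PIPELINE["source_topic"],
        bootstrap_servers="localhost:9092",  # placeholder broker address
        value_deserializer=lambda b: json.loads(b.decode("utf-8")),
    )

    for message in consumer:
        record = apply_pipeline(message.value, PIPELINE)
        if record is not None:
            # Smart DIH would write this into its hosting layer; the sketch
            # just prints the normalized record.
            print(PIPELINE["target_table"], record)

The benefit described above follows from this shape: adding a table or changing a mapping means editing the specification, not writing new connector code.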

Connect to Any Data Source, Reduce Costs and Speed Up Time to Value

The Open Platform philosophy behind the Pluggable Connector Framework provides many benefits to IT teams, contributing directly to digital innovation projects designed to increase competitiveness and accelerate digital service delivery.

  • Fast time to value: The ability to create data pipelines quickly with a simple configuration file, without having to rely on dedicated development teams, shortens time to value and allows organizations to better utilize the skill sets of IT teams. Ultimately, data access services can be exposed as soon as the data pipelines are set up – allowing data teams to deliver value fast. 
  • Lower TCO: By being able to hook into any connector that supports Kafka, Smart DIH lowers TCO for IT teams: They can use connectors already deployed in their organization, saving licensing and subscription fees for custom connectors.
  • Open Ecosystem: The Smart DIH Pluggable Connector Framework offers an open ecosystem that can support non-standard messaging formats used in home-grown or proprietary applications. No matter what the data source or message format – IT teams will still be able to easily create the data pipelines needed to power data access services. 

Many enterprises face a fierce competitive landscape. They are under great pressure to deliver new revenue streams, offer superior customer journeys and respond quickly to market demands. Digital modernization driven by modern data access services is key to success. Smart DIH, with its open architecture design, offers the flexibility IT teams need: it easily integrates data from diverse legacy systems on the south end, while quickly exposing data services to consuming applications on the north end.

The Smart DIH Pluggable Connector Framework is key to helping data and IT teams realize this vision: the ability to use a mix of connectors offers endless opportunities to expand modernization by building new digital services over an increasing number of underlying systems; the simplicity with which this is achieved with the Smart DIH declarative approach cuts development times and speeds up time to market. From here, the path to business agility, fast service roll-out and the rapid launch of new apps is a downhill ride.

