Understanding GenStage back-pressure mechanism
source link: https://dev.to/dcdourado/understanding-genstage-back-pressure-mechanism-1b0i?
Introduction
Back-pressure is a technique used to prevent an application or a piece of software from using more resources than are available on a given infrastructure.
GenStage is an Elixir library for building complex processes divided into steps (or stages) that exchange data with each other. It is the core behaviour used by Broadway, a multi-stage data ingestion library backed by message queue systems such as Kafka, RabbitMQ and others.
The problem
Suppose you've built a news sharing system which is composed of three main components:
- Data ingestor (A): constantly searches for and downloads tweets that contain hashtags of interest, such as #computerscience, #architecture and #programming.
- Republisher (B): takes a tweet's content, renders it in several different formats (HTML, Markdown, PDF and so on) and then publishes it on different platforms.
- Reporter (C): writes the timestamp of each publication to the database and also exports metrics to be analyzed in the future.
The process executed by component A is simple and takes a single request: a GET on Twitter's API. We will assign this action an imaginary cost of 1N units of energy.
Once a tweet's content is retrieved and delivered to component B, a more complex process starts: the data must be transformed into several different formats, and then several requests must be made to external services (which may fail, time out or take much longer than usual). We will assign this step a cost of 10N units of energy.
Component C executes a simpler process as well, since it only performs a write operation on the database and then creates telemetry data that will be polled later. That costs 2N units of energy in our example.
As you can imagine, this scenario is problematic: component A generates input for component B faster than B can absorb it (1N per item produced versus 10N per item processed), and that overflows B's execution queue.
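To make the overflow concrete, here is a minimal simulation sketch of the push model described above. It is written in Python purely as a language-agnostic illustration and does not use GenStage. The per-item costs (1N, 10N, 2N) come from the example; the per-tick `BUDGET` of 10N for each component and the function name `simulate_push` are assumptions made for this sketch.

```python
from collections import deque

# Energy cost per item, in units of N (taken from the example above).
COST_A, COST_B, COST_C = 1, 10, 2
# Assumption for this sketch: every component may spend 10N per tick.
BUDGET = 10


def simulate_push(ticks):
    """Push model: A produces as fast as it can, ignoring B's capacity.

    Returns (items waiting in B's queue, items waiting in C's queue,
    items reported by C) after the given number of ticks.
    """
    queue_b = deque()  # B's execution queue, filled by A
    queue_c = deque()  # C's execution queue, filled by B
    reported = 0
    next_id = 0
    for _ in range(ticks):
        # A spends its whole budget downloading tweets and pushes them to B.
        for _ in range(BUDGET // COST_A):
            queue_b.append(next_id)
            next_id += 1
        # B processes only as many items as its budget allows.
        for _ in range(min(BUDGET // COST_B, len(queue_b))):
            queue_c.append(queue_b.popleft())
        # C processes only as many items as its budget allows.
        for _ in range(min(BUDGET // COST_C, len(queue_c))):
            queue_c.popleft()
            reported += 1
    return len(queue_b), len(queue_c), reported


print(simulate_push(100))  # (900, 0, 100)
```

Under these assumptions B's queue grows by nine items every tick and never drains, while C sits mostly idle.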
The solution
GenStage's strategy for applying back-pressure is to invert who drives the flow: instead of producers pushing data downstream, consumers send demand upstream, so consumers now control the rate and amount of data transmitted.
Component C starts with its execution queue empty and then asks Component B (which is a producer-consumer) to produce a piece of data.
Component B, which also has its execution queue empty, now asks Component A to produce a piece of data, and only then is the Twitter API called!
The amount of data produced is back-pressured so no queue overflows, and that's how you get a healthy workflow in Elixir using GenStage.
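The demand-driven flow above can be sketched as a small simulation. This is again plain Python for illustration, not GenStage code: real GenStage stages are separate processes that exchange demand and events asynchronously, while this sketch uses direct, synchronous method calls. The class names, the `ask` method and the `max_demand` parameter are hypothetical names chosen for this example.

```python
class Ingestor:
    """Component A: only calls the (simulated) API when asked."""

    def __init__(self):
        self.api_calls = 0

    def ask(self, demand):
        # One simulated API call per requested item.
        tweets = [f"tweet-{self.api_calls + i}" for i in range(demand)]
        self.api_calls += demand
        return tweets


class Republisher:
    """Component B: asks A for data only when C asks B."""

    def __init__(self, upstream):
        self.upstream = upstream

    def ask(self, demand):
        tweets = self.upstream.ask(demand)
        return [f"published({tweet})" for tweet in tweets]


class Reporter:
    """Component C: drives the whole pipeline by sending demand upstream."""

    def __init__(self, upstream, max_demand=1):
        self.upstream = upstream
        self.max_demand = max_demand
        self.reports = []

    def run(self, rounds):
        for _ in range(rounds):
            # C requests only what it is ready to handle right now.
            events = self.upstream.ask(self.max_demand)
            self.reports.extend(events)


ingestor = Ingestor()
reporter = Reporter(Republisher(ingestor), max_demand=2)
reporter.run(rounds=3)
print(ingestor.api_calls)   # 6
print(reporter.reports[0])  # published(tweet-0)
```

Because A is only invoked in response to demand, the number of API calls always equals the number of items the downstream stages asked for, and nothing accumulates in between.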