Understanding Streams in Redis and Kafka

Comparing the Approaches of Kafka and Redis to Handling Streams

15. Aug 2022

This post was adapted from a new book “Understanding Streams in Redis and Kafka, A Visual Guide.” Stream processing is one of the more complex systems to understand so this book attempts to explain these concepts visually with more than 50 detailed graphics and code samples from both Redis Streams and Kafka. You can download a complimentary digital copy here.

Apache Kafka is an open-source (Apache License 2.0, written in Scala and Java) distributed streaming platform, and one of the leading ones. It is a feature-rich stream processing system. Kafka also comes with additional ecosystem services such as ksqlDB and Kafka Connect that provide more comprehensive capabilities.

Redis is an open-source (BSD 3-Clause, written in C) in-memory database, considered to be the fastest and most loved database. It’s also the leading database on AWS. Redis Streams is just one of the capabilities of Redis. With Redis, you get a multi-model, multi-data-structure database with six modules and more than 10 data structures.

How messages (event data) are stored

Although their storage is similar, Kafka and Redis Streams identify each message differently. In Kafka, each message is given a sequence number, called an offset, that starts at 0. But the offset only partly identifies a message, because of another concept called a “partition” that we’ll get into later.
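To make this concrete, here is a minimal Python sketch of why a Kafka sequence number (Kafka's own term is "offset") is not enough on its own. It is an in-memory model, not Kafka client code, and the names MessageId and TopicModel are hypothetical. It assumes that each partition numbers its own messages from 0, so a message is fully addressed by topic, partition, and offset together.

```python
from collections import defaultdict
from typing import NamedTuple


class MessageId(NamedTuple):
    """Hypothetical model of a full Kafka message address."""
    topic: str
    partition: int
    offset: int


class TopicModel:
    """Toy in-memory topic: every partition keeps its own 0-based offsets."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.partitions = defaultdict(list)  # partition number -> payloads

    def append(self, partition: int, payload: str) -> MessageId:
        log = self.partitions[partition]
        log.append(payload)
        # The offset is simply the position of the message in its partition.
        return MessageId(self.name, partition, len(log) - 1)


email = TopicModel("Email")
first = email.append(0, "welcome")
second = email.append(1, "receipt")
# Both messages have offset 0, but they live in different partitions,
# so only the full (topic, partition, offset) triple tells them apart.
```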

In Redis Streams, each message by default gets a timestamp as well as a sequence number. The sequence number is there to accommodate messages that arrive in the exact same millisecond. So if two messages arrive in the same millisecond (1518951480106), their ids look like 1518951480106-0 and 1518951480106-1.
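The id scheme can be sketched in a few lines of Python. This is a simplified model of the rule described above, not Redis source code: the function names are hypothetical, and it ignores edge cases such as the server clock moving backwards.

```python
from typing import Optional, Tuple


def next_stream_id(now_ms: int, last_id: Optional[Tuple[int, int]]) -> Tuple[int, int]:
    """Return the next (milliseconds, sequence) pair for a new entry."""
    if last_id is not None and last_id[0] == now_ms:
        # Same millisecond as the previous entry: bump the sequence number.
        return (now_ms, last_id[1] + 1)
    # A new millisecond: the sequence restarts at 0.
    return (now_ms, 0)


def format_id(stream_id: Tuple[int, int]) -> str:
    """Render the pair in the "<ms>-<seq>" form used for stream ids."""
    return f"{stream_id[0]}-{stream_id[1]}"


a = next_stream_id(1518951480106, None)
b = next_stream_id(1518951480106, a)
print(format_id(a), format_id(b))  # 1518951480106-0 1518951480106-1
```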

Fig. 1: How messages look in Kafka and Redis Streams

Creating streams

In Kafka, you create what’s called a “topic”. You can think of this as the name of the stream. However, in Kafka, you also need to understand four key concepts.

Partition: You can think of it as a file on the disk.
Broker: You can think of it as the actual server.
Replication Factor: The number of duplicate copies of the messages you want to keep.
ZooKeeper: This is an additional system that you need to use in order to manage Kafka.

We’ll get into all of these in a bit, but for now let’s assume you have one partition, one broker, and a replication factor of one.

Fig. 2: How messages look in Kafka for topic Email with one broker, one partition, and a replication factor of one

Note: The example below (Fig. 2a) shows how these would look in a Kafka cluster. We’ll discuss that later, but for now just imagine that there is only one broker.

Fig. 2a: A Kafka cluster with three brokers (servers), two topics (Email and Payment), where Email-topic has three partitions that are spread across three brokers (10, 11, and 12) and Payment topic has two partitions that are spread across two brokers (11 and 12)

In Redis, you simply create a stream and give it a key. Note that all the data within this stream is part of this single key (“Email”). Note also that this key and its stream simply reside alongside other keys and data structures. Redis allows for a number of data structures, and a stream is just one of them. (See Fig. 3.)

Fig. 3: How messages look in Redis for an Email stream

Adding messages

Kafka has a concept called “producers.” These are responsible for sending messages. They can also send messages with some options such as acknowledgments, serialization format and so on.

Consuming messages

Both Kafka and Redis Streams have the concepts of consumers and consumer groups. We’ll cover just the basics first.

With Kafka

In Kafka, the console consumer command reads all the messages in the Email topic. The “bootstrap-server” option points to the Kafka broker that the client connects to first. The “--from-beginning” flag tells Kafka to send all the data from the beginning. If we don’t provide this flag, the consumer will only retrieve messages that arrive after it has connected to Kafka and started to listen.

Note: The above consumer client will continue to wait for new messages in a blocking fashion and will display them when they arrive.
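The effect of the “--from-beginning” flag can be illustrated with a toy Python model. This is only a sketch of the starting-position rule described above, using a plain list as the log; the function name starting_offset is hypothetical and no Kafka client is involved.

```python
from typing import List


def starting_offset(log: List[str], from_beginning: bool) -> int:
    """Pick where a brand-new consumer starts reading in a toy log."""
    # From the beginning: offset 0. Otherwise: just past the last stored message.
    return 0 if from_beginning else len(log)


log = ["email-0", "email-1", "email-2"]

# Two consumers connect while three messages are already stored.
replay_start = starting_offset(log, from_beginning=True)
tail_start = starting_offset(log, from_beginning=False)

log.append("email-3")  # a message that arrives after both consumers connected

replayed = log[replay_start:]  # the full history plus the new message
tailed = log[tail_start:]      # only the message that arrived after connecting
```

In this model the consumer that starts from the beginning sees all four messages, while the other one sees only the message that arrived after it connected.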

With Redis Streams

In Redis Streams, you have two main options:

Notes:

If you use “Email $”, you get only new messages from the “Email” stream. For example: “XREAD BLOCK 0 STREAMS Email $”
You can put any other timestamp id after the stream name to get the messages that come after that id. For example: “XREAD BLOCK 0 STREAMS Email 1518951482479-0”
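The way the id argument selects messages can be modeled in plain Python. The sketch below is an in-memory approximation of the behavior described in these notes, not a Redis client: entries_after and parse_id are hypothetical helpers, the field names are made up, and the blocking part of XREAD is left out.

```python
from typing import List, Tuple

Entry = Tuple[str, dict]  # (id such as "1518951482479-0", field/value pairs)


def parse_id(entry_id: str) -> Tuple[int, int]:
    """Split "<ms>-<seq>" into integers so ids compare numerically."""
    ms, seq = entry_id.split("-")
    return int(ms), int(seq)


def entries_after(stream: List[Entry], start: str) -> List[Entry]:
    """Return the entries whose id is strictly greater than `start`.

    "$" stands for the last id currently in the stream, so nothing that is
    already stored is returned; only later arrivals would qualify.
    """
    if start == "$":
        if not stream:
            return []
        start = stream[-1][0]
    threshold = parse_id(start)
    return [entry for entry in stream if parse_id(entry[0]) > threshold]


stream = [
    ("1518951482479-0", {"subject": "first"}),
    ("1518951482479-1", {"subject": "second"}),
    ("1518951482480-0", {"subject": "third"}),
]

from_start = entries_after(stream, "0-0")               # everything stored so far
after_first = entries_after(stream, "1518951482479-0")  # the second and third entries
only_new = entries_after(stream, "$")                   # nothing yet: no new arrivals
```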

Approaches to scaling consumption

You just saw the basics of producers and consumers in both Kafka and Redis. Now, let’s dig in and see how these streaming services scale consumption.

Single partition and multiple consumers

Scenario: Let’s imagine you have three emails that need to be processed in no particular order by three email processors (consumers) so you can get the job done in one third the time.

In Kafka, let’s say you connected all three consumers to the Email topic. Then all three messages are sent to all three consumers. So you end up processing duplicate messages. This is called a “fan out”.

Fig. 4: A “fan out” in Kafka when multiple consumers connect to a single topic

Note: Although a fan out doesn’t work for this scenario, it works fine for chat messenger clients, where you can connect multiple users to the same topic and they all receive all chat messages.

It works exactly like that in Redis Streams as well.

Fig. 5: A “fan out” in Redis Streams
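A small Python model shows why a fan out does not speed up the email scenario. It is a sketch under one assumption: every consumer keeps its own independent read position on a shared log. The class name FanOutLog and the consumer names are hypothetical.

```python
from typing import Dict, List


class FanOutLog:
    """Toy append-only log in which every consumer has its own read position."""

    def __init__(self) -> None:
        self.messages: List[str] = []
        self.positions: Dict[str, int] = {}

    def publish(self, message: str) -> None:
        self.messages.append(message)

    def poll(self, consumer: str) -> List[str]:
        # Each consumer advances independently, so nobody "takes" a message
        # away from anybody else.
        start = self.positions.get(consumer, 0)
        batch = self.messages[start:]
        self.positions[consumer] = len(self.messages)
        return batch


log = FanOutLog()
for email in ("email-1", "email-2", "email-3"):
    log.publish(email)

received = {name: log.poll(name) for name in ("consumer-a", "consumer-b", "consumer-c")}
total_processed = sum(len(batch) for batch in received.values())
print(total_processed)  # 9: three emails, each handled three times
```

Instead of splitting three emails across three workers, the model does nine units of work, which is the duplication the scenario wants to avoid.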

This is just the beginning of our look at how Redis Streams and Kafka work and how they implement the same concepts. Additional topics covered in the book include scaling consumption, message acknowledgment, and the role of clusters. Hopefully, by reading this book you’ll be able to build a proof of concept or begin the certification process for Redis Streams or Kafka.
