Using functional programming in Java with a Producer-Consumer.

 3 years ago
source link: https://bytes.grubhub.com/using-functional-programming-in-java-with-a-producer-consumer-be89667f1527

In complex networked applications like the ones at Grubhub, the application must perform many operations at once. One of the methodologies we use is concurrent programming: executing multiple instructions in overlapping time periods, with a thread or process as the unit of execution. The main challenge in writing concurrent code is managing the results of multiple simultaneous operations, which requires correct interleaving and signaling between threads and coordinated access to shared resources. Concurrent programming also has several common pitfalls, such as deadlocks, race conditions, and starvation.

Since Java version 1.6, the Java Development Kit (JDK) has gained numerous classes that make process synchronization easier. Proxy constructs like Futures/Promises were developed to hold the eventual results of synchronized concurrent executions. In just a few years, using Java’s concurrent data structures has become akin to using a smartphone: extremely easy, with all the complexity hidden away.

The Producer-Consumer pattern helps overcome some of the challenges of concurrent programming and simplifies it a bit. Implemented correctly, it can be thought of as a tailored solution for multi-process synchronization.

The producer generates data, and the consumer processes it. They communicate through a shared bounded — or, in some cases, unbounded — buffer. The producer won’t add data into a full buffer, and the consumer won’t try to remove data from an empty buffer. For the purpose of this article, we shall refer to this buffer as a queue.
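
As a minimal sketch of this classic arrangement (not taken from the original article), the following uses the JDK's `ArrayBlockingQueue` as the bounded buffer. The class name `BoundedBufferDemo`, the `DONE` sentinel, and the doubling step in the consumer are illustrative assumptions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

class BoundedBufferDemo {
    // Hypothetical sentinel telling the consumer that no more data will arrive.
    private static final int DONE = -1;

    static List<Integer> run(int itemCount, int capacity) {
        BlockingQueue<Integer> queue = new ArrayBlockingQueue<>(capacity);
        List<Integer> consumed = new ArrayList<>();

        Thread producer = new Thread(() -> {
            try {
                for (int i = 0; i < itemCount; i++) {
                    queue.put(i);          // blocks while the queue is full
                }
                queue.put(DONE);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        Thread consumer = new Thread(() -> {
            try {
                while (true) {
                    int item = queue.take();   // blocks while the queue is empty
                    if (item == DONE) {
                        break;
                    }
                    consumed.add(item * 2);    // the "processing" is hard coded here
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        producer.start();
        consumer.start();
        try {
            producer.join();
            consumer.join();   // after join, reading `consumed` from this thread is safe
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return consumed;
    }
}
```

Note that the processing step lives inside the consumer, so the caller has no say in what happens to each item.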

The logic in the consumer usually dictates what happens to the data read from the queue. The caller who adds data to the buffer via the producer doesn’t have much control over this behavior. In some cases, it’s simply hard coded in the consumer. It would be better if the developer had complete control of this process and a way to customize what happens once the data is read from the queue.

Java 8 introduced functional programming constructs a few years ago. This has opened up experimentation in many areas, like Java collections, streaming (not lambdas), etc., where functions can be leveraged in new ways to simplify code and increase modularity through functional constructs like BiConsumers, Consumers, Predicates, and Suppliers. It has fundamentally changed how multithreaded code is written in tandem with functions, making things like future chaining and transformation far more lucid in Java.
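
To make these constructs concrete, here is a small sketch using only standard JDK types (`Predicate`, `Function`, `Supplier`, `CompletableFuture`). The class name, method names, and the `item-` label format are made up for illustration.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.function.Function;
import java.util.function.Predicate;
import java.util.function.Supplier;
import java.util.stream.Collectors;

class FunctionalConstructsDemo {
    // Collections and streams: behavior is passed in as Predicate and Function values.
    static List<String> labelsFor(List<Integer> ids) {
        Predicate<Integer> isValid = id -> id > 0;
        Function<Integer, String> toLabel = id -> "item-" + id;
        return ids.stream()
                .filter(isValid)
                .map(toLabel)
                .collect(Collectors.toList());
    }

    // Future chaining: each thenApply transforms the result of the previous stage.
    static String chained(Supplier<Integer> source) {
        return CompletableFuture.supplyAsync(source)
                .thenApply(n -> n * 2)
                .thenApply(n -> "result=" + n)
                .join();   // waits for the chain to finish
    }
}
```

For example, `labelsFor(Arrays.asList(3, -1, 7))` drops the negative ID and maps the rest, and `chained(() -> 21)` runs the supplier asynchronously before applying both transformations.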

A use case here at Grubhub

Here at Grubhub, my team has a few REST APIs that serve as a liaison for the menu item and dish category data. Some of these REST endpoints take in a list of menu or dish label IDs and pass them on to the backend for database search, processing, or data export by kicking off a job. We could use a separate endpoint for each job type, but that would require us to maintain the same boilerplate code over and over to process this data.

This led to the “functionalization” of our data processing. We created a producer-consumer-like framework with functions at its core. The framework takes input of type <I> and produces a result of type <R> by having a Function<I,R> injected into it. This lets us keep a common pattern for incoming data while still applying different functional behaviors to the input received via REST endpoints. The developer can now choose the type of data without relinquishing control of how it will be processed and returned: the data is processed by whichever function is injected.

One can then easily swap in functions such as:

  • A simple function that runs a select query against Cassandra and returns the data
  • A more complex function that runs a select, then updates an in-memory cache and publishes the retrieved object on a message bus
  • A complex math operation that involves some processing
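
The idea of swapping functions can be sketched as follows. Everything here is hypothetical: `TABLE` and `CACHE` are in-memory maps standing in for a Cassandra table and a cache, and `handle` runs sequentially to keep the focus on the injected function rather than on threading.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

class SwappableFunctionsDemo {
    // One generic entry point: the behavior is whatever function is passed in.
    static <I, R> List<R> handle(List<I> input, Function<I, R> function) {
        return input.stream().map(function).collect(Collectors.toList());
    }

    // Hypothetical in-memory stand-ins for a Cassandra table and a cache.
    static final Map<Integer, String> TABLE = new HashMap<>();
    static final Map<Integer, String> CACHE = new HashMap<>();
    static {
        TABLE.put(1, "Burger");
        TABLE.put(2, "Fries");
    }

    // 1. A simple select.
    static final Function<Integer, String> SELECT =
            id -> TABLE.getOrDefault(id, "unknown");

    // 2. A select followed by a cache update.
    static final Function<Integer, String> SELECT_AND_CACHE = id -> {
        String name = SELECT.apply(id);
        CACHE.put(id, name);
        return name;
    };

    // 3. A math operation with a different result type.
    static final Function<Integer, Double> SQRT = id -> Math.sqrt(id);
}
```

The call site stays the same in all three cases, for example `handle(ids, SELECT)` versus `handle(ids, SQRT)`; only the injected function and the result type change.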

The possibilities are endless.

The producer/consumer is backed by a linked blocking queue, which can be bounded or unbounded. A fixed thread pool executor, sized by a configurable maximum number of threads, is used in the consumer to process the queued data of type <I> and return the processed data of type <R>.

Our implementation is a slightly modified producer-consumer that differs from a standard producer-consumer in the following ways:

  1. The data <I> can be added to the producer only once, not as a continuous stream.
  2. The developer now has complete control over how the data is processed.
  3. In unit tests, you can concentrate entirely on the logic in the function and avoid unit testing the threading, because the framework handles it and simply returns a collection of type <R>.

Most of the boilerplate management is taken care of so that the developer can concentrate on the business logic inside the function. Thread management, future resolution for timeouts, and future evaluation for input data that can cause exceptions are all built in. One also doesn’t need to worry about cleanup such as shutting down executors, as that happens automatically when the data of type <R> is returned.

The queue size is configurable, and the queue is kept unbounded for use cases where the amount of data to be processed is not known beforehand. The number of worker threads is also configurable, to accommodate CPU-intensive tasks in the function.
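
The description above can be approximated with the following minimal sketch. It is an assumption-laden illustration, not Grubhub's actual implementation: the class name `FunctionalProducerConsumer`, the `process` method, and the choice to silently skip failing items are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.function.Function;

final class FunctionalProducerConsumer<I, R> {
    private final int workerThreads;
    private final long timeoutMillis;

    FunctionalProducerConsumer(int workerThreads, long timeoutMillis) {
        this.workerThreads = workerThreads;
        this.timeoutMillis = timeoutMillis;
    }

    List<R> process(Collection<I> input, Function<I, R> function) {
        // Producer side: all input is enqueued once, up front, on an unbounded queue.
        BlockingQueue<I> queue = new LinkedBlockingQueue<>(input);
        Queue<R> results = new ConcurrentLinkedQueue<>();
        ExecutorService executor = Executors.newFixedThreadPool(workerThreads);
        try {
            // Consumer side: each worker drains the queue and applies the injected function.
            for (int i = 0; i < workerThreads; i++) {
                executor.execute(() -> {
                    I item;
                    while (!Thread.currentThread().isInterrupted()
                            && (item = queue.poll()) != null) {
                        try {
                            results.add(function.apply(item));
                        } catch (RuntimeException e) {
                            // Assumption: an item that fails (or maps to null) is skipped.
                        }
                    }
                });
            }
            executor.shutdown();
            // Wait for the workers, but no longer than the configured timeout.
            executor.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            executor.shutdownNow();   // cleanup happens regardless of the outcome
        }
        // Whatever finished in time is returned; the order is not guaranteed.
        return new ArrayList<>(results);
    }
}
```

A call might look like `new FunctionalProducerConsumer<Integer, Integer>(4, 2000).process(ids, id -> id * id)`. Because the threading lives entirely inside `process`, a unit test for the business logic only needs to exercise the function itself.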

Code examples

To show how flexible this framework is, let’s look at two examples. In the first example, menu item IDs are passed in as a collection, and the function calls an existing method that searches Cassandra to find the MenuItemDTO value for each ID. The lookups run concurrently on four consumer threads with a maximum timeout of 2000 milliseconds.

The second example takes menu item IDs that another job has marked as inactive or eligible for deletion, publishes them to a message bus for other systems to consume, and then deletes them from the DB.
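
As a rough illustration of both examples, the sketch below uses a simplified stand-in for the framework built on `ExecutorService.invokeAll`. All names are hypothetical: the `MenuItemDTO` fields, the `db` map standing in for Cassandra, the `bus` queue standing in for the message bus, and the `runAll` helper.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.Map;
import java.util.NoSuchElementException;
import java.util.Queue;
import java.util.concurrent.Callable;
import java.util.concurrent.CancellationException;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.function.Function;

class MenuItemJobsExample {
    // Hypothetical DTO; the real MenuItemDTO is not shown in the article.
    static final class MenuItemDTO {
        final String id;
        final String name;
        MenuItemDTO(String id, String name) { this.id = id; this.name = name; }
    }

    // In-memory stand-ins (assumptions) for the Cassandra table and the message bus.
    final Map<String, MenuItemDTO> db = new ConcurrentHashMap<>();
    final Queue<String> bus = new ConcurrentLinkedQueue<>();

    // Simplified stand-in for the framework: applies fn to every input on a fixed pool
    // and drops inputs whose call fails or is cancelled by the overall timeout.
    static <I, R> List<R> runAll(Collection<I> input, Function<I, R> fn,
                                 int threads, long timeoutMillis) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<R>> tasks = new ArrayList<>();
            for (I item : input) {
                tasks.add(() -> fn.apply(item));
            }
            List<R> results = new ArrayList<>();
            for (Future<R> future : pool.invokeAll(tasks, timeoutMillis, TimeUnit.MILLISECONDS)) {
                try {
                    results.add(future.get());
                } catch (ExecutionException | CancellationException e) {
                    // Skip this input.
                }
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return new ArrayList<>();
        } finally {
            pool.shutdownNow();
        }
    }

    // Example 1: look up each ID concurrently, four threads, 2000 ms timeout.
    List<MenuItemDTO> findMenuItems(Collection<String> ids) {
        Function<String, MenuItemDTO> lookup = id -> {
            MenuItemDTO dto = db.get(id);
            if (dto == null) throw new NoSuchElementException(id);
            return dto;
        };
        return runAll(ids, lookup, 4, 2000);
    }

    // Example 2: publish each inactive ID to the bus, then delete it from the DB.
    List<String> publishAndDelete(Collection<String> inactiveIds) {
        Function<String, String> job = id -> {
            if (!db.containsKey(id)) throw new NoSuchElementException(id);
            bus.add(id);      // publish first so other systems can react
            db.remove(id);    // then delete
            return id;
        };
        return runAll(inactiveIds, job, 4, 2000);
    }
}
```

Both methods share the same plumbing and differ only in the function they pass in, which is the point of the design.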

Summary and Final Thoughts

The introduction of functions in Java 8 lets developers add functions to object-oriented code to solve complex design problems where OOP alone wasn’t sufficient. Functional programming encourages composing functions and avoiding shared state and mutable data. Many Java libraries from Apache and Google have rapidly evolved to make their APIs more functional. For example, future transformations that accept input/output functions allow concurrent programming constructs to be enhanced further. Another example is RxJava, a framework that merges the reactive and functional programming paradigms while building on the observer pattern. These examples demonstrate the growing recognition of the value of functional programming within programming paradigms that previously did not accommodate it, and they have substantially increased the coding firepower at the developer’s disposal.

