
Hard-to-debug unhandled rejection cases

source link: https://advancedweb.hu/hard-to-debug-unhandled-rejection-cases/

3 examples when an unhandled rejection happens

Tamás Sallai
Photo by Pixabay: https://www.pexels.com/photo/question-mark-on-chalk-board-356079/

Unhandled rejections

In one of my projects I'm building a cache that works across workers. With that library, I can turn any function into a cached one: the cache key is calculated from the arguments, and if the result for a given key is already available then the stored result is returned.

One core synchronization requirement for this type of caching is that the function should only be called once for a given set of arguments. That means if the function is already running, the caller just waits until the execution is complete.

It is based on an object that keeps track of in-progress calculations. A simplified version:
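A minimal sketch of such an object, not the library's actual code. Only the name `inProgress` comes from the text; `dedupe`, `slowSquare`, `calls`, and the JSON-based key are assumptions made for illustration, and storing finished results is left out.

```javascript
// key -> Promise of the call that is currently running
const inProgress = {};

// Hypothetical wrapper: parallel calls with the same arguments share one execution.
function dedupe(fn) {
  return (...args) => {
    const key = JSON.stringify(args); // assumption: arguments are JSON-serializable
    if (!inProgress[key]) {
      const promise = (async () => fn(...args))();
      inProgress[key] = promise;
      // Forget the entry once the call settles, whether it succeeded or failed.
      const clear = () => {
        if (inProgress[key] === promise) delete inProgress[key];
      };
      promise.then(clear, clear);
    }
    return inProgress[key];
  };
}

// Hypothetical slow function used to observe how often it really runs.
let calls = 0;
const slowSquare = dedupe(async (n) => {
  calls += 1;
  await new Promise((resolve) => setTimeout(resolve, 10));
  return n * n;
});
```

Calling `slowSquare(3)` twice in parallel returns the same Promise both times, so the wrapped function runs once.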


Then I wanted to support worker threads as well. This means that if any process is currently calling the function then all parallel calls will wait for the results, even if a worker thread is doing the calculation. This mechanism builds on BroadcastChannel as threads don't share memory. A BroadcastChannel is a cross-context messaging port that enables global messaging.

But this requires a rather complex messaging protocol between the workers and the main thread. For that, I implemented a coordinator that runs on the main thread and handles workers' requests to start tasks.

When a worker wants to call the function, it first checks with the coordinator that a call with the same arguments is not already in progress. If it is, the worker needs to wait for the finished signal. If not, the coordinator creates an entry in the inProgress object and waits for the worker to report that it finished the function.

A simplified version of the code:
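The sketch below illustrates the coordinator's bookkeeping under several assumptions. The `send` callback is a hypothetical stand-in for posting on the BroadcastChannel, the message fields (`key`, `from`, `to`, `value`, `reason`) are invented for illustration, and `finish` is an assumed name for the success report. The message types `start`, `startack`, `inprogress`, `finished`, and `finish_error` are the ones named in this article.

```javascript
// Hypothetical in-memory coordinator; `send` stands in for channel.postMessage.
function createCoordinator(send) {
  const inProgress = {}; // key -> { promise, resolve, reject }

  function onMessage({ type, key, from, value, reason }) {
    if (type === "start") {
      if (inProgress[key]) {
        // Another thread is already running the function for this key.
        send({ type: "inprogress", key, to: from });
        return;
      }
      let resolve, reject;
      const promise = new Promise((res, rej) => {
        resolve = res;
        reject = rej;
      });
      // No-op handler so a rejection with no local waiters is not reported as unhandled.
      promise.catch(() => {});
      inProgress[key] = { promise, resolve, reject };
      send({ type: "startack", key, to: from });
    } else if (type === "finish" || type === "finish_error") {
      const entry = inProgress[key];
      if (!entry) return; // unknown key: nothing to settle
      delete inProgress[key];
      if (type === "finish") entry.resolve(value);
      else entry.reject(reason);
      // Tell waiting workers that the result is ready.
      send({ type: "finished", key });
    }
  }

  return { onMessage, inProgress };
}
```

Local calls on the main thread can wait on `inProgress[key].promise`, while workers react to the `finished` message.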


This implementation works. For example, starting the task in a worker and then calling the function locally won't call it twice.

I was happy with the implementation, up until I started writing test cases for rejections. What happens if the function rejects? In that case, the worker will send a finish_error event, the coordinator rejects the inProgress Promise, and all the calls will be rejected as well, just as expected.

What I did not expect to see were unhandled rejections. And as I subsequently found out, tracking down these rejections is quite challenging and often surprising.

This article describes the three causes of unhandled rejections I encountered while working on this project. Each has a different root cause and posed different challenges.

Case #1: No reject handler

Let's start with the one that comes with the fewest surprises! If nothing handles a rejection, then it becomes an unhandled rejection. While it seems trivial, it still bit me.

Usually, rejections behave similarly to exceptions: they propagate up the chain of async functions. This is why I rarely encounter this problem: except for a few forgotten awaits, it never happens.

For example, deleting a file but forgetting the await produces an unhandled rejection:
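A minimal sketch of this situation, assuming a hypothetical `cleanup` function and a path that does not exist. The `unhandledRejection` listener is only there to record the event, since recent Node.js versions otherwise terminate the process.

```javascript
const fs = require("node:fs/promises");
const os = require("node:os");
const path = require("node:path");

// Record unhandled rejections instead of letting them crash the process.
const unhandled = [];
process.on("unhandledRejection", (reason) => unhandled.push(reason));

// Hypothetical path that does not exist, so fs.rm rejects with ENOENT.
const missingFile = path.join(os.tmpdir(), `missing-${process.pid}-${Date.now()}`);

async function cleanup() {
  // Bug: no await, so the Promise returned by fs.rm is dropped
  // and nothing ever handles its rejection.
  fs.rm(missingFile);
}
```

Here `cleanup()` itself resolves successfully, which is what makes the bug easy to miss: the failure only shows up later as an unhandled rejection.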

But adding an await everywhere usually solves this problem:
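A sketch of the awaited variant, again with a hypothetical `cleanup` function and a path that is assumed not to exist.

```javascript
const fs = require("node:fs/promises");
const os = require("node:os");
const path = require("node:path");

async function cleanup(target) {
  // Awaiting ties the fs.rm Promise to this async function:
  // if fs.rm rejects, cleanup() rejects too and the caller can handle it.
  await fs.rm(target);
}

async function main() {
  // Hypothetical path that does not exist.
  const missing = path.join(os.tmpdir(), `missing-${process.pid}-${Date.now()}`);
  try {
    await cleanup(missing);
    return "removed";
  } catch (err) {
    return err.code; // the rejection arrives here as an ordinary exception
  }
}
```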

In this case, the Promise returned by fs.rm is awaited, so the async function rejects whenever fs.rm rejects.

So, what went wrong in my use-case?

When a worker calls the function with some arguments, the coordinator creates a Promise. This makes it easy for local calls to wait for the result: a local call simply returns this Promise.

That means depending on how many postTask calls a key gets, the Promise will be used zero or more times. The problem case here is the zero. What if only the worker is running the function? In that case, the inProgress[key] Promise will be rejected but without anything to handle it.


The solution is rather simple after figuring out the cause: make sure that at least one rejection handler is always attached:
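One way to do that, sketched with a hypothetical `track` helper that the coordinator could call when a worker starts a task. The listener only records unhandled rejections so the effect can be observed.

```javascript
// Record unhandled rejections so the demo can check that none occur.
const unhandled = [];
process.on("unhandledRejection", (reason) => unhandled.push(reason));

const inProgress = {};

// Hypothetical helper: creates the Promise for a key and hands back its settle functions.
function track(key) {
  let resolve, reject;
  const promise = new Promise((res, rej) => {
    resolve = res;
    reject = rej;
  });
  // No-op handler: the rejection counts as handled even if no local call
  // ever waits on this Promise. Later consumers still see the rejection.
  promise.catch(() => {});
  inProgress[key] = promise;
  return { resolve, reject };
}
```

The no-op `catch` does not swallow the error for other consumers: anything that later awaits `inProgress[key]` is still rejected with the original reason.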

Case #2: Promise.finally

When a worker wants to start working on a task, it needs to inform the coordinator with a start message. Then it receives either a startack, meaning that it can call the function, or an inprogress, meaning that another thread is already calling the function. After an inprogress, the worker needs to wait for a finished message telling it that the result is ready.

This is sent by the coordinator:

The above implementation is wrong. If the function call rejects there will be an unhandled rejection:

Why is an unhandled rejection thrown there? It turns out that if the Promise is rejected then the one returned by finally is also rejected. And since it's not handled, it becomes an unhandled rejection.
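This behavior can be reproduced in isolation. The sketch below uses a hypothetical `demo` function, and the listener only records the event.

```javascript
// Record unhandled rejections instead of letting them crash the process.
const unhandled = [];
process.on("unhandledRejection", (reason) => unhandled.push(reason));

async function demo() {
  const task = Promise.reject(new Error("task failed"));
  task.catch(() => {}); // the original Promise is handled...

  // ...but finally() returns a NEW Promise that rejects with the same reason,
  // and nothing handles that one.
  task.finally(() => {
    /* e.g. send the "finished" message here */
  });

  // Give Node.js time to report the rejection.
  await new Promise((resolve) => setTimeout(resolve, 20));
  return unhandled.length;
}
```

Even though `task` itself has a handler, the Promise created by `finally` does not, so `demo()` observes one unhandled rejection.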

The solution? Make sure that it can't reject:
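One possible way to do that is to put a no-op `catch` in front of the `finally`. The helper name `notifyWhenSettled` and the `notify` callback are assumptions for illustration.

```javascript
// Record unhandled rejections so the demo can check that none occur.
const unhandled = [];
process.on("unhandledRejection", (reason) => unhandled.push(reason));

// Hypothetical helper: run notify() once the task settles, whatever the outcome.
function notifyWhenSettled(task, notify) {
  // catch() turns a rejection into a fulfilled Promise, so the Promise
  // returned by finally() cannot reject (as long as notify does not throw).
  return task.catch(() => {}).finally(notify);
}
```

The rejection is only silenced on this side chain. Other consumers of `task` still receive it.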

This rarely comes up in practice, because the Promise returned by finally is usually returned and awaited.

Case #3: Late return

I encountered the third one while writing test code for the coordinator.

In the library, when a task is posted the code first reads the filesystem to see if the result is already saved there. If not, then it proceeds with calling the function.

A simplified version:

In the tests I wanted to control the series of events. For that, I usually use a Promise with the resolve/reject functions extracted, for which the Promise.withResolvers syntactic sugar is coming.
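The extracted-functions pattern can be sketched as a small helper. The name `withResolvers` for the helper is an assumption; it uses the built-in `Promise.withResolvers` when the runtime provides it and falls back to the manual version otherwise.

```javascript
// Returns a Promise together with the functions that settle it.
function withResolvers() {
  if (typeof Promise.withResolvers === "function") {
    return Promise.withResolvers();
  }
  // Manual fallback: capture resolve/reject from the executor.
  let resolve, reject;
  const promise = new Promise((res, rej) => {
    resolve = res;
    reject = rej;
  });
  return { promise, resolve, reject };
}
```

In a test, a mocked function can return `promise` and stay pending until the test decides to call `resolve` or `reject`.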

This works fine when the Promise is resolved. But when it rejects, it raises an unhandled rejection:

The interesting part is that the result is properly rejected, and a catch handler is attached to it before the rejection. So, where does the unhandled rejection come from?

The problem is the order of operations here. Since the postTask does not immediately call the function, the reject() runs first. In a sizeable codebase it was not easy to find this, but putting the two parts next to each other makes it more visible:

In the example, the setTimeout(1) delays calling the fn() so that reject() runs before that. Without a rejection handler, it will raise an unhandled rejection.
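A sketch that reproduces this ordering, assuming a hypothetical `postTask` that uses a 1 ms timer in place of the filesystem read. The listener only records the event.

```javascript
// Record unhandled rejections instead of letting them crash the process.
const unhandled = [];
process.on("unhandledRejection", (reason) => unhandled.push(reason));

// Hypothetical stand-in for the library: some async work happens before fn is called.
async function postTask(fn) {
  await new Promise((resolve) => setTimeout(resolve, 1));
  return fn();
}

async function demo() {
  let reject;
  const promise = new Promise((_, rej) => {
    reject = rej;
  });

  const result = postTask(() => promise); // fn has not been called yet
  result.catch(() => {}); // handler on result, not on promise
  reject(new Error("too early")); // promise has no handler at this point

  // Give Node.js time to report the rejection.
  await new Promise((resolve) => setTimeout(resolve, 20));
  return unhandled.length;
}
```

The handler on `result` is not enough: `promise` only gets connected to `result` once `fn()` runs, and by then the rejection has already been reported.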


To solve it, I needed to make sure that the function was already called when doing the rejection:
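One way to enforce that order in a test is to have the mocked function signal when it is called. This is a sketch under assumptions: `postTask`, `deferred`, and `demo` are hypothetical names, and the 1 ms timer again stands in for the filesystem read.

```javascript
// Record unhandled rejections so the demo can check that none occur.
const unhandled = [];
process.on("unhandledRejection", (reason) => unhandled.push(reason));

// Promise with its settle functions extracted.
function deferred() {
  let resolve, reject;
  const promise = new Promise((res, rej) => {
    resolve = res;
    reject = rej;
  });
  return { promise, resolve, reject };
}

// Hypothetical stand-in for the library: async work first, then the call.
async function postTask(fn) {
  await new Promise((resolve) => setTimeout(resolve, 1));
  // await attaches a handler to the Promise returned by fn as soon as fn returns.
  return await fn();
}

async function demo() {
  const outcome = deferred();
  const called = deferred();

  const result = postTask(() => {
    called.resolve(); // signal that the function has been called
    return outcome.promise;
  });
  result.catch(() => {});

  await called.promise; // wait until fn has run
  outcome.reject(new Error("late enough")); // now a handler is already attached

  await new Promise((resolve) => setTimeout(resolve, 20));
  return { result, unhandledCount: unhandled.length };
}
```

The rejection still reaches `result`, so the test can assert on it, but it is no longer reported as unhandled.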
