Improving Python exception chaining with raise-from

source link: https://blog.ram.rachum.com/post/621791438475296768/improving-python-exception-chaining-with

This is going to be a story about an esoteric feature of Python 3, and how I spent the last few months reviving it and bringing it into the limelight.

Back in the yee-haw days of 2003, Raymond Hettinger wrote an email to the python-dev mailing list, sharing an idea for improving the way that Python handles exceptions that are caught and replaced with other exceptions. The goal was to avoid losing information about the first exception while reporting the second one. Showing the full information to the user would make debugging easier, and if you’ve followed my work before, you know there’s nothing I love better than that.

That idea was polished and refined by many discussions on python-dev. A year later, Python core developer Ka-Ping Yee wrote up a comprehensive PEP that was then known as PEP 344, later to be renamed to PEP 3134. That idea was detailed there, with all the loose ends, potential problems and solutions. Guido accepted the PEP, and it was implemented for the infamous Python 3.0, to be used… By no one. For a long time.

If there’s one thing I don’t miss, it’s waiting 10 years for the Python ecosystem to adopt Python 3. But finally, it happened. Almost all the packages on PyPI support Python 3 now, and getting a job writing Python 3 code is no longer a luxury. Only a few days ago, NumPy finally dropped Python 2 support. We live in good times.

When a modern Python developer catches an exception and raises a new one to replace it, they can enjoy seeing the complete information for both tracebacks. This is very helpful for debugging, and is a win for everybody.

Except… For one thing.

Two cases of exception chaining

There was one interesting detail of PEP 3134 that was forgotten: It has to do with the question, “What does it mean when one exception is replaced with another? Why would someone make that switcheroo?”

These can be roughly divided into two cases, and PEP 3134 provided a solution for each case.

The first case is this:

“An exception was raised, we were handling it, and something went wrong in the process of handling it.”

The second case is this:

“An exception was raised, and we decided to replace it with a different exception that will make more sense to whoever called this code. Maybe the new exception will make more sense because we’re giving a more helpful error message. Or maybe we’re using an exception class that’s more relevant to the problem domain, and whoever’s calling our code could wrap the call with an except clause that’s tailored for this failure mode.”

That second case is quite a mouthful, isn’t it? It didn’t help that the first case was defined as the default. The second case ended up falling by the wayside. Most Python developers haven’t learned how to tell Python that the second case is what’s happening in their code, and to listen when Python is telling them that it’s happening in code that they’re currently debugging. This resulted in a Catch 22 situation, not that different from the one that slowed down Python 3 adoption in the first place.

Before I tell you what I did to break that Catch 22, I’ll bring you into the fold and show you how to make that feature work in your project.

Exception causes, or `raise new from old`

I’m going to show you both sides of this feature: How to tell Python that you’re catching an exception to replace it with a friendlier one, and how to understand when Python is telling you that this is what’s happening in code that you’re debugging.

For the first part, here’s a good example from MyPy’s codebase:

try:
    self.connection, _ = self.sock.accept()
except socket.timeout as e:
    raise IPCException('The socket timed out') from e

See the `from e` bit at the end? That’s the bit that tells Python: The `IPCException` that we’re raising is just a friendlier version of the `socket.timeout` that we just caught.

When we run that code and reach that exception, the traceback is going to look like this:

Traceback (most recent call last):
  File "foo.py", line 19, in <module>
    self.connection, _ = self.sock.accept()
  File "foo.py", line 7, in accept
    raise socket.timeout
socket.timeout

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "foo.py", line 21, in <module>
    raise IPCException('The socket timed out') from e
IPCException: The socket timed out

See that message in the middle, about the exception above being the direct cause of the exception below? That’s the important bit. That’s how you know you have a case of a friendly wrapping of an exception.

If you were dealing with the first case, i.e. an exception handler that has an error in it, the message between the two tracebacks would be:

During handling of the above exception, another exception occurred:

That’s it. Now you can tell the two cases apart.
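
To see the two cases side by side, here is a minimal sketch you can run yourself. It is not taken from any real project: `FriendlyError` and the helper functions are names made up for this illustration. It relies on the `__cause__` and `__context__` attributes that Python sets on chained exceptions, and on the standard `traceback` module to render the text a user would see.

```python
import traceback


class FriendlyError(Exception):
    """Hypothetical domain-specific exception, used only in this sketch."""


def wrap_with_cause():
    # Second case: deliberately replace the low-level error with a friendlier one.
    try:
        int("not a number")
    except ValueError as e:
        raise FriendlyError("Could not parse the input") from e


def fail_while_handling():
    # First case: the handler itself goes wrong while dealing with the error.
    try:
        int("not a number")
    except ValueError:
        return {}["missing"]  # raises KeyError inside the handler


def capture(func):
    """Call func and return the exception it raises."""
    try:
        func()
    except Exception as exc:
        return exc
    raise AssertionError("expected an exception")


def render(exc):
    """Return the full chained traceback as a single string."""
    return "".join(traceback.format_exception(type(exc), exc, exc.__traceback__))


wrapped = capture(wrap_with_cause)
accidental = capture(fail_while_handling)

# `raise ... from e` stores the original exception on __cause__.
print(type(wrapped.__cause__).__name__)  # ValueError
# Without `from`, __cause__ stays None and only __context__ is recorded.
print(accidental.__cause__, type(accidental.__context__).__name__)  # None ValueError

wrapped_text = render(wrapped)        # contains "direct cause of the following exception"
accidental_text = render(accidental)  # contains "During handling of the above exception"
```

The message printed between the two tracebacks follows from those attributes: when `__cause__` is set you get the "direct cause" wording, otherwise you get the "During handling" wording.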

What I did to push this feature

I found that almost no one knows about this feature, which is sad, because I think it’s a useful piece of information when debugging. I decided I’ll do my part to push the Python community to use this syntax.

I wrote a little script that uses Python’s `ast` module to analyze a codebase and find all instances where this syntax isn’t used and should be. The heuristic was simple: If you’re doing a `raise` inside an `except`, then in 99.9% of cases you’re wrapping an exception.
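
The heuristic can be sketched in a few lines. To be clear, this is not the actual script. It is a rough reconstruction under my own assumptions, and `find_raises_missing_from` and `SAMPLE` are hypothetical names. It walks every `except` handler and reports `raise` statements that name an exception but have no `from` clause. A bare `raise` is skipped because it re-raises the original exception instead of replacing it.

```python
import ast


def find_raises_missing_from(source):
    """Return sorted line numbers of `raise X` statements inside `except`
    blocks that have no `from` clause."""
    tree = ast.parse(source)
    found = set()  # a set, because nested handlers are visited more than once
    for handler in ast.walk(tree):
        if not isinstance(handler, ast.ExceptHandler):
            continue
        for node in ast.walk(handler):
            if (
                isinstance(node, ast.Raise)
                and node.exc is not None  # skip bare `raise` (a plain re-raise)
                and node.cause is None    # no `from ...` clause present
            ):
                found.add(node.lineno)
    return sorted(found)


SAMPLE = '''\
try:
    value = int(text)
except ValueError:
    raise RuntimeError("bad input")
try:
    value = int(text)
except ValueError as e:
    raise RuntimeError("bad input") from e
try:
    value = int(text)
except ValueError:
    raise
'''

print(find_raises_missing_from(SAMPLE))  # [4]
```

A sketch this simple has blind spots. For example, it also looks inside functions defined within a handler, and it would flag `raise e` even though that re-raises the caught exception.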

I took the output from that script and used it to open PRs to a slew of open-source Python packages. Some of the projects I fixed are: Setuptools, SciPy, Matplotlib, Pandas, PyTest, IPython, MyPy, Pygments and Sphinx. Check out my GitHub history for the full list.

I then added a rule to PyLint, now known as W0707: raise-missing-from. After the PyLint team makes the next release, and the thousands of projects around the world that use PyLint upgrade to that release, they will all get a warning when they fail to use raise from in places they should.

Hopefully, in a few years’ time, this feature of Python will become more ingrained in the Python community.

What you can do to help

Do you maintain a Python project that already dropped Python 2 support? Install the latest version of PyLint from GitHub. You can do this in a virtualenv if you’d like to keep your system Python clean. Run this to install:

pip install git+https://github.com/PyCQA/pylint

Then, run this line on your repo:

pylint your_project_path | grep W0707

You’ll get a list of lines showing where you should add raise from in your code. If you’re not getting any output, your code is good!
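
For each reported line, the fix is usually mechanical. Here is a hedged before-and-after sketch. `ConfigError`, `read_port_before` and `read_port_after` are hypothetical names invented for this example, and I’m assuming the first version is the kind of line the check reports.

```python
class ConfigError(Exception):
    """Hypothetical exception class, used only in this sketch."""


def read_port_before(settings):
    try:
        return int(settings["port"])
    except KeyError:
        # No `from`: the kind of line raise-missing-from is meant to catch.
        raise ConfigError("The 'port' setting is missing")


def read_port_after(settings):
    try:
        return int(settings["port"])
    except KeyError as e:
        # With `from e`: the KeyError is recorded as the direct cause.
        raise ConfigError("The 'port' setting is missing") from e
```

Both versions raise the same `ConfigError` to the caller. The only difference is that the second one marks the `KeyError` as its cause, which changes the message shown between the two tracebacks.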

Written on June 24th, 2020 by

Ram Rachum

I’m a software developer based in Israel, specializing in the Python programming language. I write about technology, programming, startups, Python, and any other thoughts that come to my mind.

I’m sometimes available for freelance work in Python and Django. My expertise is in developing a product from scratch.
