Save the code or save the tests?

2020-11-29 by qntm

A while back during a training session at work, our instructor pitched us a hypothetical question:

Your data centre is on fire, and all your application code is on one drive, and all your tests are on the other drive, and you can only rescue one. What do you do?

Save the application code.

Save the tests.

More recently, out of curiosity, I pitched the same question as a poll on Twitter. Roughly 80% of respondents said that they would save the application code and 20% said they would save the tests.

People gave differing reasons for their decisions. Predictably, many respondents said to let it all burn, and some people attempted to justify that choice by asserting that they had backups. But "let it all burn" wasn't one of the provided options and these responses didn't show up in the results, so we can safely ignore those opinions for now.

Most of the remaining respondents seemed to make decisions based on their own assumptions about how detailed and specific those tests actually were. Some people said that it was impossible to make the decision without knowing more about the tests. Some people said to save the application code because the other drive, containing their tests, would be blank, or at least contain nothing of value. Some people said they would save the application code because although the tests might exist, and even be quite robust, they would not specify enough of the application's behaviours to make recreating the application possible. Some added that they would try to rebuild the tests from the saved application codebase. This, of course, also involves significant assumptions about the testability of the application.

But some people, of course, asserted that their tests would be exhaustive enough that rebuilding the application from the test code would be easier.

In short, a big old "it depends".

So, this all made for an entertaining discussion and it generated some interesting results, both during the original training session and on Twitter.

The instructor's point in asking this question was to make an assertion about the virtue of test-driven development and having good code coverage. If you have sufficiently detailed unit tests, we were told, then the structure of the application is so clearly specified by those unit tests that reconstructing the original application from the tests is more or less trivial. It is, in fact, the path of least resistance. So, you should rescue the unit tests. And, you should follow proper TDD practices and always have good code coverage. If you do this, it doesn't matter if the application is blown away, you can start over very easily.
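To make that claim concrete, here is a minimal Python sketch. Everything in it is an assumption made for illustration: `slugify` and its tests are hypothetical and come from no real project. Imagine that only the test class survived the fire, and that the function above it was rewritten afterwards with nothing but those tests as a guide.

```python
import re
import unittest


# Hypothetical function, rewritten using only the surviving tests below.
def slugify(title: str) -> str:
    """Lower-case the title and join its ASCII alphanumeric runs with hyphens."""
    words = re.findall(r"[a-z0-9]+", title.lower())
    return "-".join(words)


class SlugifyTests(unittest.TestCase):
    # Hypothetical test suite: pretend this class is all that was rescued.
    def test_lowercases(self):
        self.assertEqual(slugify("Hello"), "hello")

    def test_spaces_become_hyphens(self):
        self.assertEqual(slugify("save the tests"), "save-the-tests")

    def test_punctuation_is_dropped(self):
        self.assertEqual(slugify("Code, or tests?"), "code-or-tests")

    def test_empty_title(self):
        self.assertEqual(slugify(""), "")
```

For a function this small, the rebuild really is close to trivial. Note, though, that the tests say nothing about, say, accented characters, so the rebuilt function is only guaranteed to match the original on the behaviours the tests happen to pin down.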

In the training session, we all found this to be very convincing and insightful, and we nodded in agreement, understanding the lesson.

And then I thought about it for a minute.

And I asked myself a different question:

When would this hypothetical, or anything like it, ever happen?

Clearly, the actual scenario as presented is ludicrous. Why would the tests and the application code be kept on two separate drives, side by side in the same data centre? Why would you be in the data centre at that exact time, and be confronted with those specific drives and the choice of rescuing only one? If this scenario actually occurred, the original respondents who deflected the question had the right answer all along. It's a fire. You just get out. Your skin is worth more than any piece of software, and there are assuredly backups, or at least local copies on other developers' machines.

So, this data-centre-on-fire scenario is clearly intended as a proxy for some other, real scenario. But what is that real scenario?

Why is it useful or desirable to be able to completely reconstruct an application from its unit tests? When, and how, would you ever end up in a situation where the application had been destroyed and the unit tests had survived?

I haven't fully figured that out.

There are plenty of real cases where tests get "lost" because of neglect, or where they never get written at all. I guess this comes from a choice similar to the one in the fire scenario: you have finite software development resources and no time to do everything you want to do, so something is going to be neglected.

But if that's the scenario, are we saying that given the choice between (1) maintaining the application and (2) maintaining the tests, we should always maintain the tests? Because, if the tests are rigorous enough, we can just pull a lever at any time and use those tests to produce a working application?

This can't be it. It's true that development time is usually unevenly split between application code and tests (and a dozen other things). And it's true that usually the balance could benefit from being shifted in the direction of the tests. But you can't just have tests and no application. You can't just write the tests for a new feature and declare the feature to be completed and delivered. Can you?

(There are narrow and rare scenarios where you might be developing a standardised test suite for some protocol or specification, where the test suite itself is the product, but that's a whole other ball game from conventional software development. I would call it off-topic.)

So, as entertaining as these discussions have been, what are we actually learning here? Anything?

We may be accidentally learning the opposite of the intended lesson. Slavish adherence to TDD, and the maintenance of exhaustive unit tests, such that if the application code goes missing it can be effortlessly regenerated, is pointless, because that never happens.

Truthfully, I think robust unit tests have great value, but this Allegory Of The Conflagrating Drives doesn't illuminate that value at all. I don't think unit tests need to be this exhaustive. I think TDD has value, but it's just a tool, not a way of life, and more often than not it should be left in the toolbox.

That said, I do think that once you get to the point where you have to make a decision between looking after the application code and looking after the tests, something has already gone wrong. They should be inseparable, kept together on a single allegorical drive. We should refuse to acknowledge a distinction between the two. We should refuse to accept, or maintain, one without the other.

So, personally, I end up responding to the dilemma with something helpful like this:

Your data centre is on fire, and all your application code is on one drive, and all your tests are on the other drive, and you can only rescue one. What do you do?

Save the application code.

Save the tests.

[sagely] I would simply never allow this situation to occur.

There! I am one of the people whose opinions can be safely ignored, exactly as it should be.
