How does science work? It’s all about clones!

The role of causal inference and statistical reasoning in science

Every good scientific experiment that seeks to establish how something works — and not just that it does work — is about making clones.

Bear with me.

Thinking scientifically

Let’s arm wrestle! To prepare, you drink a dram of Strength Potion (TM) while I drink filtered water. Then you defeat me without breaking a sweat. You won because of the potion, right?

Nope, we’d offend science itself if we jumped to that conclusion. After all, we’re different people with different arms. Since we could come up with a thousand potion-free explanations for your victory, we’d look foolish claiming we know anything at all about the potion’s efficacy.

But what if I arm-wrestled my genetic clone? She drinks the potion and I don’t. Then she wins effortlessly. Is it the potion?

Now the Strength Potion’s role in my crushing defeat starts to look more plausible (we’ve controlled for genetic differences) but I might still gripe about how unfair it is that she was brought up on protein shakes in a lab while I was brought up on coffee in an extended kindergarten. We’re different after all, so let’s not blame the potion just yet.

What if she’s a magical clone — an exact copy of me blinked into being right before the potion? In that case, the causal effect of the potion appears even more plausible. And yet, if I’ve got my nitpicker hat on, I’d point out that she and I are seated in different spots… maybe she has an advantage because the chair heights are different?

And that’s scientific thinking right there! To anticipate the most fearsome nitpickers, you must become your own worst critic and set up the tightest conditions you can so that all alternative explanations are eliminated in advance and the only difference between the treatment and control conditions is, well, the treatment itself (in this case, the Strength Potion). As long as there are still other explanations, your experiment is weak and your scientific opponents can arm-wrestle it into oblivion. You've not yet earned your license to use the word "because" in polite discourse.

So, what can we do to learn whether the potion works? Perhaps we should clone me and the chairs?

Causal inference

Since actual cloning is still solidly in the realm of sci-fi, let's do the next best thing and use randomization to take care of our problems: we’ll run many rounds of the experiment where we’ll flip a coin to see who gets which chair each time, thereby statistically controlling for the effect of the chair.

Large, randomly-selected groups are essentially statistical clones of one another and the idea is that the only (statistical) difference between them is the treatment. Then if there’s a difference in results, we can blame the treatment for it. Ta-da! Causal inference achieved!

That’s the core logic behind the cleanest version of causal inference (the ability to make solid claims about what causes what) available to scientists. Anything else is a bit of a reach.*
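If you'd like to see the coin flip earn its keep, here is a minimal simulation sketch. Every number in it is invented for illustration (a potion worth 2.0 strength points, a tall chair worth 1.5), and the function name `average_margin` is hypothetical. My clone always drinks the potion; the only thing that changes is whether she always gets the tall chair or whether a coin decides.

```python
import random
import statistics


def average_margin(n_rounds, randomize, potion_effect=2.0, chair_effect=1.5, seed=0):
    """Average of (clone's strength minus mine) over many rounds.

    All numbers are invented for illustration. The clone always drinks
    the potion; the only question is who sits in the taller chair.
    """
    rng = random.Random(seed)
    margins = []
    for _ in range(n_rounds):
        if randomize:
            clone_in_tall_chair = rng.random() < 0.5  # fair coin flip
        else:
            clone_in_tall_chair = True  # clone always gets the good chair
        # Whoever sits in the tall chair gets the chair advantage.
        chair_bonus = chair_effect if clone_in_tall_chair else -chair_effect
        noise = rng.gauss(0, 1)  # everything else that varies round to round
        margins.append(potion_effect + chair_bonus + noise)
    return statistics.mean(margins)


if __name__ == "__main__":
    print("fixed chairs:     ", round(average_margin(10_000, randomize=False), 2))
    print("randomized chairs:", round(average_margin(10_000, randomize=True), 2))
```

With fixed chairs the average margin should land near 3.5, which is the potion and the chair tangled together. With the coin flip, the chair advantage helps each of us about half the time, so the average should settle near the potion's 2.0 alone.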

Cautious inference

But watch out. We can conclude that the difference is due to our experimental treatment, but what if we misunderstood our treatment?

After my clone and I battle it out over infinite randomized trials, you might start thinking that this potion is indeed special. Hold your horses! We didn’t isolate which aspect of the treatment worked. Perhaps it has nothing to do with the active ingredients of the potion itself and everything to do with the sugar content of the mixture. Or perhaps the real reason my clone outperformed me is that I felt demotivated at the prospect of competing with someone who just guzzled a supposed “magic potion.”

Science is about excluding alternative explanations. A scientist’s job is to think of and test all alternative explanations that might throw a spanner in the works. They have to relish attacking their favorite theory instead of defending it.

We can’t know unless we perform our experiment with a better placebo: everything but the active ingredient. Then you’ll find something else to nitpick and I’ll have to modify the experiment some more to exclude your clever alternative explanations. Welcome to science!
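To make the placebo logic concrete, here is a rough sketch with three randomly assigned groups: water, a sugary placebo, and the full potion. It assumes a world where the sugar does all the work and the active ingredient does nothing. The effect sizes, the baseline of 50, and the name `compare_arms` are all made up for the example.

```python
import random
import statistics


def compare_arms(n_per_arm, sugar_effect=1.0, active_effect=0.0, seed=1):
    """Compare the potion against water and against a sugar-only placebo.

    Hypothetical setup: strength = baseline + sugar boost + active boost.
    By default the active ingredient contributes nothing at all.
    """
    rng = random.Random(seed)

    def strength(has_sugar, has_active):
        score = 50.0 + rng.gauss(0, 2)  # baseline plus person-to-person noise
        if has_sugar:
            score += sugar_effect
        if has_active:
            score += active_effect
        return score

    water = [strength(False, False) for _ in range(n_per_arm)]
    placebo = [strength(True, False) for _ in range(n_per_arm)]  # sugar only
    potion = [strength(True, True) for _ in range(n_per_arm)]    # sugar + active

    return {
        "potion_vs_water": statistics.mean(potion) - statistics.mean(water),
        "potion_vs_placebo": statistics.mean(potion) - statistics.mean(placebo),
    }


if __name__ == "__main__":
    for name, diff in compare_arms(5_000).items():
        print(f"{name}: {diff:+.2f}")
```

Against water, the potion looks like it works (a gap of about one point). Against the sugary placebo, the gap should shrink to roughly zero, because the only remaining difference between the groups is the active ingredient, and in this invented world it does nothing.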

But wait! Even if we isolate this specific drink as the magic potion that makes my clone stronger than me, don’t rush off to buy stock in this strength potion company!

All you’d know from my bouts with my clone is this potion works for me (it might not work for you) and that it works in the context of arm-wrestling (it might not help me with my most hated exercise: pushups). Don’t extrapolate; you have no evidence it works for anyone or anything else.

To extend the findings to the general population, you’d have to take large random samples that are representative of humankind and run the same experiment. (Not, I repeat NOT, a group of white males who all live near one another, the misguided way plenty of medicine has been "tested". If you want to claim something works for everyone, you need to represent everyone in your sample.)
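Here is a small sketch of what goes wrong with an unrepresentative sample. It assumes an entirely hypothetical population in which the potion adds 3 strength points for one person in five and nothing for everyone else. The helper names (`make_population`, `estimated_effect`) and all the numbers are inventions for the example.

```python
import random
import statistics


def make_population(size, responder_share=0.2, seed=2):
    """Hypothetical population: only 'responders' get stronger from the potion."""
    rng = random.Random(seed)
    return [
        {
            "baseline": rng.gauss(50, 2),
            "potion_gain": 3.0 if rng.random() < responder_share else 0.0,
        }
        for _ in range(size)
    ]


def estimated_effect(sample, seed=3):
    """Randomly split a sample into potion and control; return the mean difference."""
    rng = random.Random(seed)
    people = list(sample)
    rng.shuffle(people)  # random assignment to the two groups
    half = len(people) // 2
    treated, control = people[:half], people[half:]
    treated_scores = [p["baseline"] + p["potion_gain"] + rng.gauss(0, 1) for p in treated]
    control_scores = [p["baseline"] + rng.gauss(0, 1) for p in control]
    return statistics.mean(treated_scores) - statistics.mean(control_scores)


population = make_population(30_000)

# Convenience sample: everyone comes from the one subgroup that responds.
responders_only = [p for p in population if p["potion_gain"] > 0][:4_000]
# Representative sample: drawn at random from the whole population.
representative = random.Random(4).sample(population, 4_000)

narrow_estimate = estimated_effect(responders_only)
broad_estimate = estimated_effect(representative)

if __name__ == "__main__":
    print(f"responders only: {narrow_estimate:+.2f}")
    print(f"representative:  {broad_estimate:+.2f}")
```

Both experiments are properly randomized, and both answers are honest for the people who were sampled. The narrow study should report a gain near 3 points, while the representative one should land near 0.6 (3 points times one in five). Only the second tells you what to expect for the population at large.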

Can’t make physical clones for your random participants to battle? Then you’d need even larger sample sizes to model the variability that has nothing to do with the potion so you can isolate the potion’s special effect.
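How much larger? Here is a back-of-the-envelope sketch using the textbook standard error formulas for a paired design (clones) and for two independent groups (strangers). It assumes the potion's effect is additive and that person-to-person variation and round-to-round noise are independent. The function names and the example standard deviations are hypothetical.

```python
import math


def se_with_clones(n_pairs, noise_sd):
    """Standard error of the mean within-pair difference (potion clone minus original).

    The shared baseline cancels inside each pair, so only the two
    independent noise terms (one per contestant) remain.
    """
    return math.sqrt(2 * noise_sd**2 / n_pairs)


def se_with_strangers(n_per_group, person_sd, noise_sd):
    """Standard error of the difference between two independent group means.

    Without clones, person-to-person variation stays in the comparison.
    """
    return math.sqrt(2 * (person_sd**2 + noise_sd**2) / n_per_group)


def group_size_to_match_clones(n_pairs, person_sd, noise_sd):
    """People needed per group to match the precision of n_pairs clone pairs."""
    return math.ceil(n_pairs * (person_sd**2 + noise_sd**2) / noise_sd**2)


# Made-up numbers: people differ a lot (sd 10), single rounds are fairly stable (sd 1).
needed = group_size_to_match_clones(n_pairs=30, person_sd=10, noise_sd=1)

if __name__ == "__main__":
    print("people per group to match 30 clone pairs:", needed)
```

With these made-up numbers, 30 clone pairs give the same precision as 3,030 strangers per group. The more people differ from one another relative to the noise in a single round, the more expensive the lack of clones becomes.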

Want to extend your findings beyond arm wrestling and claim you have a “strength potion”? Then you’d have to define the criteria for a general strength potion. Enhanced performance on all physical tasks typically performed by humans? Okay, then you’d need to observe humans to get a list of the tasks they typically perform, randomly select a large set of these, and run the experiment with those large sample sizes again for each task. Or you’d keep your previous groups and make each person perform multiple days of potion/not-potion trials with different tasks.

I won’t even get into problems of getting the dose right (more trials!) and figuring out how long the magic potion lasts (is it really out of your system in a day?).

Science is beautiful, hard, and frustrating. Especially when it involves learning about humans.

And so you begin to see why science is hard. We started with a very simple question (does this potion actually make people stronger?) and ended up with an experimental design bearing a price tag that could feed a small country. Understandably, society would likely stop short of funding a bulletproof conclusion about this strength potion. We scientists would have to ask our esteemed colleagues to assume the factors we failed to control don’t matter, but that’s a topic for a whole other blog post (which you can find below).

Why do we trust scientists?

Now’s a good time to rethink our assumptions about fact and fiction

kozyrkov.medium.com

If I were me (or my identical clone), I’d set my expectations accordingly. Be mindful of every study’s caveats and remind yourself not to jump to conclusions beyond what was tested. And if you’re too lazy to read the details of what was actually tested, don’t take the summary version you read too seriously.

Summary: Causal inference in a nutshell

Scientific experiments are all about making believable clones — two identical items (or, more typically, two statistically identical groups) — and then applying two different treatments to them (like potion versus no potion). Or, if we want to get fancy and study multiple factors at once, we’ll use more than two groups and treatments.

Then if we see a difference in performance, we can blame it on the treatment. And that, in a nutshell, is how empirical (evidence-based) science works. It’s all about clones and it relies on being able to claim that we have isolated the *only* difference between them. To get anywhere near a claim like that takes finesse and the ultimate attention to detail.

Footnote

*Sure, there’s a whole area of statistics that tries to eke some causal conclusions out of situations where a randomized experiment wasn’t possible, but its methods are all enough of a reach that not everybody will be convinced by them the way they’d be convinced by a proper experiment. No matter how fancy the math, dissenters will always have something to take issue with, so it takes a thick skin to play with advanced causal inference methods.
