Granger Causality: Principle of Cause and Effect Explained

 2 years ago
source link: https://hackernoon.com/granger-causality-and-the-principle-of-cause-and-effect-explained

by @nikolao

In a world full of data, we can understand impact with clever methods. Meet Granger causality: cause precedes effect. The method assumes stationarity of the data, i.e. data free from time-related biases. The main difference from a standard experiment is the level of certainty that the observed effect is real. Natural experiments and Granger causality are alternatives to standard experiments and could be classified as quasi-experimental approaches; Granger causality applies to time-series data. For instance, we could ask whether a rise in interest in heat waves precedes increased interest in climate change.

Nikola O.

Combines ideas from data science, humanities and social sciences. Enjoys thinking, science fiction and design.

Without a doubt, impact analysis is an essential field of statistics. Many people associate it with a lot of work: designing an experiment, collecting the data, etc. However, in a world full of data, we can understand impact with clever methods. Meet Granger causality.

Granger Causality

Clive Granger, the author of the idea, writes:

“The definition [of causality] is based entirely on the predictability of some series, say Xt. If some other series Yt contains information in past terms that helps in the prediction of Xt and if this information is contained in no other series used in the predictor, then Yt is said to cause Xt. The flow of time clearly plays a central role in these definitions. In the author’s opinion, there is little use in the practice of attempting to discuss causality without introducing time, although philosophers have tried to do so.”

This translates to “cause precedes effect”.

I like the snarky comment at the end. For one, isn’t it interesting how scientists used to write papers? But the main reason is that it points to the fact that there are other opinions. If you ask me, I would say that Granger's view is just common sense. Of course, cause precedes effect. I can’t first break a window and then throw the ball that breaks it, right?

There are many ways to think about causality. For example, Bertrand Russell, a well-known philosopher, was sceptical about the importance of causality and compared it to “a relic of a bygone age”. The idea is that, once we have fundamental theories of how the world works (e.g. physics), the concept of causation is not very useful. Physical theories use equations that are symmetrical: you can solve them for any variable, so neither side is privileged. But causal relationships are asymmetrical.

I took this example from the Internet Encyclopedia of Philosophy. If you haven’t scratched your head in a while, you might enjoy it.

The fundamental truth about causality is less relevant when it comes to practice. We still call the method Granger causality, though, because it is one of many interpretations of causality.

Granger Causality vs Natural Experiments

Both Granger causality and natural experiments are approaches to causal analysis that do not require performing a standard experiment. Traditional experiments start with design, followed by data collection. Only then do scientists proceed to analyse the data. I wrote a story about natural experiments and how they relate to traditional experiments; it’s a nuanced topic. Long story short, the main difference is the level of certainty.

Performing an experiment in a controlled environment gives us more certainty that the observed effect is real. When we design an experiment, we think about controlling as many factors (i.e. other possible causes) as possible. We only change one factor (i.e. the cause we wish to study) at a time. At least that’s the theory; it’s not easy to do this in social sciences.

Natural experiments and Granger causality are alternatives and could be classified as quasi-experimental approaches. The distinction is that natural experiments require a bit of luck, whereas Granger causality can be tested on any stationary time series (I will explain what that is in the next section).

Granger Causality in Practice

The primary application of Granger’s methods was econometrics, but their use goes beyond that. Any field that records information over time can make use of Granger causality.

Your data needs to be collected at regular intervals, e.g. daily, weekly, monthly, etc. The method assumes stationarity of the data. You can think of stationarity as data free from time-related biases such as trends or seasonality. These are often present in practice. However, there are tools to handle them, and the simplest one is differencing.

A differenced time series consists of the differences between consecutive time points. In other words, instead of using the actual values such as [5, 9, 6, 4], you would use their differences [4, -3, -2]. Sometimes, you need to difference twice to achieve stationarity.
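
As a minimal sketch of differencing in plain Python (the helper name `difference` and its `order` parameter are made up for illustration, not taken from the original analysis):

```python
def difference(values, order=1):
    """Return the series differenced `order` times.

    Each pass replaces the series with the differences between
    consecutive points, so the result is one element shorter per pass.
    """
    result = list(values)
    for _ in range(order):
        result = [later - earlier for earlier, later in zip(result, result[1:])]
    return result


series = [5, 9, 6, 4]
print(difference(series))           # [4, -3, -2]
print(difference(series, order=2))  # [-7, 1]
```

Note that every round of differencing shortens the series by one point, which matters when the series is short.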

Let’s say you want to know whether people become more interested in climate change after a heat wave. We can download data from Google Trends.

Ok, so how do you go about this?

By looking at the graph, we could think that the increase in the relative search volumes for heat waves comes before the rise in interest in climate change. But to do this properly, we first need to make sure that the data is stationary. For instance, we can see that interest in heat waves increases in summer. This is seasonality, which could bias our analysis, so that’s one of the corrections we need to make. We test for stationarity with a statistical test, e.g. the Augmented Dickey-Fuller test.

Next, feed this data into a multivariate regression model called a Vector Autoregressive (VAR) model. Autoregression means that past values of one variable, e.g. interest in climate change, predict the current value of the same variable. In causal analysis, we include past values of another variable together with the autoregressive predictors, e.g. past interest in climate change + past interest in heat waves.
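
To make the two sets of predictors concrete, here is a small sketch in plain Python that lines up each current value with its own past values and the past values of the other series. The function name `lagged_rows` and the numbers are made up for illustration; they are not the Google Trends data.

```python
def lagged_rows(target, other, lags):
    """Pair each target[t] with its predictors.

    Returns a list of (y, own_lags, other_lags) tuples, where own_lags is
    [target[t-1], ..., target[t-lags]] and other_lags is the same for `other`.
    """
    rows = []
    for t in range(lags, len(target)):
        own = [target[t - k] for k in range(1, lags + 1)]
        cross = [other[t - k] for k in range(1, lags + 1)]
        rows.append((target[t], own, cross))
    return rows


# Made-up illustrative values, not real search volumes.
climate = [3, 4, 6, 5, 7]
heat = [1, 2, 1, 3, 2]

for y, own, cross in lagged_rows(climate, heat, lags=2):
    print(y, own, cross)
# 6 [4, 3] [2, 1]
# 5 [6, 4] [1, 2]
# 7 [5, 6] [3, 1]
```

A purely autoregressive model would regress `y` on `own` only; the model used for the causal question regresses `y` on `own` and `cross` together.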

Finally, we test whether the simple autoregressive model (without past interest in heat waves) performs “significantly” worse than the model that includes it. As you can never be sure which variable is Granger-causing which, it is common to complete the procedure both ways: heat waves (H) → climate change (C), and climate change (C) → heat waves (H).
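
The comparison can be sketched with an F-statistic that contrasts the residual sum of squares of the two regressions. The following is a rough, standard-library-only illustration on synthetic data, not the R code from the original analysis. The names `solve`, `rss` and `granger_f` are hypothetical helpers, and the data is simulated so that "heat" leads "climate" by one step.

```python
import random


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def rss(X, y):
    """Residual sum of squares of an ordinary least squares fit of y on X."""
    k = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    beta = solve(xtx, xty)
    return sum((yi - sum(b * v for b, v in zip(beta, r))) ** 2
               for r, yi in zip(X, y))


def granger_f(target, other, lags):
    """F-statistic for 'other Granger-causes target' with the given lags."""
    y, restricted, unrestricted = [], [], []
    for t in range(lags, len(target)):
        own = [target[t - k] for k in range(1, lags + 1)]
        cross = [other[t - k] for k in range(1, lags + 1)]
        y.append(target[t])
        restricted.append([1.0] + own)            # intercept + own past
        unrestricted.append([1.0] + own + cross)  # ... + past of the other series
    rss_r, rss_u = rss(restricted, y), rss(unrestricted, y)
    dof = len(y) - (2 * lags + 1)  # observations minus unrestricted parameters
    return ((rss_r - rss_u) / lags) / (rss_u / dof)


# Synthetic, already-stationary data: climate follows heat with a one-step delay.
random.seed(0)
heat = [random.gauss(0, 1) for _ in range(200)]
climate = [0.0] + [0.8 * heat[t - 1] + random.gauss(0, 0.5) for t in range(1, 200)]

f_hc = granger_f(climate, heat, lags=1)  # H -> C
f_ch = granger_f(heat, climate, lags=1)  # C -> H
print(f"F(H -> C) = {f_hc:.2f}")
print(f"F(C -> H) = {f_ch:.2f}")
```

A large F-statistic means that dropping the other series' past values makes the predictions noticeably worse. To turn it into a verdict, you would compare it with a critical value of the F distribution with `lags` and `dof` degrees of freedom, which statistical packages do for you.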

Here is a visual representation of the process for one direction:

Based on the results, the answer for UK data is no. Working with time series is tricky, and you can easily fool yourself just by looking at a graph. That is why a formal test like Granger causality is handy.

If you want to try out the analysis for yourself, here is the R code I used to get the answer.
