Frisch-Waugh-Lovell Theorem: Animated
source link: https://ryxcommar.com/2020/12/26/frisch-waugh-lovell-theorem-animated/
Update: The code for these animations is available here.
The Frisch-Waugh-Lovell theorem states that within a multivariate regression of y on x1 and x2, the coefficient for x2, which is β2, will be the exact same as if you had instead run a regression on the residuals of y and x2 after regressing each one on x1 separately.
The point of this post is not to explain the FWL theorem in linear algebraic detail, or explain why it’s useful (basically, it’s a fundamental intuition about what multivariate regression does and what it means to “partial” out the effects of two regressors). If you want to learn more about that, there’s some great stuff already on Google.
The point of this post is to simply provide an animation of this theorem. I find that the explanations of this theorem are often couched in lots of linear algebra, and it may be hard for some people to understand what’s going on exactly. I hope this animation can help with that.
Our Data
```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

np.random.seed(42069)

df = pd.DataFrame({'x1': np.random.uniform(0, 10, size=50)})
df['x2'] = 4.9 + df['x1'] * 0.983 + 2.104 * np.random.normal(0, 1.35, size=50)
df['y'] = 8.643 - 2.34 * df['x1'] + 3.35 * df['x2'] + np.random.normal(0, 1.65, size=50)
df['const'] = 1

model = sm.OLS(
    endog=df['y'],
    exog=df[['const', 'x1', 'x2']]
).fit()

model.summary()
```
The output of the above:
```
                            OLS Regression Results
==============================================================================
Dep. Variable:                      y   R-squared:                       0.977
Model:                            OLS   Adj. R-squared:                  0.976
Method:                 Least Squares   F-statistic:                     997.5
Date:                Sat, 26 Dec 2020   Prob (F-statistic):           3.22e-39
Time:                        17:11:39   Log-Likelihood:                -95.281
No. Observations:                  50   AIC:                             196.6
Df Residuals:                      47   BIC:                             202.3
Df Model:                           2
Covariance Type:            nonrobust
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
const          9.4673      0.546     17.337      0.000       8.369      10.566
x1            -2.2003      0.128    -17.213      0.000      -2.458      -1.943
x2             3.1931      0.081     39.647      0.000       3.031       3.355
==============================================================================
Omnibus:                        0.120   Durbin-Watson:                   1.914
Prob(Omnibus):                  0.942   Jarque-Bera (JB):                0.279
Skew:                          -0.095   Prob(JB):                        0.870
Kurtosis:                       2.687   Cond. No.                         27.3
==============================================================================
```
The Animation
Here is what would happen if we actually ran a univariate regression on the residuals after factoring out x1.
(The animation takes a few seconds, so you might need to wait for it to restart to get the full effect.)
![fwl_x2.gif?w=569](https://ryxcommarhome.files.wordpress.com/2020/12/fwl_x2.gif?w=569)
Notice that the slope in the final block ends up equaling 3.1931, which is the coefficient for x2 in the multivariate regression.
Getting the coefficient for x1 is more interesting; one thing that happens in the multivariate regression is that the coefficient on x1 is negative despite the fact that x1 is positively correlated with y. What gives? Well, the following animation helps to show where that comes from:
![fwl_x1.gif?w=569](https://ryxcommarhome.files.wordpress.com/2020/12/fwl_x1.gif?w=569)
You can mostly see here what's happening: after we take out the effect of x2 on y, what we're left over with is a negative relationship between x1 and y. Put another way: there is a negative correlation between x1 and the residuals from the regression of y on x2.