[Correlation Analysis] Theory and Implementation

November 21, 2017

Author: Guofei

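Category: 4-1 Statistical Models; article number: 407

Copyright notice: this article was written by Guofei (郭飞). You may repost it freely, provided you cite the original link and notify the author.
Original link: https://www.guofei.site/2017/11/21/corr.html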

Tools

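| X \ Y | Categorical | Continuous |
|---|---|---|
| Categorical | Cross-tabulation (contingency table) | t-test, ANOVA |
| Continuous | t-test, ANOVA | Correlation analysis |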

See also: statistical inference

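| Coefficient | Definition | Applies to | H0 | Test statistic | Code |
|---|---|---|---|---|---|
| Pearson | $r=\dfrac{\mathrm{cov}(x,y)}{\sqrt{D_x D_y}}$ | Paired continuous data; roughly normal, unimodal distribution | $r=0$ | $t=r\sqrt{\dfrac{n-2}{1-r^2}}\sim t(n-2)$ | `r, p_value = stats.pearsonr(x, y)` |
| Spearman | Pearson's r computed on the ranks; without ties this is equivalent to $r=1-\dfrac{6\sum d_i^2}{n(n^2-1)}$ with $d_i=R_i-Q_i$ | Paired ordinal (ranked) data; any distribution | $r=0$ | Small samples: Spearman distribution with parameter $n-2$; large samples: $t=r\sqrt{\dfrac{n-2}{1-r^2}}\sim t(n-2)$ | `stats.spearmanr` |
| Kendall | $\tau_a=\dfrac{2(C-D)}{n(n-1)}$; $\tau_b=\dfrac{P-Q}{\sqrt{(P+Q+T)(P+Q+U)}}$ | | | Small samples: Kendall distribution; large samples: $U=3\tau\sqrt{\dfrac{n(n-1)}{2(2n+5)}}$, approximately $N(0,1)$ | `stats.kendalltau`, `stats.weightedtau` |

Here $D_x$ and $D_y$ denote the variances of x and y.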
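As a sanity check on the test statistics above, the following sketch rebuilds the Pearson t statistic by hand and confirms that Spearman's coefficient is Pearson's r on the ranks. The data, the seed, and all variable names are illustrative assumptions, not part of the original article.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)  # fixed seed, chosen only for reproducibility
n = 30
x = rng.normal(size=n)
y = 0.5 * x + rng.normal(size=n)  # made-up data with a linear component

# Pearson: rebuild the t statistic and its two-sided p-value by hand
r, p_scipy = stats.pearsonr(x, y)
t_stat = r * np.sqrt((n - 2) / (1 - r**2))
p_manual = 2 * stats.t.sf(abs(t_stat), df=n - 2)

# Spearman: Pearson correlation computed on the ranks
rho, _ = stats.spearmanr(x, y)
rx, ry = stats.rankdata(x), stats.rankdata(y)
rho_from_ranks, _ = stats.pearsonr(rx, ry)

# Without ties, the d_i shortcut formula gives the same value
d = rx - ry
rho_shortcut = 1 - 6 * np.sum(d**2) / (n * (n**2 - 1))

print(p_scipy, p_manual)
print(rho, rho_from_ranks, rho_shortcut)
```

The two p-values should agree up to floating-point error, and the three Spearman values should coincide because continuous random data contain no ties.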

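Range: [-1, 1]. A value of -1 means perfect negative correlation, 1 means perfect positive correlation, and 0 means no (linear) correlation.
The "3-5-8 rule" for Pearson's r:
- $|r| \ge 0.8$: the two variables are highly correlated
- $0.5 \le |r| < 0.8$: the two variables are moderately correlated
- $0.3 \le |r| < 0.5$: the two variables are weakly correlated
- $|r| < 0.3$: the two variables are almost uncorrelated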
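The thresholds above can be wrapped in a small helper. This is only a sketch: the function name `correlation_strength` and the returned labels are hypothetical, and assigning each boundary value (0.3, 0.5, 0.8) to the stronger category is an assumption.

```python
def correlation_strength(r):
    """Label |r| using the 0.3 / 0.5 / 0.8 thresholds (boundaries go to the stronger class)."""
    a = abs(r)
    if a >= 0.8:
        return "high"
    if a >= 0.5:
        return "moderate"
    if a >= 0.3:
        return "low"
    return "negligible"

for value in (-0.85, 0.6, 0.35, 0.1):
    print(value, correlation_strength(value))
```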
Kendall
`stats.kendalltau` computes the 1945 "tau-b" version of Kendall's tau, $\tau_b=(P-Q)/\sqrt{(P+Q+T)(P+Q+U)}$, where P is the number of concordant pairs, Q the number of discordant pairs, T the number of ties only in x, and U the number of ties only in y. If a tie occurs for the same pair in both x and y, it is not added to either T or U. In the absence of ties, tau-b reduces to the 1938 "tau-a" version.
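To make the roles of P, Q, T, and U concrete, here is a brute-force sketch that counts the pairs directly and compares the result with `stats.kendalltau`. The function `kendall_tau_b` and the sample data are hypothetical; the O(n²) double loop is for illustration only and is not how SciPy computes the statistic.

```python
import numpy as np
from scipy import stats

def kendall_tau_b(x, y):
    """Brute-force O(n^2) tau-b following the P, Q, T, U definition."""
    P = Q = T = U = 0
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            dx = x[i] - x[j]
            dy = y[i] - y[j]
            if dx == 0 and dy == 0:
                continue      # tied in both x and y: counted in neither T nor U
            elif dx == 0:
                T += 1        # tie only in x
            elif dy == 0:
                U += 1        # tie only in y
            elif dx * dy > 0:
                P += 1        # concordant pair
            else:
                Q += 1        # discordant pair
    return (P - Q) / np.sqrt((P + Q + T) * (P + Q + U))

# Made-up data containing ties in x and in y
x = [1, 2, 2, 3, 4, 5, 5, 6]
y = [2, 1, 3, 3, 5, 4, 6, 6]

tau_manual = kendall_tau_b(x, y)
tau_scipy, _ = stats.kendalltau(x, y)
print(tau_manual, tau_scipy)
```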

Code examples

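```python
import numpy as np
from scipy import stats

n = 10
x = np.random.rand(n)
y = np.random.rand(n)

# Pearson
r, p_value = stats.pearsonr([1, 2, 3, 4, 5], [5, 6, 7, 8, 7])

# Spearman
rho, p_value = stats.spearmanr(x, y)

# For an n×m array, spearmanr returns the m×m correlation matrix (here m = 3)
data = np.random.rand(n, 3)
rho_matrix, p_matrix = stats.spearmanr(data)

# Kendall (uses the tau-b algorithm)
tau, p_value = stats.kendalltau(x, y)

# Related functions
slope, intercept, low_slope, high_slope = stats.theilslopes(y, x)  # Theil-Sen robust slope estimate
weighted_tau, _ = stats.weightedtau(x, y)  # weighted version of Kendall's tau
```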

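Contingency table analysis

As an example, take two discrete variables, each with two categories.

H0: X and Y are independent
H1: X and Y are not independent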

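Step 1: obtain the source data (observed counts)

|   | 0 | 1 |
|---|---|---|
| 0 | n11 | n12 |
| 1 | n21 | n22 |

Step 2: compute the marginal proportions, where n = n11 + n12 + n21 + n22

|   | 0 | 1 | Row margin |
|---|---|---|---|
| 0 | n11 | n12 | a1 = (n11+n12)/n |
| 1 | n21 | n22 | a2 = (n21+n22)/n |
| Column margin | b1 = (n11+n21)/n | b2 = (n12+n22)/n | |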

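Step 3: compute the expected probabilities (assuming independence)

|   | 0 | 1 | Row margin |
|---|---|---|---|
| 0 | a1×b1 | a1×b2 | a1 |
| 1 | a2×b1 | a2×b2 | a2 |
| Column margin | b1 | b2 | |

Step 4: compute the expected frequencies

|   | 0 | 1 |
|---|---|---|
| 0 | a1×b1×n | a1×b2×n |
| 1 | a2×b1×n | a2×b2×n |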

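Step 5: for each cell, take the difference between the observed frequency O and the expected frequency E, square it, and divide by E. Under H0 the sum $\chi^2=\sum (O-E)^2/E$ approximately follows a chi-square distribution with (rows-1)×(columns-1) degrees of freedom, which is 1 for a 2×2 table.

Step 6: run the chi-square test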
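The six steps can be sketched with NumPy and cross-checked against `stats.chi2_contingency`. The counts in `observed` are made up for illustration, and `correction=False` is passed so that SciPy skips Yates' continuity correction and matches the hand calculation.

```python
import numpy as np
from scipy import stats

# Step 1: hypothetical 2x2 observed counts (n11, n12 / n21, n22)
observed = np.array([[30, 10],
                     [20, 40]])
n = observed.sum()

# Step 2: marginal proportions
a = observed.sum(axis=1) / n   # row margins a1, a2
b = observed.sum(axis=0) / n   # column margins b1, b2

# Steps 3 and 4: expected probabilities under independence, then expected counts
expected = np.outer(a, b) * n

# Step 5: Pearson chi-square statistic, sum of (O - E)^2 / E
chi2 = np.sum((observed - expected) ** 2 / expected)

# Step 6: p-value with (rows - 1) * (columns - 1) degrees of freedom
dof = (observed.shape[0] - 1) * (observed.shape[1] - 1)
p_value = stats.chi2.sf(chi2, dof)

# Cross-check with SciPy (no continuity correction)
chi2_sp, p_sp, dof_sp, expected_sp = stats.chi2_contingency(observed, correction=False)
print(chi2, p_value, dof)
print(chi2_sp, p_sp, dof_sp)
```

A small p-value leads to rejecting H0, that is, to concluding that X and Y are not independent.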
