GDA | 随机预测（Stochastic Prediction）

发表于

2020-12-25 更新于 2021-01-26 分类于 GIS

模拟一个信号 S(t―N,w―)，其中，t―N 是一个规律性分布的点的集合，点集大小为 N。 (1)t―N=[01...N−1]

# sample size
N = 1000
# observation epochs (we assume to sample the signal on a regular basis
t = np.arange(0, N, 1).reshape((N, 1)) # a column vector

假设该信号平稳（stationary），则该信号的协方差函数为（指数模型）： (2)CS(|t−t′|)=A0e(−α0|t−t′|)

# signal covariance model
# variance of the random signal
A0 = 9
# decay speed of the covariance model
a0 = 1/100
# exponential model
cf = A0 * np.exp(-a0*t) # a column vector

对应的协方差矩阵为: (3)CSN×N=[CS(0)CS(1)CS(2)...CS(N−1)CS(1)CS(0)CS(1)...CS(N−2)...CS(N−1)CS(N−2)CS(N−3)...CS(0)]=[A0A0e−α0×1A0e−α0×2...A0e−α0×(N−1))A0e−α0×1A0A0e−α0×1...A0e−α0×(N−2))...A0e−α0×(N−1))A0e−α0×(N−2))A0e−α0×(N−2))...A0]

由于数据规律性分布（△=1），因此，其协方差矩阵是 T 型矩阵。

T型矩阵（Toeplitz matrix）：主对角线上的元素相等，平行于主对角线的线上的元素也相等；矩阵中的各元素关于次对角线对称，即T型矩阵为次对称矩阵。

from scipy.linalg import toeplitz
# covariance matrix of the given N-dimensional sample variable 
C = toeplitz(cf)

将协方差矩阵进行 Cholesky 分解：CS=L×L+

Cholesky 分解：把一个对称正定的矩阵表示成一个下三角矩阵 L 和其转置的乘积的分解。它要求矩阵的所有特征值必须大于零，故分解的下三角的对角元也是大于零的。

# Cholesky decomposition C=LxL' (L=lower triangular matrix)
L = np.linalg.cholesky(C)

假设 X―N 是一个满足正态分布的随机变量，具有以下性质： (4)E{X―N}=0CX―X―=I

令随机变量 S(t―N,w) 为 X―N 的一个线性变换： (5)S(t―N,w)=L×X―N 则其具有以下性质：满足上面的分解(6)E{S(t―N,w)}=L×E{X―N}=0―CSS=L×CX―X―×L+=L×L+(满足上面的 Cholesky 分解)

故而模拟的信号可以由以下等式给出： (7)S(t―N)=L×x―N0

其中，x―N0 是 X―N 的一个样本

# signal sampling simulation 
y = L.dot(np.random.randn(N, 1))

增加随机白噪声（random white-noise）v―，其方差为 σv2。

# noise standard deviation 40% of signal
sigma_v = 0.4 * np.sqrt(cf[0,0])
# white noise generation (normal white noise)
v = sigma_v * np.random.randn(N,1)

该白噪声与信号不相关（Uncorrelated）。

因此，最终模拟的数据为： (8)y―0=S(t―N)+v―

# observations (sampled signal + white noise
y0 = y + v

绘图查看：

估测经验协方差函数（ESTIMATION OF THE ECF）

# observation mean removal
y0_m = y0 - np.mean(y0)
# max length for ecf evaluation
N2 = int(N/2)
# ecf: xcorr(y0_m, N2)
ecf = np.correlate(y0_m.T[0], y0_m.T[0], mode="full")[len(y0_m)-1-N2:len(y0_m)+N2]

ecf = ecf[N2:2*N2+1].reshape((N2+1, 1))
# biased, definite postive
ecf = ecf / N

注意：每次运行得到的协方差系数是不一样的，而我们用于计算的模型基于观测值（样本）和噪音。因此，不同的观测值（样本）得到的模型不同。

插值法（INTERPOLATION OF THE ECF）估测协方差函数 CF^

这里，我们使用指数模型： (9)CF=A0e(−α0(|t−t′|))

使用最小二乘法（LS）估测参数 A0^ 和 α0^。

1. 确定参数的近似值 A~ 和 α~

利用第 2 和第 3 个协方差系数来估算 A~，利用相关长度（correlation length）估算 α~(10)A~=2×CF(1)−CF(2)σ0^2=CF(0)−A~A~2=A~∗e(−α~(|τ¯|))ln12=−α~(|τ¯|)−ln2=−α~(|τ¯|)α~=ln2|τ¯|

# approximated values of the parameters for linearization
## intersection with t=0 of the straight line
## exactly interpolating the covariance
## coefficients at step 1 and 2
Ast01 = 2*ecf[1,0] - ecf[2,0]
Ast0 = Ast01
# noise variance estimate
sigma2st0_v = ecf[0,0] - Ast0

ecf_05 = Ast0/2
i_lcorr = np.argwhere(ecf-ecf_05<=0).min(0)[0]
t_lcorr = t[i_lcorr-1,0] # correlation length
# assumption: A/ecf(lcorr) = 2
ast01 = np.log(2)/t_lcorr

# approximated cf model
ecm01 = Ast0 * np.exp(-ast01*t)

2. 迭代估测

这里，我们仅迭代五次：

i_up = np.argwhere(ecf[1:]-ecf[:-1]>=0).min(0)[0]
t_up = t[i_up,0]
#print(i_up, t[1:i_up+1])
# point selection for LS interpolation
t0 = t[1:i_up+1] # confusion here!!!
# ecf values to be interpolated
ecf0 = ecf[1:i_up+1]

# first iteration
Ast = Ast01
ast = ast01

# cf
ecm = Ast * np.exp(-ast*t)
plt.plot(t[:M+1], ecm[:M+1], 'r', label='ecm')
for i in range(5):
    # design matrix (Jacobian)
    A = np.hstack([np.exp(-ast*t0),-Ast*t0*np.exp(-ast*t0)])
    # known terms vector
    a = np.asmatrix(Ast * np.exp(-ast*t0))
    #print(A, A.shape, a.shape, ecf0.shape, t0.shape)
    # estimated parameters
    xst = inv(A.T.dot(A)).dot(A.T).dot(ecf0-a)
    #print(xst, xst.shape)
    # i-th iteration
    Ast += xst[0,0]
    ast += xst[1,0]
    #print(Ast, ast, M)
    # cf
    ecm = Ast * np.exp(-ast*t)
    #print(t.shape, ecm.shape)
    plt.plot(t[:M], ecm[:M], 'r')

sigma2st_v = ecf[0,0] - Ast

使用观测值和协方差函数 CF^ 预测指定点集的信号值

观测值：y―0
协方差函数： (11)CF^=A0^e(−α0^(|t−t′|))
预测： (12)y^=Cyy×(Cyy+k×(σ0^2×IN×N))−1×y0

# number of observations to filter
n = 100

# Covariance matrix of the sampled signal
Cyy = toeplitz(ecm[:n+1])

# Covariance matrix of the sampled signal + noise
Cy0y0 = Cyy + np.eye(n+1)*sigma2st_v

# filtered signal
yst = Cyy.dot(inv(Cy0y0)).dot(y0_m[:n+1])
est = y[:n+1] - yst

print(""" 
estimation error:
   mean\t\t= {}
   std\t\t= {}
""".format(np.mean(est), np.std(est)))

# Covariance matrix of the sampled signal
Cyy = toeplitz(ecm[:n+1])
# Covariance matrix of the sampled signal + noise
Cy0y0 = Cyy + 9*np.eye(n+1)*sigma2st_v

# filtered signal
yst = Cyy.dot(inv(Cy0y0)).dot(y0_m[:n+1])

est = y[:n+1] - yst

print("""
estimation error:
   mean\t\t= {}
   std\t\t= {}
""".format(np.mean(est), np.std(est)))

可以看出，k 值越大，noise 的方差越大，估测信号越平滑。

如果我们修改相关长度（correlation length）的大小：

ecmErr = Ast * np.exp(-0.1*ast*t)
Cyy = toeplitz(ecmErr[:n+1])

# Covariance matrix of the sampled signal + noise
Cy0y0 = Cyy + 9*np.eye(n+1)*sigma2st_v # smoother

# filtered signal
yst = Cyy.dot(inv(Cy0y0)).dot(y0_m[:n+1])

est = y[:n+1] - yst

print("""
estimation error:
   mean\t\t= {}
   std\t\t= {}
""".format(np.mean(est), np.std(est)))

可以看出，相关长度越大，估测信号越平滑；相关长度越小，用以估测信号的观测值越少，估测信号越接近观测值，过滤能力越弱。

GDA | 随机预测（Stochastic Prediction）

GDA | 随机预测（Stochastic Prediction）

估测经验协方差函数（ESTIMATION OF THE ECF）

插值法（INTERPOLATION OF THE ECF）估测协方差函数 CF^

1. 确定参数的近似值 A~ 和 α~

2. 迭代估测

使用观测值和协方差函数 CF^ 预测指定点集的信号值

Recommend

SourceTree 中链接注释中的 issue id 到Kelude

深入理解 Go | panic 和 recover

「论文分享」学习连续控制问题的可解释的高性能策略

Minikube

Token

Zip

godep

Golang中通讯

What Is the Fediverse?

curl

About Joyk