source link: https://www.cnblogs.com/BlairGrowing/p/16196841.html

Paper title: Multi-Scale Contrastive Siamese Networks for Self-Supervised Graph Representation Learning
Authors: Ming Jin, Yizhen Zheng, Yuan-Fang Li, Chen Gong, Chuan Zhou, Shirui Pan
Venue: IJCAI 2021
Paper: download
Code: download

1 Introduction

  Main novelty: combining cross-view contrastive learning with cross-network contrastive learning.

2 Method

  The overall framework is illustrated in Figure 1.

  The model consists of three components:

    • Graph augmentations
    • Cross-network contrastive learning
    • Cross-view contrastive learning

2.1 Graph Augmentations

  • Graph Diffusion (GD)

    $\mathbf{S}=\sum_{k=0}^{\infty} \theta_{k} \mathbf{T}^{k} \in \mathbb{R}^{N \times N}\quad\quad(1)$

  This paper adopts the PPR kernel:

    $\mathbf{S}=\alpha\left(\mathbf{I}-(1-\alpha) \mathbf{D}^{-1 / 2} \mathbf{A} \mathbf{D}^{-1 / 2}\right)^{-1}\quad\quad(2)$
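As a concrete illustration, the closed-form PPR diffusion of Eq. (2) can be computed directly for a small dense graph. This is a minimal NumPy sketch; the function name and the toy cycle graph are illustrative, not from the paper's code:

```python
import numpy as np

def ppr_diffusion(A, alpha=0.15):
    """PPR diffusion, Eq. (2): S = alpha * (I - (1-alpha) * D^{-1/2} A D^{-1/2})^{-1}."""
    d = A.sum(axis=1)
    d_inv_sqrt = np.zeros_like(d)
    d_inv_sqrt[d > 0] = d[d > 0] ** -0.5
    A_sym = d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]  # symmetric normalisation
    return alpha * np.linalg.inv(np.eye(A.shape[0]) - (1 - alpha) * A_sym)

# Toy 4-node cycle graph
A = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]], dtype=float)
S = ppr_diffusion(A, alpha=0.15)
```

The resulting dense matrix `S` replaces the adjacency matrix in the diffused view, so every node pair gets a (typically nonzero) weight rather than only directly connected pairs.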

  • Edge Modification (EM)

  Given a modification ratio $P$, first randomly drop $P/2$ of the existing edges, then randomly add $P/2$ new edges (both the drops and the additions are sampled from uniform distributions).
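A minimal sketch of this augmentation, assuming an undirected graph given as a dense symmetric 0/1 adjacency matrix (the function name and the ring-graph example are mine):

```python
import numpy as np

def edge_modification(A, p=0.2, seed=0):
    """Drop P/2 of the existing edges and add P/2 new edges, uniformly at random."""
    rng = np.random.default_rng(seed)
    A = A.copy()
    iu, ju = np.triu_indices_from(A, k=1)   # node pairs with i < j
    on = np.flatnonzero(A[iu, ju] == 1)     # existing edges
    off = np.flatnonzero(A[iu, ju] == 0)    # candidate edges to add
    k = int(len(on) * p / 2)
    drop = rng.choice(on, size=k, replace=False)
    add = rng.choice(off, size=k, replace=False)
    for idx, val in ((drop, 0.0), (add, 1.0)):
        A[iu[idx], ju[idx]] = val
        A[ju[idx], iu[idx]] = val           # keep the matrix symmetric
    return A

# Ring graph on 10 nodes (10 edges); p=0.4 drops 2 edges and adds 2
N = 10
A = np.zeros((N, N))
for i in range(N):
    A[i, (i + 1) % N] = A[(i + 1) % N, i] = 1.0
A_aug = edge_modification(A, p=0.4)
```

Because the same number of edges is dropped and added, the augmented graph keeps the original edge count while perturbing its structure.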

  • Subsampling (SS)

  Randomly select a node index in the adjacency matrix as a split point, then use it to crop the original graph into a fixed-size subgraph, which serves as the augmented graph view.
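One way to read this, sketched with NumPy under the assumption that "crop" means taking a contiguous block of node indices starting at the random split point (the function name and sizes are illustrative):

```python
import numpy as np

def subsample(A, X, size, seed=0):
    """Pick a random split index and crop a fixed-size contiguous block of nodes,
    returning (A_sub, X_sub) as the augmented subgraph view."""
    rng = np.random.default_rng(seed)
    start = int(rng.integers(0, A.shape[0] - size + 1))
    A_sub = A[start:start + size, start:start + size]
    X_sub = X[start:start + size]
    return A_sub, X_sub

rng = np.random.default_rng(1)
A = (rng.random((12, 12)) < 0.3).astype(float)
A = np.triu(A, 1) + np.triu(A, 1).T   # symmetrise the random adjacency
X = rng.normal(size=(12, 6))
A_sub, X_sub = subsample(A, X, size=8)
```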

  • Node Feature Masking (NFM)

  Given the feature matrix $X$ and an augmentation ratio $P$, randomly select a fraction $P$ of the feature dimensions of $X$ and mask them with zeros.
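A minimal sketch of dimension-wise feature masking (the function name is mine; the paper masks whole feature dimensions, i.e. columns of $X$, rather than per-node entries):

```python
import numpy as np

def node_feature_masking(X, p=0.3, seed=0):
    """Zero out a random fraction p of the feature dimensions (columns) for all nodes."""
    rng = np.random.default_rng(seed)
    X = X.copy()
    dims = rng.choice(X.shape[1], size=int(X.shape[1] * p), replace=False)
    X[:, dims] = 0.0
    return X

X = np.ones((5, 10))
X_aug = node_feature_masking(X, p=0.3)   # masks int(10 * 0.3) = 3 columns
```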

  In this paper, SS + EM + NFM are applied to produce the first view, and SS + GD + NFM to produce the second view.

2.2 Cross-Network Contrastive Learning

  MERIT introduces a siamese architecture consisting of two identical encoders (i.e., $g_{\theta}$, $p_{\theta}$ and $g_{\zeta}$, $p_{\zeta}$), with an additional predictor $q_{\theta}$ on top of the online encoder, as shown in Figure 1.

  This contrastive learning process is shown in Figure 2(a):

    • $Z^{1}=p_{\theta}\left(g_{\theta}\left(\tilde{X}_{1}, \tilde{A}_{1}\right)\right)$
    • $Z^{2}=p_{\theta}\left(g_{\theta}\left(\tilde{X}_{2}, \tilde{A}_{2}\right)\right)$
    • $H^{1}=q_{\theta}\left(Z^{1}\right)$ and, analogously, $H^{2}=q_{\theta}\left(Z^{2}\right)$
    • $\hat{Z}^{1}=p_{\zeta}\left(g_{\zeta}\left(\tilde{X}_{1}, \tilde{A}_{1}\right)\right)$
    • $\hat{Z}^{2}=p_{\zeta}\left(g_{\zeta}\left(\tilde{X}_{2}, \tilde{A}_{2}\right)\right)$
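The pipeline above can be sketched end to end with toy one-layer encoders. Everything here (the helper names `make_gcn`/`make_mlp`, the layer sizes, the random graph) is an illustrative assumption, not the authors' implementation; the point is only the wiring of online vs. target branches:

```python
import numpy as np

rng = np.random.default_rng(0)
N, F, D = 8, 5, 4   # nodes, input feature dim, embedding dim

def make_gcn(W):
    """One-layer GCN-style encoder g: mean aggregation with self-loops + linear map."""
    def g(X, A):
        A_hat = A + np.eye(A.shape[0])
        A_hat = A_hat / A_hat.sum(axis=1, keepdims=True)
        return np.tanh(A_hat @ X @ W)
    return g

def make_mlp(W):
    """Projector p / predictor q, reduced to a single linear map in this sketch."""
    return lambda Z: Z @ W

# Online network (theta) and target network (zeta): same architecture, separate
# weights; only the online branch carries the extra predictor q_theta.
g_theta = make_gcn(rng.normal(size=(F, D)))
p_theta = make_mlp(rng.normal(size=(D, D)))
q_theta = make_mlp(rng.normal(size=(D, D)))
g_zeta = make_gcn(rng.normal(size=(F, D)))
p_zeta = make_mlp(rng.normal(size=(D, D)))

# One augmented view (X1, A1); view 2 would be processed identically
X1 = rng.normal(size=(N, F))
A1 = (rng.random((N, N)) < 0.3).astype(float)

Z1 = p_theta(g_theta(X1, A1))     # online projection
H1 = q_theta(Z1)                  # online prediction
Z1_hat = p_zeta(g_zeta(X1, A1))   # target projection
```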

  The target network is updated with a momentum (EMA) mechanism rather than by gradient descent:

    $\zeta^{t}=m \cdot \zeta^{t-1}+(1-m) \cdot \theta^{t}\quad\quad(3)$

  where $m$, $\zeta$ and $\theta$ are the momentum coefficient, the target-network parameters and the online-network parameters, respectively.
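Eq. (3) is a one-line exponential moving average over parameters; a minimal sketch, assuming parameters stored as a name-to-array dict:

```python
import numpy as np

def momentum_update(zeta, theta, m=0.99):
    """Eq. (3): zeta <- m * zeta + (1 - m) * theta.
    Only theta receives gradients; zeta follows it slowly."""
    return {k: m * zeta[k] + (1 - m) * theta[k] for k in zeta}

zeta = {"W": np.zeros(3)}
theta = {"W": np.ones(3)}
zeta = momentum_update(zeta, theta, m=0.9)   # each entry becomes 0.9*0 + 0.1*1 = 0.1
```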

  The cross-network loss is defined as follows:

    $\mathcal{L}_{cn}=\frac{1}{2 N} \sum_{i=1}^{N}\left(\mathcal{L}_{cn}^{1}\left(v_{i}\right)+\mathcal{L}_{cn}^{2}\left(v_{i}\right)\right)\quad\quad(6)$

    $\mathcal{L}_{cn}^{1}\left(v_{i}\right)=-\log \frac{\exp \left(\operatorname{sim}\left(h_{v_{i}}^{1}, \hat{z}_{v_{i}}^{2}\right)\right)}{\sum_{j=1}^{N} \exp \left(\operatorname{sim}\left(h_{v_{i}}^{1}, \hat{z}_{v_{j}}^{2}\right)\right)}\quad\quad(4)$

    $\mathcal{L}_{cn}^{2}\left(v_{i}\right)=-\log \frac{\exp \left(\operatorname{sim}\left(h_{v_{i}}^{2}, \hat{z}_{v_{i}}^{1}\right)\right)}{\sum_{j=1}^{N} \exp \left(\operatorname{sim}\left(h_{v_{i}}^{2}, \hat{z}_{v_{j}}^{1}\right)\right)}\quad\quad(5)$
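Eqs. (4)-(6) are an InfoNCE-style loss: the online prediction of a node is contrasted against the target projection of the same node in the other view (positive) versus all other nodes (negatives). A hedged NumPy sketch, assuming cosine similarity for $\operatorname{sim}(\cdot,\cdot)$ and with function names of my own choosing:

```python
import numpy as np

def sim(U, V):
    """Cosine similarity between every pair of row vectors in U and V."""
    U = U / np.linalg.norm(U, axis=1, keepdims=True)
    V = V / np.linalg.norm(V, axis=1, keepdims=True)
    return U @ V.T

def cross_network_loss(H1, H2, Z1_hat, Z2_hat):
    """Eqs. (4)-(6): online predictions vs. target projections of the other view."""
    def half(H, Z_hat):                              # Eq. (4) / Eq. (5), per node
        S = np.exp(sim(H, Z_hat))
        return -np.log(np.diag(S) / S.sum(axis=1))   # diagonal = positive pairs
    return 0.5 * np.mean(half(H1, Z2_hat) + half(H2, Z1_hat))   # Eq. (6)

rng = np.random.default_rng(0)
H1, H2, Z1_hat, Z2_hat = (rng.normal(size=(6, 4)) for _ in range(4))
loss_cn = cross_network_loss(H1, H2, Z1_hat, Z2_hat)
```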

2.3 Cross-View Contrastive Learning

  The cross-view loss is defined as:

    $\mathcal{L}_{cv}^{k}\left(v_{i}\right)=\mathcal{L}_{intra}^{k}\left(v_{i}\right)+\mathcal{L}_{inter}^{k}\left(v_{i}\right), \quad k \in\{1,2\}\quad\quad(10)$

    $\mathcal{L}_{cv}=\frac{1}{2 N} \sum_{i=1}^{N}\left(\mathcal{L}_{cv}^{1}\left(v_{i}\right)+\mathcal{L}_{cv}^{2}\left(v_{i}\right)\right)\quad\quad(9)$

    $\mathcal{L}_{inter}^{1}\left(v_{i}\right)=-\log \frac{\exp \left(\operatorname{sim}\left(h_{v_{i}}^{1}, h_{v_{i}}^{2}\right)\right)}{\sum_{j=1}^{N} \exp \left(\operatorname{sim}\left(h_{v_{i}}^{1}, h_{v_{j}}^{2}\right)\right)}\quad\quad(7)$

    $\mathcal{L}_{intra}^{1}\left(v_{i}\right)=-\log \frac{\exp \left(\operatorname{sim}\left(h_{v_{i}}^{1}, h_{v_{i}}^{2}\right)\right)}{\exp \left(\operatorname{sim}\left(h_{v_{i}}^{1}, h_{v_{i}}^{2}\right)\right)+\Phi}, \quad \Phi=\sum_{j=1}^{N} \mathbb{1}_{i \neq j} \exp \left(\operatorname{sim}\left(h_{v_{i}}^{1}, h_{v_{j}}^{1}\right)\right)\quad\quad(8)$
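A sketch of Eqs. (7)-(10) under the same cosine-similarity assumption as before (function names are mine): the inter-view term contrasts the same node across the two views, while the intra-view term adds the other nodes of the same view as negatives via $\Phi$.

```python
import numpy as np

def sim(U, V):
    """Cosine similarity between every pair of row vectors in U and V."""
    U = U / np.linalg.norm(U, axis=1, keepdims=True)
    V = V / np.linalg.norm(V, axis=1, keepdims=True)
    return U @ V.T

def cross_view_loss(H1, H2):
    """Eqs. (7)-(10): inter-view plus intra-view contrast, symmetrised over views."""
    def one_view(Ha, Hb):
        E_cross = np.exp(sim(Ha, Hb))
        E_self = np.exp(sim(Ha, Ha))
        pos = np.diag(E_cross)                           # same node, other view
        inter = -np.log(pos / E_cross.sum(axis=1))       # Eq. (7)
        phi = E_self.sum(axis=1) - np.diag(E_self)       # Phi: sum over j != i
        intra = -np.log(pos / (pos + phi))               # Eq. (8)
        return inter + intra                             # Eq. (10)
    return 0.5 * np.mean(one_view(H1, H2) + one_view(H2, H1))   # Eq. (9)

rng = np.random.default_rng(1)
H1, H2 = rng.normal(size=(6, 4)), rng.normal(size=(6, 4))
loss_cv = cross_view_loss(H1, H2)
```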

2.4 Model Training

    $\mathcal{L}=\beta \mathcal{L}_{cv}+(1-\beta) \mathcal{L}_{cn}\quad\quad(11)$

3 Experiment

Datasets

Baseline experiments
