CVPR 2022 Complete Roundup: Papers by Topic / Code / Commentary / Livestreams / Projects (Continuously Updated) [Computer Vision]

Paper Express

Admin · Posted 2022-02-22 18:56:13 · Source: 极市平台 #CVPR #Conference #Papers

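The CVPR 2022 acceptance results are out: 2,067 papers were accepted, a 24% increase over last year. The full list of CVPR papers and the related details are normally not published until the conference itself is held in June. To help readers reach the latest computer vision research sooner, 极市 is compiling CVPR-related resources in the meantime, including commentary on the newest CVPR papers, code, technical livestreams, and topic-by-topic roundups. The compilation is updated nearly every day with that day's newly added papers.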

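Official site: http://CVPR2022.thecvf.com
Dates: June 19-24, 2022
Acceptance results announced: March 2, 2022
Related question: How do you assess the CVPR 2022 paper acceptance results?
Related coverage: CVPR 2022 acceptance results released! 2,067 papers accepted, up 24%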

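All CVPR 2022 content will keep being updated in this post. Our earlier CVPR paper compilations are also collected in our GitHub project, which currently has 8,500 stars. You are welcome to follow it: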

https://github.com/extreme-assistant/CVPR2022-Paper-Code-Interpretation

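Updates:

2022/3/3: 19 papers added, covering network architecture design, pose estimation, 3D vision, action detection, semantic segmentation, and more, with a bundled download
2022/3/4: 29 papers added, covering object detection, panoptic segmentation, anomaly detection, metric learning, contrastive learning, object tracking, and more
2022/3/7: 17 papers added, covering 3D object detection, medical imaging, image deblurring, lane detection, and more
2022/3/9: 57 papers added, covering object detection, semantic segmentation, crowd counting, anomaly detection, and more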

1. CVPR2022 Accepted Papers/Code by Topic (updating)

2. CVPR2022 Oral (updating)

3. CVPR2022 Paper Commentary Roundup (updating)

4. CVPR2022 极市 Paper Sharing

5. To do list

1. CVPR2022 Accepted Papers/Code by Topic (continuously updated)

Categories:

5. Image & Video Retrieval/Video Understanding

10. Text Detection/Recognition

11. Remote Sensing Image

12. GAN/Generative/Adversarial

13. Image Generation/Image Synthesis/Video Generation

View Synthesis

16. Visual Reasoning/VQA

17. Image Classification

18. Neural Network Structure Design

20. Model Training/Generalization

24. Few-shot/Zero-shot Learning

25. Continual Learning/Life-long Learning

26. Transfer Learning/Domain Adaptation

28. Contrastive Learning

29. Incremental Learning

30. Reinforcement Learning

32. Multi-Modal Learning

33. Vision-based Prediction

36. Self-supervised Learning/Semi-supervised Learning

37. Neural Network Interpretability

40. Image Feature Extraction and Matching

2D Object Detection

[2] Unknown-Aware Object Detection: Learning What You Don't Know from Videos in the Wild

paper | code

[1] Localization Distillation for Dense Object Detection

keywords: Bounding Box Regression, Localization Quality Estimation, Knowledge Distillation

paper | code

Commentary: Ming-Ming Cheng's team at Nankai University and Tianjin University propose LD, localization distillation for object detection

Video Object Detection

[1] Unsupervised Activity Segmentation by Joint Representation Learning and Online Clustering

paper | video

3D Object Detection

[2] A Versatile Multi-View Framework for LiDAR-based 3D Object Detection with Guidance from Panoptic Segmentation

keywords: 3D Object Detection with Point-based Methods, 3D Object Detection with Grid-based Methods, Cluster-free 3D Panoptic Segmentation, CenterPoint 3D Object Detection

paper

[1] Pseudo-Stereo for Monocular 3D Object Detection in Autonomous Driving

keywords: Autonomous Driving, Monocular 3D Object Detection

paper | code

Human-Object Interaction (HOI) Detection

Camouflaged Object Detection

[1] Zoom In and Out: A Mixed-scale Triplet Network for Camouflaged Object Detection

paper | code

Rotated Object Detection

Salient Object Detection

Anomaly Detection in Images

Keypoint Detection

Lane Detection

[1] Rethinking Efficient Lane Detection via Curve Modeling

keywords: Segmentation-based Lane Detection, Point Detection-based Lane Detection, Curve-based Lane Detection, autonomous driving

paper | code

Segmentation

Image Segmentation

Panoptic Segmentation

[1] Bending Reality: Distortion-aware Transformers for Adapting to Panoramic Semantic Segmentation

keywords: Semantic- and panoramic segmentation, Unsupervised domain adaptation, Transformer

paper | code

Semantic Segmentation

[8] Semi-Supervised Semantic Segmentation Using Unreliable Pseudo-Labels

paper | code | project

[7] Weakly Supervised Semantic Segmentation using Out-of-Distribution Data

paper | code

[6] Self-supervised Image-specific Prototype Exploration for Weakly Supervised Semantic Segmentation

paper | code

[5] Multi-class Token Transformer for Weakly Supervised Semantic Segmentation

paper | code

[4] Cross Language Image Matching for Weakly Supervised Semantic Segmentation

paper

[3] Learning Affinity from Attention: End-to-End Weakly-Supervised Semantic Segmentation with Transformers

paper | code

[2] ST++: Make Self-training Work Better for Semi-supervised Semantic Segmentation

keywords: Semi-supervised learning, Semantic segmentation, Uncertainty estimation

paper | code

[1] Class Re-Activation Maps for Weakly-Supervised Semantic Segmentation

paper | code

Instance Segmentation

[3] E2EC: An End-to-End Contour-based Method for High-Quality High-Speed Instance Segmentation

paper | code

[2] Efficient Video Instance Segmentation via Tracklet Query and Proposal

paper

[1] SoftGroup for 3D Instance Segmentation on Point Clouds

keywords: 3D Vision, Point Clouds, Instance Segmentation

paper | code

Superpixel

Video Object Segmentation

Matting

Dense Prediction

Estimation

Human Pose Estimation

[3] Forecasting Characteristic 3D Poses of Human Actions

paper | project | video

[2] Learning Local-Global Contextual Adaptation for Multi-Person Pose Estimation

keywords: Top-Down Pose Estimation, Limb-based Grouping, Direct Regression

paper

[1] MixSTE: Seq2seq Mixed Spatio-Temporal Encoder for 3D Human Pose Estimation in Video

keywords: 3D Human Pose Estimation, Transformer

paper

Gesture Estimation

Optical Flow/Pose/Motion Estimation

[3] CPPF: Towards Robust Category-Level 9D Pose Estimation in the Wild

paper | code

[2] OVE6D: Object Viewpoint Encoding for Depth-based 6D Object Pose Estimation

paper | code

[1] CamLiFlow: Bidirectional Camera-LiDAR Fusion for Joint Optical Flow and Scene Flow Estimation

paper

Depth Estimation

[6] Rethinking Depth Estimation for Multi-View Stereo: A Unified Representation and Focal Loss

paper | code

[5] ITSA: An Information-Theoretic Approach to Automatic Shortcut Avoidance and Domain Generalization in Stereo Matching Networks

keywords: Learning-based Stereo Matching Networks, Single Domain Generalization, Shortcut Learning

paper

[4] Attention Concatenation Volume for Accurate and Efficient Stereo Matching

keywords: Stereo Matching, cost volume construction, cost aggregation

paper | code

[3] Occlusion-Aware Cost Constructor for Light Field Depth Estimation

paper | [code](https://github.com/YingqianWang/OACC-Net)

[2] NeW CRFs: Neural Window Fully-connected CRFs for Monocular Depth Estimation

keywords: Neural CRFs for Monocular Depth

paper

[1] OmniFusion: 360 Monocular Depth Estimation via Geometry-Aware Fusion

keywords: monocular depth estimation, transformer

paper

Image Processing

Super Resolution

[4] Reflash Dropout in Image Super-Resolution

paper

[3] Towards Bidirectional Arbitrary Image Rescaling: Joint Optimization and Cycle Idempotence

paper

[2] HyperTransformer: A Textural and Spectral Feature Fusion Transformer for Pansharpening

paper | code

[1] HDNet: High-resolution Dual-domain Learning for Spectral Compressive Imaging

keywords: HSI Reconstruction, Self-Attention Mechanism, Image Frequency Spectrum Analysis

paper

Image Restoration/Image Enhancement/Image Reconstruction

[1] Event-based Video Reconstruction via Potential-assisted Spiking Neural Network

paper

Image Shadow Removal/Image Reflection Removal

Image Denoising/Deblurring/Deraining/Dehazing

[1] E-CIR: Event-Enhanced Continuous Intensity Recovery

keywords: Event-Enhanced Deblurring, Video Representation

paper | code

Image Editing/Inpainting

[2] HairCLIP: Design Your Hair by Text and Reference Image

keywords: Language-Image Pre-Training (CLIP), Generative Adversarial Networks

paper | project

[1] Incremental Transformer Structure Enhanced Image Inpainting with Masking Positional Encoding

keywords: Image Inpainting, Transformer, Image Generation

paper | code

Image Translation

[1] Exploring Patch-wise Semantic Relation for Contrastive Learning in Image-to-Image Translation Tasks

keywords: image translation, knowledge transfer, Contrastive learning

paper

Image Quality Assessment

Style Transfer

[2] Style-ERD: Responsive and Coherent Online Motion Style Transfer

paper

[1] CLIPstyler: Image Style Transfer with a Single Text Condition

keywords: Style Transfer, Text-guided synthesis, Language-Image Pre-Training (CLIP)

paper

Face

Facial Recognition/Detection

[1] An Efficient Training Approach for Very Large Scale Face Recognition

paper | code

Face Generation/Face Synthesis/Face Reconstruction/Face Editing

[1] Sparse to Dense Dynamic 3D Facial Expression Generation

keywords: Facial expression generation, 4D face generation, 3D face modeling

paper

Face Forgery/Face Anti-Spoofing

[2] Voice-Face Homogeneity Tells Deepfake

paper | code

[1] Protecting Celebrities with Identity Consistency Transformer

paper

Object Tracking

[3] TCTrack: Temporal Contexts for Aerial Tracking

paper | code

[2] Beyond 3D Siamese Tracking: A Motion-Centric Paradigm for 3D Single Object Tracking in Point Clouds

keywords: Single Object Tracking, 3D Multi-object Tracking / Detection, Spatial-temporal Learning on Point Clouds

paper

[1] Correlation-Aware Deep Tracking

paper

Image & Video Retrieval/Video Understanding

[1] BEVT: BERT Pretraining of Video Transformers

keywords: Video understanding, Vision transformers, Self-supervised representation learning, BERT pretraining

paper | code

Action/Activity Recognition/Detection/Segmentation/Localization

[4] End-to-End Semi-Supervised Learning for Video Action Detection

paper

[3] Learnable Irrelevant Modality Dropout for Multimodal Action Recognition on Modality-Specific Annotated Videos

paper

[2] Weakly Supervised Temporal Action Localization via Representative Snippet Knowledge Propagation

paper | code

[1] Colar: Effective and Efficient Online Action Detection by Consulting Exemplars

keywords: Online action detection

paper

Person Re-Identification/Detection

Image/Video Caption

[1] X-Trans2Cap: Cross-Modal Knowledge Transfer using Transformer for 3D Dense Captioning

keywords: Image Captioning and Dense Captioning; Knowledge distillation; Transformer; 3D Vision

paper

Medical Imaging

[2] Adaptive Early-Learning Correction for Segmentation from Noisy Annotations

keywords: medical-imaging segmentation, Noisy Annotations

paper | code

[1] Temporal Context Matters: Enhancing Single Image Prediction with Disease Progression Representations

keywords: Self-supervised Transformer, Temporal modeling of disease progression

paper

Text Detection/Recognition

Remote Sensing Image

GAN/Generative/Adversarial

[4] Shadows can be Dangerous: Stealthy and Effective Physical-world Adversarial Attack by Natural Phenomenon

paper

[3] Protecting Facial Privacy: Generating Adversarial Identity Masks via Style-robust Makeup Transfer

paper

[2] Adversarial Texture for Fooling Person Detectors in the Physical World

paper

[1] Label-Only Model Inversion Attacks via Boundary Repulsion

paper

Image Generation/Image Synthesis/Video Generation

[6] Exploring Dual-task Correlation for Pose Guided Person Image Generation

paper | code

[5] Show Me What and Tell Me How: Video Synthesis via Multimodal Conditioning

paper | code

[4] 3D Shape Variational Autoencoder Latent Disentanglement via Mini-Batch Feature Swapping for Bodies and Faces

paper | code

[3] Interactive Image Synthesis with Panoptic Layout Generation

[paper](https://arxiv.org/abs/2203.02104)

[2] Polarity Sampling: Quality and Diversity Control of Pre-Trained Generative Networks via Singular Values

paper | demo

[1] Autoregressive Image Generation using Residual Quantization

paper | code

View Synthesis

3D Vision

[1] X-Trans2Cap: Cross-Modal Knowledge Transfer using Transformer for 3D Dense Captioning

keywords: Image Captioning and Dense Captioning; Knowledge distillation; Transformer; 3D Vision

paper

Point Cloud

[5] Shape-invariant 3D Adversarial Point Clouds

paper | code

[4] ART-Point: Improving Rotation Robustness of Point Cloud Classifiers via Adversarial Rotation

paper

[3] Lepard: Learning partial point cloud matching in rigid and deformable scenes

paper | code

[2] A Unified Query-based Paradigm for Point Cloud Understanding

paper

[1] CrossPoint: Self-Supervised Cross-Modal Contrastive Learning for 3D Point Cloud Understanding

keywords: Self-Supervised Learning, Contrastive Learning, 3D Point Cloud, Representation Learning, Cross-Modal Learning

paper | code

3D Reconstruction

[3] Neural Face Identification in a 2D Wireframe Projection of a Manifold Object

paper | [code](https://manycore-research.github.io/faceformer) | project

[2] Generating 3D Bio-Printable Patches Using Wound Segmentation and Reconstruction to Treat Diabetic Foot Ulcers

keywords: semantic segmentation, 3D reconstruction, 3D bio-printers

paper

[1] H4D: Human 4D Modeling by Learning Neural Compositional Representation

keywords: 4D Representation, Human Body Estimation, Fine-grained Human Reconstruction

paper

Scene Reconstruction/Novel View Synthesis

[2] CLIP-NeRF: Text-and-Image Driven Manipulation of Neural Radiance Fields

keywords: NeRF, Image Generation and Manipulation, Language-Image Pre-Training (CLIP)

paper | code

[1] Point-NeRF: Point-based Neural Radiance Fields

paper | code | project

Model Compression

Knowledge Distillation

Pruning

Quantization

[1] IntraQ: Learning Synthetic Images with Intra-Class Heterogeneity for Zero-Shot Network Quantization

paper | code

Neural Network Structure Design

[3] BatchFormer: Learning to Explore Sample Relationships for Robust Representation Learning

keywords: sample relationship, data scarcity learning, Contrastive Self-Supervised Learning, long-tailed recognition, zero-shot learning, domain generalization, self-supervised learning

paper | code

[2] DeltaCNN: End-to-End CNN Inference of Sparse Frame Differences in Videos

keywords: sparse convolutional neural network, video inference accelerating

paper

[1] A ConvNet for the 2020s

paper | code

Commentary: A ConvNet "renaissance": ConvNets make a comeback and outperform Transformers! FAIR redesigns a new pure convolutional architecture

Transformer

[2] Delving Deep into the Generalization of Vision Transformers under Distribution Shifts

keywords: out-of-distribution (OOD) generalization, Vision Transformers

paper | code

[1] Mobile-Former: Bridging MobileNet and Transformer

keywords: Light-weight convolutional neural networks, Combination of CNN and ViT

paper

Graph Neural Networks (GNN)

Neural Architecture Search (NAS)

[1] β-DARTS: Beta-Decay Regularization for Differentiable Architecture Search

paper

[1] An Image Patch is a Wave: Quantum Inspired Vision MLP

paper | code | code

Data Processing

Data Augmentation

[2] TeachAugment: Data Augmentation Optimization Using Teacher Knowledge

paper | code

[1] 3D Common Corruptions and Data Augmentation

keywords: Data Augmentation, Image restoration, Photorealistic image synthesis

paper | project

Normalization/Regularization

Image Clustering

Image Compression

Anomaly Detection

[2] Generative Cooperative Learning for Unsupervised Video Anomaly Detection

paper

[1] Self-Supervised Predictive Convolutional Attentive Block for Anomaly Detection (paper not yet uploaded)

paper | code

Model Training/Generalization

[4] Towards Efficient and Scalable Sharpness-Aware Minimization

keywords: Sharp Local Minima, Large-Batch Training

paper

[3] CAFE: Learning to Condense Dataset by Aligning Features

keywords: dataset condensation, coreset selection, generative models

paper | code

[2] The Devil is in the Margin: Margin-based Label Smoothing for Network Calibration

paper | code

[1] DN-DETR: Accelerate DETR Training by Introducing Query DeNoising

keywords: Detection Transformer

paper | code

Noisy Label

Long-Tailed Distribution

[1] Targeted Supervised Contrastive Learning for Long-Tailed Recognition

keywords: Long-Tailed Recognition, Contrastive Learning

paper

Image Feature Extraction and Matching

[1] Probabilistic Warp Consistency for Weakly-Supervised Semantic Correspondences

paper | code

Model Evaluation

Multi-Modal Learning

Audio-visual Learning

Vision-language Representation Learning

[4] L-Verse: Bidirectional Generation Between Image and Text (Oral Presentation)

paper

[3] HairCLIP: Design Your Hair by Text and Reference Image

keywords: Language-Image Pre-Training (CLIP), Generative Adversarial Networks

paper | project

[1] Vision-Language Pre-Training with Triple Contrastive Learning

keywords: Vision-language representation learning, Contrastive Learning

paper | code

Vision-based Prediction

[1] Motron: Multimodal Probabilistic Human Motion Forecasting

paper

Dataset

[2] Kubric: A scalable dataset generator

paper | code

[1] A Large-scale Comprehensive Dataset and Copy-overlap Aware Evaluation Protocol for Segment-level Video Copy Detection

VCSL (Video Copy Segment Localization) dataset

paper | dataset, metric and benchmark codes

Active Learning

Few-shot Learning/Zero-shot Learning

[2] Learning to Affiliate: Mutual Centralized Learning for Few-shot Classification

paper

[1] MSDN: Mutually Semantic Distillation Network for Zero-Shot Learning

keywords: Zero-Shot Learning, Knowledge Distillation

paper | code

Continual Learning/Life-long Learning

[1] On Generalizing Beyond Domains in Cross-Domain Continual Learning

paper

Scene Graph

Scene Graph Generation

[1] Classification-Then-Grounding: Reformulating Video Scene Graphs as Temporal Bipartite Graphs

keywords: Video Scene Graph Generation, Transformer, Video Grounding

paper | code

Scene Graph Prediction

Scene Graph Understanding

Visual Localization

Visual Reasoning/VQA

Image Classification

[1] GlideNet: Global, Local and Intrinsic based Dense Embedding NETwork for Multi-category Attributes Prediction

keywords: multi-label classification

paper | code | project

Transfer Learning/Domain Adaptation

[3] How Well Do Sparse Imagenet Models Transfer?

paper

[2] A Simple Multi-Modality Transfer Learning Baseline for Sign Language Translation

paper

[1] Weakly Supervised Object Localization as Domain Adaption

keywords: Weakly Supervised Object Localization(WSOL), Multi-instance learning based WSOL, Separated-structure based WSOL, Domain Adaption

paper | code

Metric Learning

[1] Enhancing Adversarial Robustness for Deep Metric Learning

keywords: Adversarial Attack, Adversarial Defense, Deep Metric Learning

paper

Contrastive Learning

[3] Selective-Supervised Contrastive Learning with Noisy Labels

paper | code

[2] HCSC: Hierarchical Contrastive Selective Coding

keywords: Self-supervised Representation Learning, Deep Clustering, Contrastive Learning

paper | code

[1] Crafting Better Contrastive Views for Siamese Representation Learning

paper | code

Incremental Learning

Reinforcement Learning

Meta Learning

Robotics

[1] IFOR: Iterative Flow Minimization for Robotic Object Rearrangement

paper | project

Self-supervised Learning/Semi-supervised Learning

[2] Class-Aware Contrastive Semi-Supervised Learning

keywords: Semi-Supervised Learning, Self-Supervised Learning, Real-World Unlabeled Data Learning

paper

[1] A study on the distribution of social biases in self-supervised learning visual models

paper

Neural Network Interpretability

[2] Do Explanations Explain? Model Knows Best

paper

[1] Interpretable part-whole hierarchies and conceptual-semantic relationships in neural networks

paper

Crowd Counting

[1] Boosting Crowd Counting via Multifaceted Attention

paper | code

Federated Learning

[1] Differentially Private Federated Learning with Local Regularization and Sparsification

paper

Contrastive Conditional Neural Processes

paper

Deep Rectangling for Image Stitching: A Learning Baseline (Image Stitching)

paper | code

Online Learning of Reusable Abstract Models for Object Goal Navigation

paper

PINA: Learning a Personalized Implicit Neural Avatar from a Single RGB-D Video Sequence

paper | video | project

2. CVPR2022 Oral

[1] L-Verse: Bidirectional Generation Between Image and Text (Vision-language Representation Learning)

paper

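3. CVPR2022 Paper Commentary Roundup

[3] Plug and play! ContrastiveCrop, which boosts self-supervised learning performance, is now open source!

[2] A detailed walkthrough, from principles to code, of FAIR's impressive new pure convolutional model ConvNeXt

A ConvNet "renaissance": ConvNets make a comeback and outperform Transformers! FAIR redesigns a new pure convolutional architecture

[1] Ming-Ming Cheng's team at Nankai University and Tianjin University propose LD, localization distillation for object detection

4. CVPR2022 Paper Sharing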

5. To do list

CVPR2022 Workshop
