README.md

TorchCV: A PyTorch-Based Framework for Deep Learning in Computer Vision

@misc{you2019torchcv,
    author = {Ansheng You and Xiangtai Li and Zhen Zhu and Yunhai Tong},
    title = {TorchCV: A PyTorch-Based Framework for Deep Learning in Computer Vision},
    howpublished = {\url{https://github.com/donnyyou/torchcv}},
    year = {2019}
}

This repository provides source code for most deep-learning-based computer vision problems. We will do our best to keep it up to date. If you find a problem with this repository, please raise an issue or submit a pull request.

Implemented Papers

Image Classification
- VGG: Very Deep Convolutional Networks for Large-Scale Image Recognition
- ResNet: Deep Residual Learning for Image Recognition
- DenseNet: Densely Connected Convolutional Networks
- ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices
- ShuffleNet V2: Practical Guidelines for Efficient CNN Architecture Design
- Partial Order Pruning: for Best Speed/Accuracy Trade-off in Neural Architecture Search
Semantic Segmentation
- DeepLabV3: Rethinking Atrous Convolution for Semantic Image Segmentation
- PSPNet: Pyramid Scene Parsing Network
- DenseASPP: DenseASPP for Semantic Segmentation in Street Scenes
- Asymmetric Non-local Neural Networks for Semantic Segmentation
Object Detection
- SSD: Single Shot MultiBox Detector
- Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks
- YOLOv3: An Incremental Improvement
- FPN: Feature Pyramid Networks for Object Detection
Pose Estimation
- CPM: Convolutional Pose Machines
- OpenPose: Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields
Instance Segmentation
- Mask R-CNN
Generative Adversarial Networks
- Pix2pix: Image-to-Image Translation with Conditional Adversarial Nets
- CycleGAN: Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks

QuickStart with TorchCV

TorchCV currently supports only Python 3.x and PyTorch 1.0.

pip3 install -r requirements.txt
cd extensions
sh make.sh
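Before building the extensions, it can help to confirm that the interpreter and PyTorch versions match the requirement above. The following is a minimal sketch, not part of TorchCV: the helper names `parse_version` and `check_environment` are hypothetical, and the only PyTorch attribute it relies on is `torch.__version__`.

```python
import sys


def parse_version(text):
    """Return the leading integer components of a version string as a tuple."""
    parts = []
    # Drop a local build suffix such as "+cu90" before splitting on dots.
    for piece in text.split("+")[0].split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def check_environment(python_version, torch_version):
    """Return a list of problems; an empty list means the versions look fine.

    python_version: a tuple like sys.version_info.
    torch_version: a version string, or None if PyTorch is not installed.
    """
    problems = []
    if python_version[0] != 3:
        problems.append(
            "Python 3.x is required, found %d.%d" % tuple(python_version[:2])
        )
    if torch_version is None:
        problems.append("PyTorch is not installed")
    elif parse_version(torch_version)[:2] != (1, 0):
        problems.append("PyTorch 1.0 is expected, found %s" % torch_version)
    return problems


if __name__ == "__main__":
    try:
        import torch
        found = torch.__version__
    except ImportError:
        found = None
    for problem in check_environment(sys.version_info, found):
        print("WARNING:", problem)
```

Running the script prints one warning per mismatch and nothing when the environment matches.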

Performances with TorchCV

All results shown below fully reproduce those reported in the original papers.

Image Classification

ImageNet (Center Crop Test): 224x224

| Model | Train | Test | Top-1 | Top-5 | BS | Iters | Scripts |
|---|---|---|---|---|---|---|---|
| ResNet50 | train | val | 77.54 | 93.59 | 512 | 30W | ResNet50 |
| ResNet101 | train | val | 78.94 | 94.56 | 512 | 30W | ResNet101 |
| ShuffleNetV2x0.5 | train | val | 60.90 | 82.54 | 1024 | 40W | ShuffleNetV2x0.5 |
| ShuffleNetV2x1.0 | train | val | 69.71 | 88.91 | 1024 | 40W | ShuffleNetV2x1.0 |
| DFNetV1 | train | val | 70.99 | 89.68 | 1024 | 40W | DFNetV1 |
| DFNetV2 | train | val | 74.22 | 91.61 | 1024 | 40W | DFNetV2 |

Semantic Segmentation

Cityscapes (Single Scale Whole Image Test): Base LR 0.01, Crop Size 769

| Model | Backbone | Train | Test | mIOU | BS | Iters | Scripts |
|---|---|---|---|---|---|---|---|
| PSPNet | 3x3-Res101 | train | val | 78.20 | 8 | 4W | PSPNet |
| DeepLabV3 | 3x3-Res101 | train | val | 79.13 | 8 | 4W | DeepLabV3 |

ADE20K (Single Scale Whole Image Test): Base LR 0.02, Crop Size 520

| Model | Backbone | Train | Test | mIOU | PixelACC | BS | Iters | Scripts |
|---|---|---|---|---|---|---|---|---|
| PSPNet | 3x3-Res50 | train | val | 41.52 | 80.09 | 16 | 15W | PSPNet |
| DeepLabV3 | 3x3-Res50 | train | val | 42.16 | 80.36 | 16 | 15W | DeepLabV3 |
| PSPNet | 3x3-Res101 | train | val | 43.60 | 81.30 | 16 | 15W | PSPNet |
| DeepLabV3 | 3x3-Res101 | train | val | 44.13 | 81.42 | 16 | 15W | DeepLabV3 |
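For readers unfamiliar with the two segmentation metrics reported here, the sketch below shows the standard confusion-matrix definitions of pixel accuracy and mean IoU in plain Python. It is an illustration only and not TorchCV's evaluation code; the function names are hypothetical, and the `ignore_index=255` default is an assumption based on common practice for segmentation labels.

```python
def confusion_matrix(pred, label, num_classes, ignore_index=255):
    """Count (true class, predicted class) pairs over flat sequences of class ids."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for p, t in zip(pred, label):
        if t == ignore_index:  # assumed "void" label, excluded from scoring
            continue
        matrix[t][p] += 1
    return matrix


def pixel_accuracy(matrix):
    """Fraction of scored pixels whose predicted class equals the true class."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0


def mean_iou(matrix):
    """Average of per-class intersection over union, skipping absent classes."""
    n = len(matrix)
    ious = []
    for c in range(n):
        tp = matrix[c][c]
        fn = sum(matrix[c]) - tp
        fp = sum(matrix[r][c] for r in range(n)) - tp
        union = tp + fp + fn
        if union > 0:  # class appears in the prediction or the label
            ious.append(tp / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example: five pixels, two classes, the last pixel is ignored.
label = [0, 0, 1, 1, 255]
pred = [0, 1, 1, 1, 0]
matrix = confusion_matrix(pred, label, num_classes=2)
print(pixel_accuracy(matrix))  # 0.75
print(mean_iou(matrix))
```

In the toy example, class 0 has IoU 1/2 and class 1 has IoU 2/3, so the mean IoU is 7/12.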

Object Detection

Pascal VOC2007/2012 (Single Scale Test): 20 Classes

| Model | Backbone | Train | Test | mAP | BS | Epochs | Scripts |
|---|---|---|---|---|---|---|---|
| SSD300 | VGG16 | 07+12_trainval | 07_test | 0.786 | 32 | 235 | SSD300 |
| SSD512 | VGG16 | 07+12_trainval | 07_test | 0.808 | 32 | 235 | SSD512 |
| Faster R-CNN | VGG16 | 07_trainval | 07_test | 0.706 | 1 | 15 | Faster R-CNN |

Pose Estimation

OpenPose: Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields

Instance Segmentation

Mask R-CNN

Generative Adversarial Networks

Pix2pix
CycleGAN

DataSets with TorchCV

TorchCV defines a dataset format for every task; see the subdirectories of datasets for details. The following is an example dataset directory tree for training semantic segmentation. You can preprocess public datasets with the scripts in the folder datasets/seg/preprocess.

DataSet
    train
        image
            00001.jpg/png
            00002.jpg/png
            ...
        label
            00001.png
            00002.png
            ...
    val
        image
            00001.jpg/png
            00002.jpg/png
            ...
        label
            00001.png
            00002.png
            ...
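A quick way to catch layout mistakes before training is to check that every image has a label with the same file stem. The sketch below assumes the directory tree shown above (image files ending in .jpg or .png, label files ending in .png); the function names are hypothetical and not part of TorchCV.

```python
import os

IMAGE_EXTS = (".jpg", ".png")
LABEL_EXTS = (".png",)


def stems(directory, extensions):
    """Map file stem to file name for files in `directory` with a matching extension."""
    found = {}
    if not os.path.isdir(directory):
        return found
    for name in sorted(os.listdir(directory)):
        stem, ext = os.path.splitext(name)
        if ext.lower() in extensions:
            found[stem] = name
    return found


def check_split(root, split):
    """Return (images without a label, labels without an image) for one split."""
    images = stems(os.path.join(root, split, "image"), IMAGE_EXTS)
    labels = stems(os.path.join(root, split, "label"), LABEL_EXTS)
    missing_labels = sorted(set(images) - set(labels))
    orphan_labels = sorted(set(labels) - set(images))
    return missing_labels, orphan_labels


def check_dataset(root, splits=("train", "val")):
    """Check every split and return a dict of split name to check_split result."""
    return {split: check_split(root, split) for split in splits}
```

For example, `check_dataset("DataSet")` returns two empty lists per split when every image and label is paired.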

Commands with TorchCV

The commands below use PSPNet as an example. ("tag" can be any string, including an empty one.)

Training

cd scripts/seg/cityscapes/
bash run_fs_pspnet_cityscapes_seg.sh train tag

Resume Training

cd scripts/seg/cityscapes/
bash run_fs_pspnet_cityscapes_seg.sh train tag

Validate

cd scripts/seg/cityscapes/
bash run_fs_pspnet_cityscapes_seg.sh val tag

Testing

cd scripts/seg/cityscapes/
bash run_fs_pspnet_cityscapes_seg.sh test tag
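When several phases need to run back to back, the shell commands above can be driven from Python. This is a minimal sketch with hypothetical helper names (`build_command`, `run_phase`); it only accepts the phases shown in this section (train, val, test) and assumes it is launched from the repository root.

```python
import subprocess

SCRIPT_DIR = "scripts/seg/cityscapes"
SCRIPT_NAME = "run_fs_pspnet_cityscapes_seg.sh"
PHASES = ("train", "val", "test")  # phases taken from the commands above


def build_command(phase, tag=""):
    """Return the argument list for one phase; `tag` may be any string."""
    if phase not in PHASES:
        raise ValueError("unknown phase: %r" % (phase,))
    return ["bash", SCRIPT_NAME, phase, tag]


def run_phase(phase, tag=""):
    """Run one phase inside the script directory and raise if it fails."""
    subprocess.run(build_command(phase, tag), cwd=SCRIPT_DIR, check=True)
```

For example, calling `run_phase("train", "exp1")` followed by `run_phase("val", "exp1")` trains and then validates under the same tag.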

Demos with TorchCV

Example output of VGG19-OpenPose
