Source: https://github.com/xingyizhou/CenterNet
Objects as Points
Object detection, 3D detection, and pose estimation using center point detection:
Objects as Points,
Xingyi Zhou, Dequan Wang, Philipp Krähenbühl,
arXiv technical report (arXiv 1904.07850)
Contact: [email protected]. Questions and discussions are welcome!
Abstract
Detection identifies objects as axis-aligned boxes in an image. Most successful object detectors enumerate a nearly exhaustive list of potential object locations and classify each. This is wasteful, inefficient, and requires additional post-processing. In this paper, we take a different approach. We model an object as a single point -- the center point of its bounding box. Our detector uses keypoint estimation to find center points and regresses to all other object properties, such as size, 3D location, orientation, and even pose. Our center point based approach, CenterNet, is end-to-end differentiable, simpler, faster, and more accurate than corresponding bounding box based detectors. CenterNet achieves the best speed-accuracy trade-off on the MS COCO dataset, with 28.1% AP at 142 FPS, 37.4% AP at 52 FPS, and 45.1% AP with multi-scale testing at 1.4 FPS. We use the same approach to estimate 3D bounding box in the KITTI benchmark and human pose on the COCO keypoint dataset. Our method performs competitively with sophisticated multi-stage methods and runs in real-time.
Highlights
- **Simple:** One-sentence method summary: use a keypoint detection technique to detect the bounding box center point and regress all other object properties, such as bounding box size, 3D information, and pose.
- **Versatile:** The same framework works for object detection, 3D bounding box estimation, and multi-person pose estimation with minor modification.
- **Fast:** The whole process runs in a single network feedforward pass. No NMS post-processing is needed. Our DLA-34 model runs at 52 FPS with 37.4 COCO AP.
- **Strong:** Our best single model achieves 45.1 AP on COCO test-dev.
- **Easy to use:** We provide a user-friendly testing API and webcam demos.
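The one-sentence summary above can be made concrete with a small sketch. This is a minimal, illustrative decoder in plain Python with nested lists, not CenterNet's actual code (the real implementation uses PyTorch max-pooling and top-k selection; the function names here are hypothetical). It turns per-class center heatmaps plus regressed size and offset maps into boxes:

```python
def local_max_keep(hm):
    """Zero out every cell that is not the max of its 3x3 neighbourhood,
    a simple stand-in for the max-pool "NMS" described in the paper."""
    H, W = len(hm), len(hm[0])
    out = [[0.0] * W for _ in range(H)]
    for y in range(H):
        for x in range(W):
            neigh = max(hm[ny][nx]
                        for ny in range(max(0, y - 1), min(H, y + 2))
                        for nx in range(max(0, x - 1), min(W, x + 2)))
            if hm[y][x] >= neigh:
                out[y][x] = hm[y][x]
    return out

def decode_centers(heatmap, wh, reg, k=100, thresh=0.3):
    """Turn center-point predictions into [class, x1, y1, x2, y2, score] boxes.

    heatmap: list of C grids (H x W) of per-class center scores in [0, 1]
    wh:      (H x W) grid of regressed (width, height) pairs
    reg:     (H x W) grid of (dx, dy) sub-pixel center offsets
    """
    dets = []
    for c, hm in enumerate(heatmap):
        peaks = local_max_keep(hm)
        for y, row in enumerate(peaks):
            for x, score in enumerate(row):
                if score <= thresh:
                    continue
                w, h = wh[y][x]
                cx, cy = x + reg[y][x][0], y + reg[y][x][1]
                dets.append([c, cx - w / 2, cy - h / 2,
                             cx + w / 2, cy + h / 2, score])
    dets.sort(key=lambda d: -d[5])  # keep the k highest-scoring boxes
    return dets[:k]
```

Because the peak extraction replaces NMS, overlapping boxes never compete after the fact: a location either is the local maximum of its neighbourhood or it is suppressed.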
Main results
Object Detection on COCO validation
| Backbone      | AP / FPS   | Flip AP / FPS | Multi-scale AP / FPS |
|---------------|------------|---------------|----------------------|
| Hourglass-104 | 40.3 / 14  | 42.2 / 7.8    | 45.1 / 1.4           |
| DLA-34        | 37.4 / 52  | 39.2 / 28     | 41.7 / 4             |
| ResNet-101    | 34.6 / 45  | 36.2 / 25     | 39.3 / 4             |
| ResNet-18     | 28.1 / 142 | 30.0 / 71     | 33.2 / 12            |

Keypoint detection on COCO validation

| Backbone      | AP   | FPS |
|---------------|------|-----|
| Hourglass-104 | 64.0 | 6.6 |
| DLA-34        | 58.9 | 23  |

3D bounding box detection on KITTI validation

| Backbone | FPS | AP-E | AP-M | AP-H | AOS-E | AOS-M | AOS-H | BEV-E | BEV-M | BEV-H |
|----------|-----|------|------|------|-------|-------|-------|-------|-------|-------|
| DLA-34   | 32  | 96.9 | 87.8 | 79.2 | 93.9  | 84.3  | 75.7  | 34.0  | 30.5  | 26.8  |

All models and details are available in our Model zoo.
Installation
Please refer to INSTALL.md for installation instructions.
Use CenterNet
We provide demos for images, image folders, videos, and webcam input.

First, download the models (by default, ctdet_coco_dla_2x for detection and multi_pose_dla_3x for human pose estimation) from the Model zoo and put them in CenterNet_ROOT/models/.

For object detection on images or video, run:
```
python demo.py ctdet --demo /path/to/image/or/folder/or/video --load_model ../models/ctdet_coco_dla_2x.pth
```
We provide example images in CenterNet_ROOT/images/ (from Detectron). If set up correctly, the output should show the detected objects drawn as labeled bounding boxes.

For the webcam demo, run:
```
python demo.py ctdet --demo webcam --load_model ../models/ctdet_coco_dla_2x.pth
```
Similarly, for human pose estimation, run:
```
python demo.py multi_pose --demo /path/to/image/or/folder/or/video/or/webcam --load_model ../models/multi_pose_dla_3x.pth
```
The result for the example images should show the estimated keypoints overlaid on each person.

You can add `--debug 2` to visualize the heatmap outputs, and `--flip_test` to enable flip testing.
To use CenterNet in your own project, you can:

```python
import sys

CENTERNET_PATH = '/path/to/CenterNet/src/lib/'
sys.path.insert(0, CENTERNET_PATH)

from detectors.detector_factory import detector_factory
from opts import opts

MODEL_PATH = '/path/to/model'
TASK = 'ctdet'  # or 'multi_pose' for human pose estimation
opt = opts().init('{} --load_model {}'.format(TASK, MODEL_PATH).split(' '))
detector = detector_factory[opt.task](opt)

img = 'path/to/your/image'  # an image path or a loaded image array
ret = detector.run(img)['results']
```

`ret` will be a Python dict: `{category_id: [[x1, y1, x2, y2, score], ...], }`
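For downstream use, here is a minimal sketch of filtering that results dict by score. The `keep_confident` helper and the sample values are illustrative assumptions, not part of the CenterNet API; only the dict format itself comes from the README:

```python
# Hypothetical results in the documented format:
# {category_id: [[x1, y1, x2, y2, score], ...], }
ret = {
    1: [[10.0, 20.0, 110.0, 220.0, 0.91], [5.0, 5.0, 40.0, 40.0, 0.12]],
    3: [[50.0, 60.0, 90.0, 120.0, 0.77]],
}

def keep_confident(results, thresh=0.3):
    """Keep only boxes whose score (the last element) clears the threshold,
    dropping any class that ends up with no boxes."""
    kept = {}
    for cls, boxes in results.items():
        strong = [box for box in boxes if box[4] >= thresh]
        if strong:
            kept[cls] = strong
    return kept
```

A threshold around 0.3 is a common starting point for visualization; raise it for higher-precision downstream use.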
Benchmark Evaluation and Training
After installation, follow the instructions in DATA.md to set up the datasets. Then check GETTING_STARTED.md to reproduce the results in the paper. We provide scripts for all the experiments in the experiments folder.
Develop
If you are interested in training CenterNet on a new dataset, applying CenterNet to a new task, or using a new network architecture for CenterNet, please refer to DEVELOP.md. Also feel free to send us emails for discussions or suggestions.
License
CenterNet itself is released under the MIT License (refer to the LICENSE file for details). Portions of the code are borrowed from human-pose-estimation.pytorch (image transform, resnet), CornerNet (hourglassnet, loss functions), dla (DLA network), DCNv2 (deformable convolutions), tf-faster-rcnn (Pascal VOC evaluation), and kitti_eval (KITTI dataset evaluation). Please refer to the original licenses of these projects (see NOTICE).
Citation
If you find this project useful for your research, please use the following BibTeX entry.
@inproceedings{zhou2019objects,
title={Objects as Points},
author={Zhou, Xingyi and Wang, Dequan and Kr{\"a}henb{\"u}hl, Philipp},
booktitle={arXiv preprint arXiv:1904.07850},
year={2019}
}