We use AWS as the main site to host our model zoo, and maintain a mirror on aliyun.
You can replace `https://s3.ap-northeast-2.amazonaws.com/open-mmlab` with `https://open-mmlab.oss-cn-beijing.aliyuncs.com` in model urls.
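The prefix swap described above can be scripted. The following is a minimal sketch: the helper name `to_aliyun_mirror` and the example checkpoint path are hypothetical, and only the two URL prefixes come from the text.

```python
# URL prefixes taken from the text above.
AWS_PREFIX = "https://s3.ap-northeast-2.amazonaws.com/open-mmlab"
ALIYUN_PREFIX = "https://open-mmlab.oss-cn-beijing.aliyuncs.com"


def to_aliyun_mirror(url: str) -> str:
    """Rewrite an AWS-hosted model URL to the aliyun mirror.

    URLs that do not start with the AWS prefix are returned unchanged.
    """
    if url.startswith(AWS_PREFIX):
        return ALIYUN_PREFIX + url[len(AWS_PREFIX):]
    return url


# Hypothetical checkpoint path, used only for illustration.
example_url = AWS_PREFIX + "/path/to/example_model.pth"
print(to_aliyun_mirror(example_url))
```

Matching on the prefix only, rather than replacing the string anywhere in the URL, leaves unrelated URLs untouched.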
Models are trained on `coco_2017_train` and tested on `coco_2017_val`.
GPU memory is reported as the maximum value of `torch.cuda.max_memory_allocated()` for all 8 GPUs. Note that this value is usually less than what `nvidia-smi` shows.

More models with different backbones will be added to the model zoo.
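As a rough illustration of how a memory figure based on `torch.cuda.max_memory_allocated()` could be reduced to a single number, the sketch below takes one peak value per GPU and reports the largest. The helper name `reported_mem_gb`, the bytes-to-GB conversion (1 GB = 1024**3 bytes), and the sample values are assumptions, not taken from the codebase.

```python
from typing import Iterable


def reported_mem_gb(per_gpu_peak_bytes: Iterable[int]) -> float:
    """Return the largest per-GPU peak allocation, converted to GB.

    Assumes 1 GB = 1024**3 bytes; the actual convention may differ.
    """
    return max(per_gpu_peak_bytes) / 1024 ** 3


# In a real run, each entry would be the value returned by
# torch.cuda.max_memory_allocated() on one GPU.
# The numbers below are made up for illustration.
peaks = [3_500_000_000, 3_758_096_384, 3_600_000_000]
print(round(reported_mem_gb(peaks), 1))
```

Because this counts only memory allocated through PyTorch's allocator, it is expected to be lower than the per-process figure shown by `nvidia-smi`.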
Backbone | Style | Lr schd | Mem (GB) | Train time (s/iter) | Inf time (fps) | AR1000 | Download |
---|---|---|---|---|---|---|---|
R-50-C4 | caffe | 1x | - | - | 20.5 | 51.1 | model |
R-50-C4 | caffe | 2x | 2.2 | 0.17 | 20.3 | 52.2 | model |
R-50-C4 | pytorch | 1x | - | - | 20.1 | 50.2 | model |
R-50-C4 | pytorch | 2x | - | - | 20.0 | 51.1 | model |
R-50-FPN | caffe | 1x | 3.3 | 0.253 | 16.9 | 58.2 | - |
R-50-FPN | pytorch | 1x | 3.5 | 0.276 | 17.7 | 57.1 | model |
R-50-FPN | pytorch | 2x | - | - | - | 57.6 | model |
R-101-FPN | caffe | 1x | 5.2 | 0.379 | 13.9 | 59.4 | - |
R-101-FPN | pytorch | 1x | 5.4 | 0.396 | 14.4 | 58.6 | model |
R-101-FPN | pytorch | 2x | - | - | - | 59.1 | model |
X-101-32x4d-FPN | pytorch | 1x | 6.6 | 0.589 | 11.8 | 59.4 | model |
X-101-32x4d-FPN | pytorch | 2x | - | - | - | 59.9 | model |
X-101-64x4d-FPN | pytorch | 1x | 9.5 | 0.955 | 8.3 | 59.8 | model |
X-101-64x4d-FPN | pytorch | 2x | - | - | - | 60.0 | model |
Backbone | Style | Lr schd | Mem (GB) | Train time (s/iter) | Inf time (fps) | box AP | Download |
---|---|---|---|---|---|---|---|
R-50-C4 | caffe | 1x | - | - | 9.5 | 34.9 | model |
R-50-C4 | caffe | 2x | 4.0 | 0.39 | 9.3 | 36.5 | model |
R-50-C4 | pytorch | 1x | - | - | 9.3 | 33.9 | model |
R-50-C4 | pytorch | 2x | - | - | 9.4 | 35.9 | model |
R-50-FPN | caffe | 1x | 3.6 | 0.333 | 13.5 | 36.6 | - |
R-50-FPN | pytorch | 1x | 3.8 | 0.353 | 13.6 | 36.4 | model |
R-50-FPN | pytorch | 2x | - | - | - | 37.7 | model |
R-101-FPN | caffe | 1x | 5.5 | 0.465 | 11.5 | 38.8 | - |
R-101-FPN | pytorch | 1x | 5.7 | 0.474 | 11.9 | 38.5 | model |
R-101-FPN | pytorch | 2x | - | - | - | 39.4 | model |
X-101-32x4d-FPN | pytorch | 1x | 6.9 | 0.672 | 10.3 | 40.1 | model |
X-101-32x4d-FPN | pytorch | 2x | - | - | - | 40.4 | model |
X-101-64x4d-FPN | pytorch | 1x | 9.8 | 1.040 | 7.3 | 41.3 | model |
X-101-64x4d-FPN | pytorch | 2x | - | - | - | 40.7 | model |
Backbone | Style | Lr schd | Mem (GB) | Train time (s/iter) | Inf time (fps) | box AP | mask AP | Download |
---|---|---|---|---|---|---|---|---|
R-50-C4 | caffe | 1x | - | - | 8.1 | 35.9 | 31.5 | model |
R-50-C4 | caffe | 2x | 4.2 | 0.43 | 8.1 | 37.9 | 32.9 | model |
R-50-C4 | pytorch | 1x | - | - | 7.9 | 35.1 | 31.2 | model |
R-50-C4 | pytorch | 2x | - | - | 8.0 | 37.2 | 32.5 | model |
R-50-FPN | caffe | 1x | 3.8 | 0.430 | 10.2 | 37.4 | 34.3 | - |
R-50-FPN | pytorch | 1x | 3.9 | 0.453 | 10.6 | 37.3 | 34.2 | model |
R-50-FPN | pytorch | 2x | - | - | - | 38.5 | 35.1 | model |
R-101-FPN | caffe | 1x | 5.7 | 0.534 | 9.4 | 39.9 | 36.1 | - |
R-101-FPN | pytorch | 1x | 5.8 | 0.571 | 9.5 | 39.4 | 35.9 | model |
R-101-FPN | pytorch | 2x | - | - | - | 40.3 | 36.5 | model |
X-101-32x4d-FPN | pytorch | 1x | 7.1 | 0.759 | 8.3 | 41.1 | 37.1 | model |
X-101-32x4d-FPN | pytorch | 2x | - | - | - | 41.4 | 37.1 | model |
X-101-64x4d-FPN | pytorch | 1x | 10.0 | 1.102 | 6.5 | 42.1 | 38.0 | model |
X-101-64x4d-FPN | pytorch | 2x | - | - | - | 42.0 | 37.7 | model |
Backbone | Style | Type | Lr schd | Mem (GB) | Train time (s/iter) | Inf time (fps) | box AP | mask AP | Download |
---|---|---|---|---|---|---|---|---|---|
R-50-C4 | caffe | Faster | 1x | - | - | 6.7 | 35.0 | - | model |
R-50-C4 | caffe | Faster | 2x | 3.8 | 0.34 | 6.6 | 36.4 | - | model |
R-50-C4 | pytorch | Faster | 1x | - | - | 6.3 | 34.2 | - | model |
R-50-C4 | pytorch | Faster | 2x | - | - | 6.1 | 35.8 | - | model |
R-50-FPN | caffe | Faster | 1x | 3.3 | 0.242 | 18.4 | 36.6 | - | - |
R-50-FPN | pytorch | Faster | 1x | 3.5 | 0.250 | 16.5 | 35.8 | - | model |
R-50-C4 | caffe | Mask | 1x | - | - | 8.1 | 35.9 | 31.5 | model |
R-50-C4 | caffe | Mask | 2x | 4.2 | 0.43 | 8.1 | 37.9 | 32.9 | model |
R-50-C4 | pytorch | Mask | 1x | - | - | 7.9 | 35.1 | 31.2 | model |
R-50-C4 | pytorch | Mask | 2x | - | - | 8.0 | 37.2 | 32.5 | model |
R-50-FPN | pytorch | Faster | 2x | - | - | - | 37.1 | - | model |
R-101-FPN | caffe | Faster | 1x | 5.2 | 0.355 | 14.4 | 38.6 | - | - |
R-101-FPN | pytorch | Faster | 1x | 5.4 | 0.388 | 13.2 | 38.1 | - | model |
R-101-FPN | pytorch | Faster | 2x | - | - | - | 38.8 | - | model |
R-50-FPN | caffe | Mask | 1x | 3.4 | 0.328 | 12.8 | 37.3 | 34.5 | - |
R-50-FPN | pytorch | Mask | 1x | 3.5 | 0.346 | 12.7 | 36.8 | 34.1 | model |
R-50-FPN | pytorch | Mask | 2x | - | - | - | 37.9 | 34.8 | model |
R-101-FPN | caffe | Mask | 1x | 5.2 | 0.429 | 11.2 | 39.4 | 36.1 | - |
R-101-FPN | pytorch | Mask | 1x | 5.4 | 0.462 | 10.9 | 38.9 | 35.8 | model |
R-101-FPN | pytorch | Mask | 2x | - | - | - | 39.9 | 36.4 | model |
Backbone | Style | Lr schd | Mem (GB) | Train time (s/iter) | Inf time (fps) | box AP | Download |
---|---|---|---|---|---|---|---|
R-50-FPN | caffe | 1x | 3.4 | 0.285 | 12.5 | 35.8 | - |
R-50-FPN | pytorch | 1x | 3.6 | 0.308 | 12.1 | 35.6 | model |
R-50-FPN | pytorch | 2x | - | - | - | 36.4 | model |
R-101-FPN | caffe | 1x | 5.3 | 0.410 | 10.4 | 37.8 | - |
R-101-FPN | pytorch | 1x | 5.5 | 0.429 | 10.9 | 37.7 | model |
R-101-FPN | pytorch | 2x | - | - | - | 38.1 | model |
X-101-32x4d-FPN | pytorch | 1x | 6.7 | 0.632 | 9.3 | 39.0 | model |
X-101-32x4d-FPN | pytorch | 2x | - | - | - | 39.3 | model |
X-101-64x4d-FPN | pytorch | 1x | 9.6 | 0.993 | 7.0 | 40.0 | model |
X-101-64x4d-FPN | pytorch | 2x | - | - | - | 39.6 | model |
Backbone | Style | Lr schd | Mem (GB) | Train time (s/iter) | Inf time (fps) | box AP | Download |
---|---|---|---|---|---|---|---|
R-50-C4 | caffe | 1x | 8.7 | 0.92 | 5.0 | 38.7 | model |
R-50-FPN | caffe | 1x | 3.9 | 0.464 | 10.9 | 40.5 | - |
R-50-FPN | pytorch | 1x | 4.1 | 0.455 | 11.9 | 40.4 | model |
R-50-FPN | pytorch | 20e | - | - | - | 41.1 | model |
R-101-FPN | caffe | 1x | 5.8 | 0.569 | 9.6 | 42.4 | - |
R-101-FPN | pytorch | 1x | 6.0 | 0.584 | 10.3 | 42.0 | model |
R-101-FPN | pytorch | 20e | - | - | - | 42.5 | model |
X-101-32x4d-FPN | pytorch | 1x | 7.2 | 0.770 | 8.9 | 43.6 | model |
X-101-32x4d-FPN | pytorch | 20e | - | - | - | 44.0 | model |
X-101-64x4d-FPN | pytorch | 1x | 10.0 | 1.133 | 6.7 | 44.5 | model |
X-101-64x4d-FPN | pytorch | 20e | - | - | - | 44.7 | model |
Backbone | Style | Lr schd | Mem (GB) | Train time (s/iter) | Inf time (fps) | box AP | mask AP | Download |
---|---|---|---|---|---|---|---|---|
R-50-C4 | caffe | 1x | 9.1 | 0.99 | 4.5 | 39.3 | 32.8 | model |
R-50-FPN | caffe | 1x | 5.1 | 0.692 | 7.6 | 40.9 | 35.5 | - |
R-50-FPN | pytorch | 1x | 5.3 | 0.683 | 7.4 | 41.2 | 35.7 | model |
R-50-FPN | pytorch | 20e | - | - | - | 42.3 | 36.6 | model |
R-101-FPN | caffe | 1x | 7.0 | 0.803 | 7.2 | 43.1 | 37.2 | - |
R-101-FPN | pytorch | 1x | 7.2 | 0.807 | 6.8 | 42.6 | 37.0 | model |
R-101-FPN | pytorch | 20e | - | - | - | 43.3 | 37.6 | model |
X-101-32x4d-FPN | pytorch | 1x | 8.4 | 0.976 | 6.6 | 44.4 | 38.2 | model |
X-101-32x4d-FPN | pytorch | 20e | - | - | - | 44.7 | 38.6 | model |
X-101-64x4d-FPN | pytorch | 1x | 11.4 | 1.33 | 5.3 | 45.4 | 39.1 | model |
X-101-64x4d-FPN | pytorch | 20e | - | - | - | 45.7 | 39.4 | model |
Notes:
The `20e` schedule in Cascade (Mask) R-CNN indicates decreasing the lr at 16 and 19 epochs, with a total of 20 epochs.

Backbone | Style | Lr schd | Mem (GB) | Train time (s/iter) | Inf time (fps) | box AP | mask AP | Download |
---|---|---|---|---|---|---|---|---|
R-50-FPN | pytorch | 1x | 7.4 | 0.936 | 4.1 | 42.1 | 37.3 | model |
R-50-FPN | pytorch | 20e | - | - | - | 43.2 | 38.1 | model |
R-101-FPN | pytorch | 20e | 9.3 | 1.051 | 4.0 | 44.9 | 39.4 | model |
X-101-32x4d-FPN | pytorch | 20e | 5.8 | 0.769 | 3.8 | 46.1 | 40.3 | model |
X-101-64x4d-FPN | pytorch | 20e | 7.5 | 1.120 | 3.5 | 46.9 | 40.8 | model |
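The `20e` schedule that appears in the tables above (lr decreased at epochs 16 and 19, 20 epochs in total) can be sketched as a step function. This is a minimal illustration only: the base learning rate of 0.02, the decay factor of 0.1, and the 0-indexed epoch convention are assumptions, not values stated in this document.

```python
def lr_at_epoch(epoch, base_lr=0.02, steps=(16, 19), gamma=0.1):
    """Learning rate for a 0-indexed epoch under a step schedule.

    The lr is multiplied by `gamma` once for every step boundary
    that has already been reached. base_lr and gamma are assumed values.
    """
    num_decays = sum(1 for step in steps if epoch >= step)
    return base_lr * gamma ** num_decays


TOTAL_EPOCHS = 20  # "20e"

for epoch in range(TOTAL_EPOCHS):
    print(epoch, lr_at_epoch(epoch))
```

With these assumptions, epochs 0 to 15 run at the base lr, epochs 16 to 18 after one decay, and epoch 19 after two decays.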
Backbone | Size | Style | Lr schd | Mem (GB) | Train time (s/iter) | Inf time (fps) | box AP | Download |
---|---|---|---|---|---|---|---|---|
VGG16 | 300 | caffe | 120e | 3.5 | 0.256 | 25.9 / 34.6 | 25.7 | model |
VGG16 | 512 | caffe | 120e | 7.6 | 0.412 | 20.7 / 25.4 | 29.3 | model |
Backbone | Size | Style | Lr schd | Mem (GB) | Train time (s/iter) | Inf time (fps) | box AP | Download |
---|---|---|---|---|---|---|---|---|
VGG16 | 300 | caffe | 240e | 2.5 | 0.159 | 35.7 / 53.6 | 77.5 | model |
VGG16 | 512 | caffe | 240e | 4.3 | 0.214 | 27.5 / 35.9 | 80.0 | model |
Notes:
`cudnn.benchmark` is set as `True` for SSD training and testing.

Please refer to Group Normalization for details.
Please refer to Weight Standardization for details.
Please refer to Deformable Convolutional Networks for details.
Please refer to Libra R-CNN for details.
Please refer to Guided Anchoring for details.
We compare mmdetection with Detectron and maskrcnn-benchmark. The backbone used is R-50-FPN.
In general, mmdetection has 3 advantages over Detectron.
Detectron and maskrcnn-benchmark use caffe-style ResNet as the backbone. We report results using both caffe-style (weights converted from here) and pytorch-style (weights from the official model zoo) ResNet backbone, indicated as pytorch-style results / caffe-style results.
We find that pytorch-style ResNet usually converges slower than caffe-style ResNet, leading to slightly lower results with the 1x schedule, but the final results with the 2x schedule are higher.
Type | Lr schd | Detectron | maskrcnn-benchmark | mmdetection |
---|---|---|---|---|
RPN | 1x | 57.2 | - | 57.1 / 58.2 |
RPN | 2x | - | - | 57.6 / - |
Faster R-CNN | 1x | 36.7 | 36.8 | 36.4 / 36.6 |
Faster R-CNN | 2x | 37.9 | - | 37.7 / - |
Mask R-CNN | 1x | 37.7 & 33.9 | 37.8 & 34.2 | 37.3 & 34.2 / 37.4 & 34.3 |
Mask R-CNN | 2x | 38.6 & 34.5 | - | 38.5 & 35.1 / - |
Fast R-CNN | 1x | 36.4 | - | 35.8 / 36.6 |
Fast R-CNN | 2x | 36.8 | - | 37.1 / - |
Fast R-CNN (w/mask) | 1x | 37.3 & 33.7 | - | 36.8 & 34.1 / 37.3 & 34.5 |
Fast R-CNN (w/mask) | 2x | 37.7 & 34.0 | - | 37.9 & 34.8 / - |
The training speed is measured in s/iter. The lower, the better.
Type | Detectron (P100<sup>1</sup>) | maskrcnn-benchmark (V100) | mmdetection (V100<sup>2</sup>) |
---|---|---|---|
RPN | 0.416 | - | 0.253 |
Faster R-CNN | 0.544 | 0.353 | 0.333 |
Mask R-CNN | 0.889 | 0.454 | 0.430 |
Fast R-CNN | 0.285 | - | 0.242 |
Fast R-CNN (w/mask) | 0.377 | - | 0.328 |
\*1. Facebook's Big Basin servers (P100/V100) are slightly faster than the servers we use. mmdetection can also run slightly faster on FB's servers.
\*2. For fair comparison, we list the caffe-style results here.
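To read the training-speed table above as relative speedups, one can divide the baseline s/iter by the mmdetection s/iter. The sketch below does this for the Detectron column; the helper name `speedup` is hypothetical and the timings are copied from the table. Since the codebases were measured on different GPUs (P100 vs. V100), the ratios are only indicative.

```python
# Training time in s/iter, copied from the table above.
train_time = {
    "RPN": {"Detectron": 0.416, "mmdetection": 0.253},
    "Faster R-CNN": {"Detectron": 0.544, "mmdetection": 0.333},
    "Mask R-CNN": {"Detectron": 0.889, "mmdetection": 0.430},
    "Fast R-CNN": {"Detectron": 0.285, "mmdetection": 0.242},
    "Fast R-CNN (w/mask)": {"Detectron": 0.377, "mmdetection": 0.328},
}


def speedup(baseline_s_per_iter: float, ours_s_per_iter: float) -> float:
    """Ratio of baseline time to our time; values above 1 mean we are faster."""
    return baseline_s_per_iter / ours_s_per_iter


for name, row in train_time.items():
    ratio = speedup(row["Detectron"], row["mmdetection"])
    print(f"{name}: {ratio:.2f}x")
```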
The inference speed is measured with fps (img/s) on a single GPU. The higher, the better.
Type | Detectron (P100) | maskrcnn-benchmark (V100) | mmdetection (V100) |
---|---|---|---|
RPN | 12.5 | - | 16.9 |
Faster R-CNN | 10.3 | 7.9 | 13.5 |
Mask R-CNN | 8.5 | 7.7 | 10.2 |
Fast R-CNN | 12.5 | - | 18.4 |
Fast R-CNN (w/mask) | 9.9 | - | 12.8 |
Type | Detectron | maskrcnn-benchmark | mmdetection |
---|---|---|---|
RPN | 6.4 | - | 3.3 |
Faster R-CNN | 7.2 | 4.4 | 3.6 |
Mask R-CNN | 8.6 | 5.2 | 3.8 |
Fast R-CNN | 6.0 | - | 3.3 |
Fast R-CNN (w/mask) | 7.9 | - | 3.4 |
maskrcnn-benchmark and mmdetection are clearly more memory efficient than Detectron, and the main advantage comes from PyTorch itself. We also perform some memory optimizations to push it forward.
Note that Caffe2 and PyTorch have different APIs to obtain memory usage, with different implementations.
For all codebases, `nvidia-smi` shows a larger memory usage than the reported number in the above table.