By Likun Cai, Zhi Zhang, Yi Zhu, Li Zhang, Mu Li, Xiangyang Xue.
This repo is the official implementation of BigDetection. It is based on mmdetection and CBNetV2.
We construct a new large-scale benchmark termed BigDetection. Our goal is to leverage the training data from existing datasets (LVIS, OpenImages and Objects365) with carefully designed principles, and to curate a larger dataset for improved detector pre-training. The BigDetection dataset has 600 object categories and contains 3.4M training images with 36M object bounding boxes. We show some important statistics of BigDetection in the following figure.
Left: Number of images per category of BigDetection. Right: Number of instances in different object sizes.
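For reference, such statistics can be recomputed from the released annotation files with pycocotools, assuming the COCO-style JSON format used throughout this repo. The path below follows the data layout described in the data-preparation section and is only an example:
# Recompute basic BigDetection statistics from an annotation file (example path).
from collections import Counter
from pycocotools.coco import COCO

coco = COCO('data/BigDetection/annotations/bigdet_val.json')
print(len(coco.getCatIds()), 'categories,', len(coco.getImgIds()), 'images,',
      len(coco.getAnnIds()), 'boxes')

# Number of images per category (left panel of the figure above).
imgs_per_cat = {coco.loadCats(c)[0]['name']: len(coco.getImgIds(catIds=[c]))
                for c in coco.getCatIds()}

# Number of instances per size bucket, using COCO's small/medium/large thresholds
# (right panel of the figure above).
sizes = Counter()
for ann in coco.loadAnns(coco.getAnnIds()):
    bucket = 'small' if ann['area'] < 32**2 else 'medium' if ann['area'] < 96**2 else 'large'
    sizes[bucket] += 1
print(sizes)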
We show the evaluation results on the BigDetection validation set. We hope BigDetection can serve as a new, challenging benchmark for evaluating next-level object detection methods.
| Method | mAP (BigDetection val) | Links | 
|---|---|---|
| YOLOv3 | 9.7 | model/config | 
| Deformable DETR | 13.1 | model/config | 
| Faster R-CNN (C4)* | 18.9 | model | 
| Faster R-CNN (FPN)* | 19.4 | model | 
| CenterNet2* | 23.1 | model | 
| Cascade R-CNN* | 24.1 | model | 
| CBNetV2-Swin-Base | 35.1 | model/config | 
We show the fine-tuning performance on COCO minival/test-dev. The results show that BigDetection pre-training provides significant benefits across different detector architectures. We achieve 59.8 mAP on COCO test-dev with a single model.
| Method | mAP (COCO minival/test-dev) | Links | 
|---|---|---|
| YOLOv3 | 30.5/- | config | 
| Deformable DETR | 39.9/- | model/config | 
| Faster R-CNN (C4)* | 38.8/- | model | 
| Faster R-CNN (FPN)* | 40.5/- | model | 
| CenterNet2* | 45.3/- | model | 
| Cascade R-CNN* | 45.1/- | model | 
| CBNetV2-Swin-Base | 59.1/59.5 | model/config | 
| CBNetV2-Swin-Base (TTA) | 59.5/59.8 | config | 
We follow STAC and SoftTeacher to evaluate on COCO under different partially labeled settings (using only 1%–10% of the labeled training data).
| Method | mAP (1%) | mAP (2%) | mAP (5%) | mAP (10%) | 
|---|---|---|---|---|
| Baseline | 9.8 | 14.3 | 21.2 | 26.2 | 
| STAC | 14.0 | 18.3 | 24.4 | 28.6 | 
| SoftTeacher (ICCV 21) | 20.5 | 26.5 | 30.7 | 34.0 | 
| Ours | 25.3 | 28.1 | 31.9 | 34.1 | 
| | model | model | model | model | 
- The models marked with * are implemented in another detection codebase, Detectron2. Here we provide the pre-trained checkpoints; the results can be reproduced by following the installation of the CenterNet2 codebase.
- Most models are trained with an 8x schedule on BigDetection.
- Most pre-trained models are fine-tuned with a 1x schedule on COCO.
- TTA denotes test-time augmentation.
- Pre-trained models of Swin Transformer can be downloaded from Swin Transformer for ImageNet Classification.
Requirements:
- Ubuntu 16.04
- CUDA 10.2
# Create conda environment
conda create -n bigdet python=3.7 -y
conda activate bigdet
# Install PyTorch
conda install pytorch==1.8.0 torchvision==0.9.0 cudatoolkit=10.2 -c pytorch
# Install mmcv
pip install mmcv-full==1.3.9 -f https://download.openmmlab.com/mmcv/dist/cu102/torch1.8.0/index.html
# Clone and install
git clone https://github.com/amazon-research/bigdetection.git
cd bigdetection
pip install -r requirements/build.txt
pip install -v -e .
# Install Apex (optional)
git clone https://github.com/NVIDIA/apex
cd apex
pip install -v --disable-pip-version-check --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" ./
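After installation, a quick sanity check (a short Python snippet run inside the bigdet environment) should report the versions pinned above:
# Verify that PyTorch, mmcv-full and mmdetection are importable and see the GPU.
import torch
import torchvision
import mmcv
import mmdet

print('torch      :', torch.__version__, '| CUDA available:', torch.cuda.is_available())
print('torchvision:', torchvision.__version__)
print('mmcv-full  :', mmcv.__version__)
print('mmdet      :', mmdet.__version__)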
BigDetection is built from 3 source datasets, whose train/val data can be downloaded from their official websites (Objects365, OpenImages v6, LVIS v1.0). All datasets should be placed under $bigdetection/data/ as shown below. The synsets (600 class names in total) of the BigDetection dataset can be downloaded here: bigdetection_synsets. Contact us at lkcai20@fudan.edu.cn to get access to our pre-processed annotation files.
bigdetection/data
└── BigDetection
    ├── annotations
    │   ├── bigdet_obj_train.json
    │   ├── bigdet_oid_train.json
    │   ├── bigdet_lvis_train.json
    │   ├── bigdet_val.json
    │   └── cas_weights.json
    ├── train
    │   ├── Objects365
    │   ├── OpenImages
    │   └── LVIS
    └── val
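For orientation, the sketch below shows how a COCO-style dataset placed at this location is typically wired into an mmdetection 2.x data config. It is an illustration only; the dataset types, sampling weights and pipelines actually used are defined in the configs under configs/BigDetection/:
# Illustrative mmdetection-style data settings (not the repo's actual config values).
data_root = 'data/BigDetection/'
data = dict(
    samples_per_gpu=2,
    workers_per_gpu=2,
    train=dict(
        type='CocoDataset',
        ann_file=data_root + 'annotations/bigdet_obj_train.json',
        img_prefix=data_root + 'train/Objects365/'),
    val=dict(
        type='CocoDataset',
        ann_file=data_root + 'annotations/bigdet_val.json',
        img_prefix=data_root + 'val/'),
    test=dict(
        type='CocoDataset',
        ann_file=data_root + 'annotations/bigdet_val.json',
        img_prefix=data_root + 'val/'))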
To train a detector with pre-trained models, run:
# multi-gpu training
tools/dist_train.sh <CONFIG_FILE> <GPU_NUM> --cfg-options load_from=<PRETRAIN_MODEL>
Pre-training
To pre-train CBNetV2 with a Swin-Base backbone on BigDetection using 8 GPUs, run the following (<PRETRAIN_MODEL> should be the pre-trained checkpoint of the Swin-Base Transformer backbone: model):
tools/dist_train.sh configs/BigDetection/cbnetv2/htc_cbv2_swin_base_giou_4conv1f_adamw_bigdet.py 8 \
    --cfg-options load_from=<PRETRAIN_MODEL>
To pre-train a Deformable-DETR with a ResNet-50 backbone on BigDetection, run:
tools/dist_train.sh configs/BigDetection/deformable_detr/deformable_detr_r50_16x2_8x_bigdet.py 8
Fine-tuning
To fine-tune a BigDetection pre-trained CBNetV2 (with a Swin-Base backbone) on COCO, run the following (<PRETRAIN_MODEL> should be the BigDetection pre-trained checkpoint of CBNetV2: model):
tools/dist_train.sh configs/BigDetection/cbnetv2/htc_cbv2_swin_base_giou_4conv1f_adamw_20e_coco.py 8 \
    --cfg-options load_from=<PRETRAIN_MODEL>
To evaluate a detector with pre-trained checkpoints, run:
tools/dist_test.sh <CONFIG_FILE> <CHECKPOINT> <GPU_NUM> --eval bbox
BigDetection evaluation
To evaluate pre-trained CBNetV2 on BigDetection validation, run:
tools/dist_test.sh configs/BigDetection/cbnetv2/htc_cbv2_swin_base_giou_4conv1f_adamw_bigdet.py \
    <BIGDET_PRETRAIN_CHECKPOINT> 8 --eval bbox
COCO evaluation
To evaluate COCO-finetuned CBNetV2 on COCO validation, run:
# without test-time-augmentation
tools/dist_test.sh configs/BigDetection/cbnetv2/htc_cbv2_swin_base_giou_4conv1f_adamw_20e_coco.py \
    <COCO_FINETUNE_CHECKPOINT> 8 --eval bbox mask
# with test-time-augmentation
tools/dist_test.sh configs/BigDetection/cbnetv2/htc_cbv2_swin_base_giou_4conv1f_adamw_20e_coco_tta.py \
    <COCO_FINETUNE_CHECKPOINT> 8 --eval bbox mask
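Besides the distributed test scripts, single-image inference with any of the checkpoints above can be run through mmdetection's high-level API. The snippet below is a minimal sketch; the checkpoint filename and demo.jpg are placeholders for whatever you downloaded:
# Minimal single-image inference example (paths are placeholders).
from mmdet.apis import init_detector, inference_detector, show_result_pyplot

config_file = 'configs/BigDetection/cbnetv2/htc_cbv2_swin_base_giou_4conv1f_adamw_20e_coco.py'
checkpoint_file = 'htc_cbv2_swin_base_coco.pth'  # a downloaded COCO-finetuned checkpoint

model = init_detector(config_file, checkpoint_file, device='cuda:0')
result = inference_detector(model, 'demo.jpg')   # per-class boxes (and masks for HTC models)
show_result_pyplot(model, 'demo.jpg', result, score_thr=0.3)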
Other configurations based on Detectron2 can be found in the detectron2-projects directory.
If you use our dataset or pretrained models in your research, please kindly consider citing the following paper:
@article{bigdetection2022,
  title={BigDetection: A Large-scale Benchmark for Improved Object Detector Pre-training},
  author={Likun Cai and Zhi Zhang and Yi Zhu and Li Zhang and Mu Li and Xiangyang Xue},
  journal={arXiv preprint arXiv:2203.13249},
  year={2022}
}
See CONTRIBUTING for more information.
This project is licensed under the Apache-2.0 License.
We thank the authors of mmdetection and CBNetV2 for releasing their code to the object detection research community.
