[Detectron2] Installation and Training

Detectron2 is Facebook AI Research's next generation software system that implements state-of-the-art object detection algorithms. It is a ground-up rewrite of the previous version, Detectron, and it originates from maskrcnn-benchmark.

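I spent quite a while getting familiar with installing Detectron2 and training with it, and ran into a number of problems along the way. Here is a short summary.

Online documentation: detectron2

Installation

Detectron2 can be installed in several ways; here I build and install it locally from source.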

Requirements

Linux or macOS with Python ≥ 3.6
PyTorch ≥ 1.4
torchvision that matches the PyTorch installation. You can install them together at pytorch.org to make sure of this.
OpenCV, optional, needed by demo and visualization
pycocotools: pip install cython; pip install -U 'git+https://github.com/cocodataset/cocoapi.git#subdirectory=PythonAPI'

Installing from source

$ git clone https://github.com/facebookresearch/detectron2.git
$ python -m pip install -e detectron2
...
...
Installing collected packages: detectron2
Found existing installation: detectron2 0.1.3
Uninstalling detectron2-0.1.3:
Successfully uninstalled detectron2-0.1.3
Running setup.py develop for detectron2
Successfully installed detectron2
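
After the install finishes, it is worth checking that the interpreter you plan to train with can actually see the packages. The snippet below is only a minimal sketch: the helper name `is_importable` is my own, and it checks that a module can be located, not that its compiled CUDA ops work.

```python
import importlib.util


def is_importable(module_name):
    """Return True if Python can locate the named top-level module.

    find_spec only searches for the module; it does not import it,
    so this says nothing about whether the compiled ops load correctly.
    """
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # Packages listed in the requirements above, plus detectron2 itself.
    for name in ("torch", "torchvision", "detectron2"):
        status = "found" if is_importable(name) else "NOT found"
        print(name, status)
```

If `detectron2` is reported as not found, the `pip install -e` step was most likely run with a different Python environment than the one running this check.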

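Training

Detectron2 ships many usage examples in its tools and demo folders, and the configs folder contains configurations for many algorithms.

Below I train Faster R-CNN + ResNet-50 on the PASCAL VOC dataset.

Configuring the dataset

Unpack the VOC 07/12 datasets and point the environment variable DETECTRON2_DATASETS at the directory that contains them.
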
$ ls
VOC2007 VOC2012
$ export DETECTRON2_DATASETS=...
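
Before starting a long run, the directory layout can be sanity-checked with a few lines of Python. This is a sketch under assumptions: the sub-directory names follow what I understand detectron2's datasets/README.md to describe for Pascal VOC, the fallback to `datasets` mirrors what I believe is the library default when the variable is unset, and `missing_voc_dirs` is a hypothetical helper, not part of detectron2.

```python
import os
from pathlib import Path

# Assumed layout (from detectron2's datasets/README.md for Pascal VOC):
#   VOC20{07,12}/Annotations, VOC20{07,12}/ImageSets/Main, VOC20{07,12}/JPEGImages
EXPECTED_SUBDIRS = ("Annotations", "ImageSets/Main", "JPEGImages")


def missing_voc_dirs(root, years=("2007", "2012")):
    """Return the expected VOC directories that do not exist under root."""
    root = Path(root)
    missing = []
    for year in years:
        for sub in EXPECTED_SUBDIRS:
            path = root / ("VOC" + year) / sub
            if not path.is_dir():
                missing.append(str(path))
    return missing


if __name__ == "__main__":
    # Fall back to "datasets" when the variable is not set (assumed default).
    root = os.environ.get("DETECTRON2_DATASETS", "datasets")
    problems = missing_voc_dirs(root)
    if problems:
        print("Missing directories:")
        for p in problems:
            print("  " + p)
    else:
        print("VOC2007 and VOC2012 look complete under " + root)
```

Run it in the same shell where DETECTRON2_DATASETS was exported, since the variable is only visible to processes started from that shell.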

Training

I am only using a single GPU here, so the batch size and learning rate have to be set explicitly on the command line.

$ cd tools
$ ./train_net.py --config-file ../configs/PascalVOC-Detection/faster_rcnn_R_50_C4.yaml --num-gpus 1 SOLVER.IMS_PER_BATCH 2 SOLVER.BASE_LR 0.0025
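
The two overrides in the command are consistent with the linear scaling rule: when the batch size shrinks, the base learning rate shrinks by the same factor. The sketch below assumes the config was tuned for 16 images per batch with a base learning rate of 0.02 (my assumption about the reference values, not something stated in this post), and `scaled_base_lr` is a hypothetical helper name.

```python
def scaled_base_lr(ref_lr, ref_batch, new_batch):
    """Scale the learning rate linearly with the batch size."""
    if ref_batch <= 0 or new_batch <= 0:
        raise ValueError("batch sizes must be positive")
    return ref_lr * new_batch / ref_batch


if __name__ == "__main__":
    # Assumed reference setting: 16 images per batch at base LR 0.02.
    # Single-GPU setting from the command above: 2 images per batch.
    lr = scaled_base_lr(ref_lr=0.02, ref_batch=16, new_batch=2)
    print("SOLVER.IMS_PER_BATCH 2 SOLVER.BASE_LR", lr)
```

With a batch that is 8 times smaller, each iteration also sees 8 times fewer images, so the number of iterations in the config may need to grow accordingly to cover the same number of epochs.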

Problems

The following error appeared during training:

[06/14 17:31:31 d2.engine.train_loop]: Starting training from iteration 0
ERROR [06/14 17:31:32 d2.engine.train_loop]: Exception during training:
Traceback (most recent call last):
...
...
roi_align.py", line 20, in forward
input, roi, spatial_scale, output_size[0], output_size[1], sampling_ratio, aligned
RuntimeError: CUDA error: invalid device function
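Segmentation fault (core dumped)

After a long search online, I found that the cause was a version mismatch between the system CUDA and the cudatoolkit package installed alongside PyTorch. See: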

RuntimeError: CUDA error: invalid device function ROIAlign_forward_cuda #62

Which version of CUDA to install

I reinstalled PyTorch and cudatoolkit:

conda install pytorch torchvision cudatoolkit=10.0 -c pytorch

After rebuilding Detectron2, the problem was solved.
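
To catch this kind of mismatch before training, the CUDA version PyTorch was built with can be compared against the release reported by nvcc. The sketch below only does the string comparison; the helper names are my own, and the sample nvcc line is an illustrative string, not output captured from my machine.

```python
import re


def nvcc_release(nvcc_output):
    """Extract (major, minor) from the text printed by `nvcc --version`."""
    match = re.search(r"release\s+(\d+)\.(\d+)", nvcc_output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def versions_match(torch_cuda, nvcc_output):
    """Compare a PyTorch CUDA version string with nvcc's reported release."""
    release = nvcc_release(nvcc_output)
    if release is None or not torch_cuda:
        return False
    parts = torch_cuda.split(".")
    if len(parts) < 2:
        return False
    # Only major.minor is compared; any patch component is ignored.
    return (int(parts[0]), int(parts[1])) == release


if __name__ == "__main__":
    # Illustrative inputs: PyTorch built for 10.0, system compiler at 10.1.
    sample_nvcc = "Cuda compilation tools, release 10.1, V10.1.243"
    print("match:", versions_match("10.0", sample_nvcc))
```

In practice the first argument would come from `torch.version.cuda` and the second from running `nvcc --version`. Detectron2 also provides `python -m detectron2.utils.collect_env`, which prints the relevant versions in one place.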