[Dataset] PASCAL VOC 2007

Related implementation: zjykzj/vocdev

Introduction

The PASCAL VOC 2007 dataset contains 20 object classes, grouped into 4 broad categories:

  • Person: person
  • Animal: bird, cat, cow, dog, horse, sheep
  • Vehicle: aeroplane, bicycle, boat, bus, car, motorbike, train
  • Indoor: bottle, chair, dining table, potted plant, sofa, tv/monitor
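For use in code, the grouping above can be written as a plain dictionary. This is only a minimal sketch: the names `VOC_SUPER_CATEGORIES`, `ALL_CLASSES`, and `super_category_of` are hypothetical, and the class spellings are assumed to follow the annotation files (for example `diningtable` rather than `dining table`).

```python
# Hypothetical constant: the 4 broad categories and their 20 classes,
# spelled the way the annotation files spell them (an assumption of this sketch).
VOC_SUPER_CATEGORIES = {
    "person": ["person"],
    "animal": ["bird", "cat", "cow", "dog", "horse", "sheep"],
    "vehicle": ["aeroplane", "bicycle", "boat", "bus", "car", "motorbike", "train"],
    "indoor": ["bottle", "chair", "diningtable", "pottedplant", "sofa", "tvmonitor"],
}

# Flat, alphabetically sorted list of every class name
ALL_CLASSES = sorted(
    name for members in VOC_SUPER_CATEGORIES.values() for name in members
)


def super_category_of(class_name):
    """Return the broad category a class belongs to, or None if it is unknown."""
    for group, members in VOC_SUPER_CATEGORIES.items():
        if class_name in members:
            return group
    return None


print(len(ALL_CLASSES), super_category_of("cow"))
```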

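The PASCAL VOC 2007 dataset is mainly used for classification and detection tasks; it also provides data for segmentation and for person layout (body part detection).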

Classes

Extract the 20 class names from the annotation files and sort them alphabetically.

import os

import xmltodict


def find_all_cates(annotation_dir):
    """
    Collect every class name found in the annotation files, sorted alphabetically.
    """
    cate_set = set()
    for name in os.listdir(annotation_dir):
        annotation_path = os.path.join(annotation_dir, name)
        with open(annotation_path, 'rb') as f:
            xml_dict = xmltodict.parse(f)

        # xmltodict yields a dict for a single <object> and a list for several
        objects = xml_dict['annotation']['object']
        if isinstance(objects, dict):
            objects = [objects]

        for obj in objects:
            cate_set.add(obj['name'])

    # Sort alphabetically
    return sorted(cate_set)

The result:

['aeroplane', 'bicycle', 'bird', 'boat', 'bottle', 'bus', 'car', 'cat', 'chair', 'cow', 'diningtable', 'dog', 'horse', 'motorbike', 'person', 'pottedplant', 'sheep', 'sofa', 'train', 'tvmonitor']
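
Because the list is sorted, it gives a stable ordering that can be turned into integer labels. A minimal sketch, assuming the labels simply follow list order starting at 0; the names `VOC_CLASSES`, `class_to_idx`, and `idx_to_class` are illustrative and not part of the original code:

```python
# The sorted class names as produced by find_all_cates
VOC_CLASSES = [
    'aeroplane', 'bicycle', 'bird', 'boat', 'bottle', 'bus', 'car', 'cat',
    'chair', 'cow', 'diningtable', 'dog', 'horse', 'motorbike', 'person',
    'pottedplant', 'sheep', 'sofa', 'train', 'tvmonitor',
]

# Assumption of this sketch: label = position in the sorted list, starting at 0
class_to_idx = {name: idx for idx, name in enumerate(VOC_CLASSES)}
idx_to_class = {idx: name for name, idx in class_to_idx.items()}

print(class_to_idx['person'], idx_to_class[0])
```

Keeping both directions of the mapping makes it easy to encode annotation names as labels and to decode predictions back into names.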

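Data

  1. For the annotation rules, see the Annotation Guidelines
  2. For detailed train/val counts, see the Database Statistics

The train/val/test splits are provided through the annotation files. The dataset is divided roughly 50% train/val and 50% test. In total there are 9,963 images containing 24,640 annotated objects. The train/val portion breaks down as follows:

  • Training data: 2,501 images, 6,301 objects
  • Validation data: 2,510 images, 6,307 objects
  • Training + validation data: 5,011 images, 12,608 objects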
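
The figures above are consistent with one another, and the size of the test split follows from them by subtraction. A small check, assuming the overall totals cover exactly the train/val and test splits (variable names are illustrative):

```python
# (images, objects) per split, taken from the figures above
splits = {"train": (2501, 6301), "val": (2510, 6307)}
total_images, total_objects = 9963, 24640

trainval_images = sum(images for images, _ in splits.values())
trainval_objects = sum(objects for _, objects in splits.values())

# Assumption: whatever is not train/val belongs to the test split
test_images = total_images - trainval_images
test_objects = total_objects - trainval_objects

print("trainval:", trainval_images, trainval_objects)
print("test:", test_images, test_objects)
print("test share of images: %.3f" % (test_images / total_images))
```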

# -*- coding: utf-8 -*-

"""
@author: zj
@file: pascal-voc.py
@time: 2020-01-19
"""

import cv2
import numpy as np
from torchvision.datasets import VOCDetection


def show_data():
    dataset = VOCDetection('./data', year='2007', image_set='trainval')

    for i in range(10):
        img, target = dataset[i]
        # torchvision reads images with PIL in RGB order, while OpenCV works in BGR, so convert first
        img = cv2.cvtColor(np.array(img), cv2.COLOR_RGB2BGR)

        print(img.shape)
        print(target)

        # A single object may be parsed as a dict rather than a list, so normalise to a list
        objects = target['annotation']['object']
        if isinstance(objects, dict):
            objects = [objects]

        for obj in objects:
            print(obj)
            bndbox = obj['bndbox']
            top_left = (int(bndbox['xmin']), int(bndbox['ymin']))
            bottom_right = (int(bndbox['xmax']), int(bndbox['ymax']))
            # Draw the bounding box in green and the class name in red
            cv2.rectangle(img, top_left, bottom_right, (0, 255, 0), thickness=1)
            cv2.putText(img, obj['name'], top_left, cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 0, 255))

        cv2.imshow(target['annotation']['filename'], img)
        cv2.waitKey(0)


if __name__ == '__main__':
    show_data()

Download

Training data

Test data

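Parsing the annotation data

Reference: [python] Reading XML files

In the VOC dataset, images are stored in the JPEGImages folder and annotations in the Annotations folder.

The following code parses the annotations and crops every annotated object in the train/val/test data out of its original image.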

# -*- coding: utf-8 -*-

"""
@author: zj
@file: batch_xml.py
@time: 2019-12-07
"""

import os
# cElementTree was removed in Python 3.9; ElementTree offers the same API
import xml.etree.ElementTree as ET

import cv2

train_xml_dir = '/home/zj/data/PASCAL-VOC/2007/train/Annotations'
train_jpeg_dir = '/home/zj/data/PASCAL-VOC/2007/train/JPEGImages'

test_xml_dir = '/home/zj/data/PASCAL-VOC/2007/test/Annotations'
test_jpeg_dir = '/home/zj/data/PASCAL-VOC/2007/test/JPEGImages'

# Output directories for the cropped objects
train_imgs_dir = '/home/zj/data/PASCAL-VOC/2007/train_imgs'
test_imgs_dir = '/home/zj/data/PASCAL-VOC/2007/test_imgs'


def parse_xml(xml_path):
    root = ET.parse(xml_path).getroot()

    img_name = root.findtext('filename', default='')
    obj_list = list()
    bndbox_list = list()

    # Visit every <object> directly under the root and collect its class name and box
    for obj_node in root.findall('object'):
        node_bndbox = obj_node.find('bndbox')
        # Look the coordinates up by tag name rather than by child position
        bndbox = tuple(int(node_bndbox.findtext(tag)) for tag in ('xmin', 'ymin', 'xmax', 'ymax'))

        obj_list.append(obj_node.findtext('name', default=''))
        bndbox_list.append(bndbox)

    return img_name, obj_list, bndbox_list


def batch_parse(xml_dir, jpeg_dir, imgs_dir):
    xml_list = os.listdir(xml_dir)
    jpeg_set = set(os.listdir(jpeg_dir))

    for xml_name in xml_list:
        xml_path = os.path.join(xml_dir, xml_name)
        img_name, obj_list, bndbox_list = parse_xml(xml_path)
        print(img_name, obj_list, bndbox_list)

        if img_name not in jpeg_set:
            continue
        src = cv2.imread(os.path.join(jpeg_dir, img_name))
        if src is None:
            # cv2.imread returns None when the file cannot be decoded
            continue

        for obj_name, bndbox in zip(obj_list, bndbox_list):
            # One sub-folder per class, created on first use (along with imgs_dir)
            obj_dir = os.path.join(imgs_dir, obj_name)
            os.makedirs(obj_dir, exist_ok=True)
            obj_path = os.path.join(obj_dir, '%s-%s-%d-%d-%d-%d.png' % (
                img_name, obj_name, bndbox[0], bndbox[1], bndbox[2], bndbox[3]))

            # Crop rows ymin:ymax and columns xmin:xmax
            res = src[bndbox[1]:bndbox[3], bndbox[0]:bndbox[2]]
            cv2.imwrite(obj_path, res)


if __name__ == '__main__':
    batch_parse(train_xml_dir, train_jpeg_dir, train_imgs_dir)
    batch_parse(test_xml_dir, test_jpeg_dir, test_imgs_dir)

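Parsing the XML file yields the image name together with the class name and bounding box of every annotated object. The image is then read with OpenCV, each object is cropped out, and the crop is saved in the folder for its class.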
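
The parse-then-crop flow can be tried without the dataset on disk. The sketch below is an illustration only: `SAMPLE_XML` is a hand-written annotation with made-up values, `parse_annotation` is a hypothetical helper, and a blank NumPy array stands in for the image that OpenCV would normally load.

```python
import xml.etree.ElementTree as ET

import numpy as np

# Hand-written sample in the VOC annotation layout; all values are made up
SAMPLE_XML = """
<annotation>
    <filename>example.jpg</filename>
    <object>
        <name>dog</name>
        <bndbox>
            <xmin>2</xmin>
            <ymin>1</ymin>
            <xmax>6</xmax>
            <ymax>4</ymax>
        </bndbox>
    </object>
    <object>
        <name>person</name>
        <bndbox>
            <xmin>0</xmin>
            <ymin>0</ymin>
            <xmax>3</xmax>
            <ymax>5</ymax>
        </bndbox>
    </object>
</annotation>
"""


def parse_annotation(xml_text):
    """Return the image name and a list of (class name, (xmin, ymin, xmax, ymax))."""
    root = ET.fromstring(xml_text.strip())
    img_name = root.findtext("filename")
    boxes = []
    for obj in root.findall("object"):
        bndbox = obj.find("bndbox")
        box = tuple(int(bndbox.findtext(tag)) for tag in ("xmin", "ymin", "xmax", "ymax"))
        boxes.append((obj.findtext("name"), box))
    return img_name, boxes


img_name, boxes = parse_annotation(SAMPLE_XML)

# Stand-in for cv2.imread: an 8x8 image with 3 channels (height, width, channels)
image = np.zeros((8, 8, 3), dtype=np.uint8)

# Rows are indexed by y and columns by x, so the slice is [ymin:ymax, xmin:xmax]
crops = {name: image[ymin:ymax, xmin:xmax] for name, (xmin, ymin, xmax, ymax) in boxes}

for name, crop in crops.items():
    print(img_name, name, crop.shape)
```

Note the index order in the slice: image arrays are laid out as rows then columns, so the y range comes first even though the annotation lists x first.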

Citation

If you use the VOC 2007 data, you can cite the following reference:

@misc{pascal-voc-2007,
author = "Everingham, M. and Van~Gool, L. and Williams, C. K. I. and Winn, J. and Zisserman, A.",
title = "The {PASCAL} {V}isual {O}bject {C}lasses {C}hallenge 2007 {(VOC2007)} {R}esults",
howpublished = "http://www.pascal-network.org/challenges/VOC/voc2007/workshop/index.html"}
