## Abstract

Data augmentation is an effective technique for improving the accuracy of modern image classifiers. However, current data augmentation implementations are manually designed. In this paper, we describe a simple procedure called AutoAugment to automatically search for improved data augmentation policies. In our implementation, we have designed a search space where a policy consists of many sub-policies, one of which is randomly chosen for each image in each mini-batch. A sub-policy consists of two operations, each operation being an image processing function such as translation, rotation, or shearing, and the probabilities and magnitudes with which the functions are applied. We use a search algorithm to find the best policy such that the neural network yields the highest validation accuracy on a target dataset. Our method achieves state-of-the-art accuracy on CIFAR-10, CIFAR-100, SVHN, and ImageNet (without additional data). On ImageNet, we attain a Top-1 accuracy of 83.5% which is 0.4% better than the previous record of 83.1%. On CIFAR-10, we achieve an error rate of 1.5%, which is 0.6% better than the previous state-of-the-art. Augmentation policies we find are transferable between datasets. The policy learned on ImageNet transfers well to achieve significant improvements on other datasets, such as Oxford Flowers, Caltech-101, Oxford-IIIT Pets, FGVC Aircraft, and Stanford Cars.

## Introduction

Intuitively, data augmentation is used to teach a model about invariances in the data domain: classifying an object is often insensitive to horizontal flips or translation. Network architectures can also be used to hardcode invariances: convolutional networks bake in translation invariance [16, 32, 25, 29]. However, using data augmentation to incorporate potential invariances can be easier than hardcoding invariances into the model architecture directly.

## AutoAugment Implementation

### Overall Architecture

1. Search space: which basic image augmentation operations to include, how many hyperparameters each operation has, and so on
2. Search algorithm: the paper uses reinforcement learning as the search algorithm (i.e., an RNN controller that steers how policies are sampled)

1. The search algorithm samples a data augmentation policy $$S$$;
2. A fixed network architecture and training procedure are trained with that policy;
3. After training, the validation accuracy $$R$$ is measured;
4. $$R$$ is used as a reward to update the search algorithm, and the process repeats from step 1.
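The four steps above can be sketched as a single loop. In this minimal, self-contained illustration, `RandomController` and `train_and_validate` are hypothetical stubs standing in for the paper's RNN controller (trained with reinforcement learning) and for the actual child-model training:

```python
import random

class RandomController:
    """Hypothetical stand-in: samples policies uniformly.
    The paper instead uses an RNN controller updated with PPO."""
    def sample(self):
        # A toy "policy": two operation names drawn from a small set.
        return [random.choice(["Rotate", "Equalize", "ShearX"]) for _ in range(2)]

    def update(self, policy, reward):
        pass  # the real controller runs a policy-gradient step here

def train_and_validate(policy):
    """Stub for training the fixed child network with policy S
    and returning its validation accuracy R (here: random)."""
    return random.random()

def search(controller, num_samples=15000):
    best_policy, best_reward = None, -1.0
    for _ in range(num_samples):
        policy = controller.sample()          # 1. sample a policy S
        reward = train_and_validate(policy)   # 2.-3. train, measure accuracy R
        controller.update(policy, reward)     # 4. reward the controller, repeat
        if reward > best_reward:
            best_policy, best_reward = policy, reward
    return best_policy, best_reward
```

In the paper each iteration is a full child-model training run, which is what makes the search expensive; the loop structure itself is this simple.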

### Search Space

#### Basic Operations

ShearX/Y, TranslateX/Y, Rotate, AutoContrast, Invert, Equalize, Solarize, Posterize, Contrast, Color, Brightness, Sharpness, Cutout, SamplePairing
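For concreteness, a sub-policy can be represented as two (operation, probability, magnitude) triples drawn from the operations above. The sketch below is a hypothetical representation; the `transforms` mapping would hold real image functions (e.g. PIL-based implementations of the operations listed above):

```python
import random

# A sub-policy: two (operation name, probability, magnitude) triples.
# Values here are illustrative, not taken from a learned policy.
SUBPOLICY = [("ShearX", 0.9, 4), ("Invert", 0.2, 3)]

def apply_subpolicy(image, subpolicy, transforms):
    """Apply each operation of the sub-policy in order,
    each with its own probability and magnitude."""
    for name, prob, magnitude in subpolicy:
        if random.random() < prob:
            image = transforms[name](image, magnitude)
    return image
```

Note that the randomness is part of the policy itself: even with a fixed sub-policy, different images receive different effective transformations.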

### Search Algorithm

…

#### Summary

1. The controller emits 30 softmax predictions per policy: 5 sub-policies, each with 2 operations, and for each operation an operation type, a probability, and a magnitude (5 × 2 × 3 = 30)
2. For each dataset, the controller samples roughly 15,000 policies
3. After the search, the 5 policies with the best validation accuracy (25 sub-policies in total) are concatenated into the final augmentation policy for that dataset
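At training time, the concatenated final policy is used exactly as the abstract describes: each image in each mini-batch gets one of the 25 sub-policies chosen uniformly at random. A minimal sketch (the `transforms` mapping of real image operations is assumed):

```python
import random

def augment_batch(batch, policy, transforms):
    """Apply the final AutoAugment policy (a list of sub-policies,
    25 after concatenating the 5 best) to a mini-batch: each image
    gets one sub-policy chosen uniformly at random."""
    out = []
    for image in batch:
        subpolicy = random.choice(policy)
        for name, prob, magnitude in subpolicy:
            if random.random() < prob:
                image = transforms[name](image, magnitude)
        out.append(image)
    return out
```

Because the sub-policy is re-sampled per image and per epoch, a single image is seen under many different augmentations over the course of training.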

## Training and Results

1. AutoAugment-direct: search directly on a target dataset to obtain a set of augmentation policies
2. AutoAugment-transfer: apply policies found on a different dataset to the target dataset, to test the transferability of the learned augmentation policies

### AutoAugment-direct

#### Training Setup

1. For CIFAR-10
    1. Model: Wide-ResNet-40-2 (40 layers, widening factor 2)
    2. Epochs: 120
    3. Weight decay: 1e-4
    4. Learning rate: 0.01
    5. Schedule: cosine decay with one annealing cycle
2. For ImageNet
    1. Model: Wide-ResNet-40-2 (40 layers, widening factor 2)
    2. Epochs: 200
    3. Weight decay: 1e-5
    4. Learning rate: 0.1
    5. Schedule: cosine decay with one annealing cycle
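The cosine schedule with a single annealing cycle used in both setups is $$\eta(t) = \frac{\eta_0}{2}\left(1 + \cos\frac{\pi t}{T}\right)$$, where $$\eta_0$$ is the initial learning rate and $$T$$ the total number of steps. A minimal sketch:

```python
import math

def cosine_lr(step, total_steps, base_lr):
    """Cosine learning-rate decay with one annealing cycle:
    starts at base_lr, ends at 0 after total_steps."""
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * step / total_steps))
```

For example, with `base_lr=0.01` and 120 epochs (the CIFAR-10 setup above), the rate falls to half its initial value at the midpoint and reaches 0 at the end.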

#### Analysis

1. On CIFAR-10, AutoAugment mostly picks color-based transforms such as Equalize, AutoContrast, Color, and Brightness, and rarely uses geometric operations like ShearX/Y
2. On SVHN, AutoAugment mostly picks transforms such as Invert, Equalize, ShearX/Y, and Rotate. The paper's explanation is that house numbers in the dataset are often naturally sheared and tilted, so geometric augmentation helps the model become invariant to such geometric transformations
3. On ImageNet, both color-based and geometric transforms are common