## Overall Architecture

All convolutions not marked with “V” in the figures are same-padded, meaning that their output grid matches the size of their input. Convolutions marked with “V” are valid-padded, meaning that the input patch of each unit is fully contained in the previous layer and the grid size of the output activation map is reduced accordingly.
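The effect of the two padding modes on the grid size can be sketched in a few lines of Python. This is a minimal illustration, not code from any framework: the helper name `conv_output_size` is hypothetical, and for strides larger than 1 the same-padded case assumes the common convention that the output size is the input size divided by the stride, rounded up.

```python
import math


def conv_output_size(size: int, kernel: int, stride: int = 1, padding: str = "same") -> int:
    """Spatial output size of a convolution or pooling layer along one axis."""
    if padding == "same":
        # Same padding: the grid size depends only on the stride
        # (assumed convention: ceil(size / stride)).
        return math.ceil(size / stride)
    if padding == "valid":
        # Valid padding ("V"): the kernel must fit entirely inside the input.
        return (size - kernel) // stride + 1
    raise ValueError(f"unknown padding mode: {padding!r}")


# A same-padded 3x3 convolution keeps a 35x35 grid at 35x35.
print(conv_output_size(35, kernel=3, stride=1, padding="same"))
# Valid-padded 3x3 layers with stride 2 shrink the grid: 35 -> 17 -> 8.
print(conv_output_size(35, kernel=3, stride=2, padding="valid"))
print(conv_output_size(17, kernel=3, stride=2, padding="valid"))
```

The two valid-padded calls reproduce the 35 to 17 and 17 to 8 reductions that appear in the block sizes listed in the following sections.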

Note: the output sizes shown in the figure above refer to the Inception-ResNet-v1 architecture.

## Stem

• $$Input = 3\times 299\times 299$$
• $$Output = 384\times 35\times 35$$
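To see how a 299×299 input can end up as a 35×35 grid, the spatial size can be traced layer by layer. The layer list below is an assumption based on the stem figure of the Inception-v4 / Inception-ResNet paper (it is not given in this text), and only the layers that change the grid size matter for the result; `STEM_LAYERS` and `trace_stem` are hypothetical names.

```python
def valid_out(size: int, kernel: int, stride: int = 1) -> int:
    """Output size of a valid-padded ("V") layer along one axis."""
    return (size - kernel) // stride + 1


# (description, kernel, stride, padding); assumed stem layout, see lead-in.
STEM_LAYERS = [
    ("3x3 conv, stride 2, V", 3, 2, "valid"),
    ("3x3 conv, V", 3, 1, "valid"),
    ("3x3 conv", 3, 1, "same"),
    ("3x3 max-pool and 3x3 conv in parallel, stride 2, V", 3, 2, "valid"),
    ("3x3 conv, V (closing each of two parallel branches)", 3, 1, "valid"),
    ("3x3 conv and max-pool in parallel, stride 2, V", 3, 2, "valid"),
]


def trace_stem(size: int = 299) -> list:
    """Return the grid size after each size-relevant stem layer."""
    sizes = [size]
    for _description, kernel, stride, padding in STEM_LAYERS:
        if padding == "valid":
            size = valid_out(size, kernel, stride)
        # Same-padded stride-1 layers leave the grid size unchanged.
        sizes.append(size)
    return sizes


print(trace_stem())  # [299, 149, 147, 147, 73, 71, 35]
```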

## Inception-ResNet-A

• $$Input = 384\times 35\times 35$$
• $$Output = 384\times 35\times 35$$

## Reduction-A

• $$Input = 384\times 35\times 35$$
• $$Output = 1152\times 17\times 17$$
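The jump from 384 to 1152 channels is consistent with a reduction block that concatenates three parallel stride-2 branches. The sketch below assumes the layout from the paper: a max-pool branch that keeps the input channels, a convolution branch with `n` filters, and a convolution chain ending in `m` filters, with `n = m = 384` for Inception-ResNet-v2. These values and the function name `reduction_a_output` are assumptions not stated in this text.

```python
def reduction_a_output(in_channels: int, in_size: int, n: int = 384, m: int = 384) -> tuple:
    """Output shape (channels, height, width) of an assumed Reduction-A block."""
    # Every branch ends in a valid-padded 3x3 layer with stride 2.
    out_size = (in_size - 3) // 2 + 1
    # Channel-wise concatenation: max-pool branch + n-filter conv + m-filter conv chain.
    out_channels = in_channels + n + m
    return out_channels, out_size, out_size


print(reduction_a_output(384, 35))  # (1152, 17, 17)
```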

## Inception-ResNet-B

• $$Input = 1152\times 17\times 17$$
• $$Output = 1154\times 17\times 17$$

## Reduction-B

• $$Input = 1154\times 17\times 17$$
• $$Output = 2146\times 8\times 8$$

## Inception-ResNet-C

• $$Input = 2146\times 8\times 8$$
• $$Output = 2048\times 8\times 8$$
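The shapes listed above can be collected in a small table and checked for consistency, so that each block's input matches the previous block's output. This is only a bookkeeping sketch using the numbers from this text; `STAGES` and `check_chain` are hypothetical names.

```python
# (block name, input shape, output shape), shapes as (channels, height, width).
STAGES = [
    ("Stem", (3, 299, 299), (384, 35, 35)),
    ("Inception-ResNet-A", (384, 35, 35), (384, 35, 35)),
    ("Reduction-A", (384, 35, 35), (1152, 17, 17)),
    ("Inception-ResNet-B", (1152, 17, 17), (1154, 17, 17)),
    ("Reduction-B", (1154, 17, 17), (2146, 8, 8)),
    ("Inception-ResNet-C", (2146, 8, 8), (2048, 8, 8)),
]


def check_chain(stages: list) -> tuple:
    """Verify that consecutive blocks line up and return the final output shape."""
    for (prev_name, _, prev_out), (name, cur_in, _) in zip(stages, stages[1:]):
        if prev_out != cur_in:
            raise ValueError(f"{prev_name} outputs {prev_out}, but {name} expects {cur_in}")
    return stages[-1][2]


print(check_chain(STAGES))  # (2048, 8, 8)
```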