d-li14 / SAN

Scale Adaptive Network

Official implementation of the Scale Adaptive Network (SAN), as described in "Learning to Learn Parameterized Classification Networks for Scalable Input Images" (ECCV'20) by Duo Li, Anbang Yao and Qifeng Chen, evaluated on the ILSVRC 2012 benchmark.

We present a meta-learning framework that dynamically parameterizes main networks conditioned on their input resolution at runtime, enabling efficient and flexible inference for arbitrarily switchable input resolutions.

Requirements

Dependency

PyTorch 1.0+
NVIDIA-DALI (in development, not recommended)

Dataset

Download the ImageNet dataset and move validation images to labeled subfolders. To do this, you can use the following script: https://raw.githubusercontent.com/soumith/imagenetloader.torch/master/valprep.sh
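
If you would rather not run a shell script, the same reorganisation can be sketched in Python. This is a minimal sketch and not the linked script: the function name `sort_val_images` and the `mapping` argument (validation file name to class folder name) are assumptions made for illustration, and you would need to build the mapping yourself from the ILSVRC 2012 validation ground truth.

```python
import shutil
from pathlib import Path


def sort_val_images(val_dir, mapping):
    """Move each validation image into a subfolder named after its class.

    val_dir: directory holding the flat list of validation images.
    mapping: dict of image file name -> class folder name (hypothetical
             input; build it from the validation ground-truth labels).
    Returns the number of files moved.
    """
    val_dir = Path(val_dir)
    moved = 0
    for file_name, class_name in mapping.items():
        src = val_dir / file_name
        if not src.is_file():
            continue  # skip entries that are missing or already moved
        dst_dir = val_dir / class_name
        dst_dir.mkdir(exist_ok=True)
        shutil.move(str(src), str(dst_dir / file_name))
        moved += 1
    return moved
```

Running it twice is harmless, since files that have already been moved are skipped.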

Pre-trained Models

Baseline (individually trained on each resolution)

ResNet-18

Resolution	Top-1 Acc. (%)	Download
224x224	70.974	Google Drive
192x192	69.754	Google Drive
160x160	68.482	Google Drive
128x128	66.360	Google Drive
96x96	62.560	Google Drive

ResNet-50

Resolution	Top-1 Acc. (%)	Download
224x224	77.150	Google Drive
192x192	76.406	Google Drive
160x160	75.312	Google Drive
128x128	73.526	Google Drive
96x96	70.610	Google Drive

MobileNetV2

Please visit my repository mobilenetv2.pytorch.

SAN

Architecture	Download
ResNet-18	Google Drive
ResNet-50	Google Drive
MobileNetV2	Google Drive

Training

ResNet-18/50

# Use -a meta_resnet18 or -a meta_resnet50.
# --sizes defaults to 224, 192, 160, 128, 96.
python imagenet.py \
    -a meta_resnet18 \
    -d <path-to-ILSVRC2012-data> \
    --epochs 120 \
    --lr-decay cos \
    -c <path-to-save-checkpoints> \
    --sizes <list-of-input-resolutions> \
    -j <num-workers> \
    --kd

MobileNetV2

# --sizes defaults to 224, 192, 160, 128, 96.
python imagenet.py \
    -a meta_mobilenetv2 \
    -d <path-to-ILSVRC2012-data> \
    --epochs 150 \
    --lr-decay cos \
    --lr 0.05 \
    --wd 4e-5 \
    -c <path-to-save-checkpoints> \
    --sizes <list-of-input-resolutions> \
    -j <num-workers> \
    --kd
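
The `--lr-decay cos` flag selects a cosine learning-rate schedule. As a rough illustration, the sketch below applies the common half-cosine formula, lr_t = 0.5 * lr_0 * (1 + cos(pi * t / T)), to the MobileNetV2 settings (`--lr 0.05`, `--epochs 150`). The function name `cosine_lr` is hypothetical, and per-epoch stepping without warm-up is an assumption; the schedule actually used by `imagenet.py` may differ in detail.

```python
import math


def cosine_lr(base_lr, epoch, total_epochs):
    """Half-cosine decay from base_lr at epoch 0 towards 0 at total_epochs."""
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * epoch / total_epochs))
```

Under these assumptions, `cosine_lr(0.05, 0, 150)` returns the full base rate and `cosine_lr(0.05, 75, 150)` returns half of it.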

Testing

Proxy Inference (default)

python imagenet.py \
    -a <arch> \
    -d <path-to-ILSVRC2012-data> \
    --resume <checkpoint-file> \
    --sizes <list-of-input-resolutions> \
    -e \
    -j <num-workers>

Arguments are:

checkpoint-file: a previously downloaded checkpoint file (see Pre-trained Models above).
list-of-input-resolutions: the resolutions to test, each using its own privatized BNs.

This reproduces Table 1 in the main paper and Table 5 in the supplementary materials.

Ideal Inference

Manually set the scale encoding here, which gives the left panel of Table 2 in the main paper.

Uncomment this line in the main script to enable post-hoc BN calibration, which gives the middle panel of Table 2 in the main paper.

Data-Free Ideal Inference

Manually set the scale encoding here and its corresponding shift here, then uncomment this line to replace the line above it, which gives Table 6 in the supplementary materials.

Comparison to MutualNet

MutualNet: Adaptive ConvNet via Mutual Learning from Network Width and Resolution was accepted to ECCV 2020 as an oral presentation, and its motivation closely coincides with that of SAN. Below we provide a head-to-head comparison of top-1 validation accuracy on ImageNet, based on the common MobileNetV2 backbone.

Method	Config (width-resolution)	MFLOPs	Top-1 Acc. (%)
MutualNet	1.0-224	300	73.0
SAN	1.0-224	300	72.86
MutualNet	0.9-224	269	72.4
SAN	1.0-208	270	72.42
MutualNet	1.0-192	221	71.9
SAN	1.0-192	221	72.22
MutualNet	0.9-192	198	71.5
SAN	1.0-176	195	71.63
MutualNet	0.75-192	154	70.2
SAN	1.0-160	154	71.16
MutualNet	0.9-160	138	69.9
SAN	1.0-144	133	69.80
MutualNet	1.0-128	99	67.8
SAN	1.0-128	99	69.14
MutualNet	0.85-128	84	66.1
SAN	1.0-112	82	66.59
MutualNet	0.7-128	58	64.3
SAN	1.0-96	56	65.07

We observe that SAN surpasses MutualNet at most computational budgets (7 of the 9 configurations above) by merely switching the input resolution, without further tuning the network width. More importantly, SAN can perform dynamic inference under a desired computational budget in one run, whereas MutualNet first builds a query table by running all possible configurations and then searches that table for the result.
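
Because SAN keeps the width fixed at 1.0 in this comparison, matching a compute budget reduces to choosing a single number, the input resolution. The sketch below illustrates that choice using only the SAN rows of the table above. The dictionary `SAN_CONFIGS` and the function name `pick_resolution` are illustrative assumptions and are not part of the released code.

```python
# SAN rows from the comparison table: resolution -> (MFLOPs, top-1 accuracy in %).
SAN_CONFIGS = {
    224: (300, 72.86),
    208: (270, 72.42),
    192: (221, 72.22),
    176: (195, 71.63),
    160: (154, 71.16),
    144: (133, 69.80),
    128: (99, 69.14),
    112: (82, 66.59),
    96: (56, 65.07),
}


def pick_resolution(budget_mflops):
    """Return the largest tabulated resolution whose MFLOPs fit the budget,
    or None if even the smallest configuration is too expensive."""
    feasible = [
        resolution
        for resolution, (mflops, _) in SAN_CONFIGS.items()
        if mflops <= budget_mflops
    ]
    return max(feasible) if feasible else None
```

For example, a budget of 200 MFLOPs selects the 176x176 configuration, the largest tabulated resolution that fits.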

Citation

If you find our work useful in your research, please consider citing:

@InProceedings{Li_2020_ECCV,
author = {Li, Duo and Yao, Anbang and Chen, Qifeng},
title = {Learning to Learn Parameterized Classification Networks for Scalable Input Images},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {August},
year = {2020}
}
