V2.0.0 alpha (#4)
* Format with flake8

* Release pretrained model of usnets

* Create MODEL_ZOO.md

* Update README.md
JiahuiYu authored Mar 18, 2019
1 parent 74fc3ff commit f0b8dac
Showing 16 changed files with 601 additions and 92 deletions.
1 change: 1 addition & 0 deletions .gitignore
@@ -1,2 +1,3 @@
logs
data
.flake8
19 changes: 19 additions & 0 deletions MODEL_ZOO.md
@@ -0,0 +1,19 @@
# Slimmable Model Zoo

## Slimmable Neural Networks ([ICLR 2019](https://arxiv.org/abs/1812.08928))


| Model | Switches (Widths) | Top-1 Err. | MFLOPs | Model ID |
| :--- | :---: | :---: | ---: | :---: |
| S-MobileNet v1 | 1.00<br>0.75<br>0.50<br>0.25 | 28.5<br>30.5<br>35.2<br>46.9 | 569<br>325<br>150<br>41 | [a6285db](https://github.com/JiahuiYu/slimmable_networks/files/2709079/s_mobilenet_v1_0.25_0.5_0.75_1.0.pt.zip) |
| S-MobileNet v2 | 1.00<br>0.75<br>0.50<br>0.35 | 29.5<br>31.1<br>35.6<br>40.3 | 301<br>209<br>97<br>59 | [0593ffd](https://github.com/JiahuiYu/slimmable_networks/files/2709080/s_mobilenet_v2_0.35_0.5_0.75_1.0.pt.zip) |
| S-ShuffleNet | 2.00<br>1.00<br>0.50 | 28.6<br>34.5<br>42.8 | 524<br>138<br>38 | [1427f66](https://github.com/JiahuiYu/slimmable_networks/files/2709082/s_shufflenet_0.5_1.0_2.0.pt.zip) |
| S-ResNet-50 | 1.00<br>0.75<br>0.50<br>0.25 | 24.0<br>25.1<br>27.9<br>35.0 | 4100<br>2300<br>1100<br>278 | [3fca9cc](https://drive.google.com/open?id=1f6q37OkZaz_0GoOAwllHlXNWuKwor2fC) |


## Universally Slimmable Networks and Improved Training Techniques ([Preprint](https://arxiv.org/abs/1903.05134))

| Model | Widths | Top-1 Err. | MFLOPs | Model ID |
| :--- | :--- | :---: | ---: | :---: |
| US-MobileNet v1 | 1.0<br> 0.975<br> 0.95<br> 0.925<br> 0.9<br> 0.875<br> 0.85<br> 0.825<br> 0.8<br> 0.775<br> 0.75<br> 0.725<br> 0.7<br> 0.675<br> 0.65<br> 0.625<br> 0.6<br> 0.575<br> 0.55<br> 0.525<br> 0.5<br> 0.475<br> 0.45<br> 0.425<br> 0.4<br> 0.375<br> 0.35<br> 0.325<br> 0.3<br> 0.275<br> 0.25 | 28.2<br> 28.3<br> 28.4<br> 28.7<br> 28.7<br> 29.1<br> 29.4<br> 29.7<br> 30.2<br> 30.3<br> 30.5<br> 30.9<br> 31.2<br> 31.7<br> 32.2<br> 32.5<br> 33.2<br> 33.7<br> 34.4<br> 35.0<br> 35.8<br> 36.5<br> 37.3<br> 38.1<br> 39.0<br> 40.0<br> 41.0<br> 41.9<br> 42.7<br> 44.2<br> 44.3 | 568<br> 543<br> 517<br> 490<br> 466<br> 443<br> 421<br> 389<br> 366<br> 345<br> 325<br> 306<br> 287<br> 267<br> 249<br> 232<br> 217<br> 201<br> 177<br> 162<br> 149<br> 136<br> 124<br> 114<br> 100<br> 89<br> 80<br> 71<br> 64<br> 48<br> 41 | [13d5af2](https://github.com/JiahuiYu/slimmable_networks/files/2979952/us_mobilenet_v1_calibrated.pt.zip) |
| US-MobileNet v2 | 1.0<br> 0.975<br> 0.95<br> 0.925<br> 0.9<br> 0.875<br> 0.85<br> 0.825<br> 0.8<br> 0.775<br> 0.75<br> 0.725<br> 0.7<br> 0.675<br> 0.65<br> 0.625<br> 0.6<br> 0.575<br> 0.55<br> 0.525<br> 0.5<br> 0.475<br> 0.45<br> 0.425<br> 0.4<br> 0.375<br> 0.35 | 28.5<br> 28.5<br> 28.8<br> 28.9<br> 29.1<br> 29.1<br> 29.4<br> 29.9<br> 30.0<br> 30.2<br> 30.4<br> 30.7<br> 31.1<br> 31.4<br> 31.7<br> 31.7<br> 32.4<br> 32.4<br> 34.4<br> 34.6<br> 34.9<br> 35.1<br> 35.8<br> 35.8<br> 36.6<br> 36.7<br> 37.7<br> | 300<br> 299<br> 284<br> 274<br> 269<br> 268<br> 254<br> 235<br> 222<br> 213<br> 209<br> 185<br> 173<br> 165<br> 161<br> 161<br> 151<br> 150<br> 106<br> 100<br> 97<br> 96<br> 88<br> 88<br> 80<br> 80<br> 59 | [3880cad](https://github.com/JiahuiYu/slimmable_networks/files/2979953/us_mobilenet_v2_calibrated.pt.zip) |
34 changes: 22 additions & 12 deletions README.md
@@ -1,12 +1,25 @@
# Slimmable Neural Networks
# Slimmable Networks

[ICLR 2019 Paper](https://arxiv.org/abs/1812.08928) | [ArXiv](https://arxiv.org/abs/1812.08928) | [OpenReview](https://openreview.net/forum?id=H1gMCsAqY7) | [Detection](https://github.com/JiahuiYu/slimmable_networks/tree/detection) | [Model Zoo](#model-zoo) | [BibTex](#citing)
An open-source framework for slimmable training on ImageNet classification and COCO detection, on which numerous follow-up projects have been built.

## [Slimmable Neural Networks](https://arxiv.org/abs/1812.08928)

[ICLR 2019 Paper](https://arxiv.org/abs/1812.08928) | [OpenReview](https://openreview.net/forum?id=H1gMCsAqY7) | [Detection](https://github.com/JiahuiYu/slimmable_networks/tree/detection) | [Model Zoo](/MODEL_ZOO.md) | [BibTex](#citing)

<img src="https://user-images.githubusercontent.com/22609465/50390872-1b3fb600-0702-11e9-8034-d0f41825d775.png" width=95%/>

Illustration of slimmable neural networks. The same model can run at different widths (number of active channels), permitting instant and adaptive accuracy-efficiency trade-offs.


## [Universally Slimmable Networks and Improved Training Techniques](https://arxiv.org/abs/1903.05134)

[Preprint](https://arxiv.org/abs/1903.05134) | [Model Zoo](/MODEL_ZOO.md) | [BibTex](#citing)

<img src="https://user-images.githubusercontent.com/22609465/54562571-45b5ae00-4995-11e9-8984-49e32d07e325.png" width=95%/>

Illustration of universally slimmable networks. The same model can run at **arbitrary** widths.


## Run

0. Requirements:
@@ -22,16 +35,6 @@ Illustration of slimmable neural networks. The same model can run at different w
* If you still have questions, please search closed issues first. If the problem is not solved, please open a new issue.


## Model Zoo

| Model | Switches (Widths) | Top-1 Err. | MFLOPs | Model ID |
| :--- | :---: | :---: | ---: | :---: |
| S-MobileNet v1 | 1.00<br>0.75<br>0.50<br>0.25 | 28.5<br>30.5<br>35.2<br>46.9 | 569<br>325<br>150<br>41 | [a6285db](https://github.com/JiahuiYu/slimmable_networks/files/2709079/s_mobilenet_v1_0.25_0.5_0.75_1.0.pt.zip) |
| S-MobileNet v2 | 1.00<br>0.75<br>0.50<br>0.35 | 29.5<br>31.1<br>35.6<br>40.3 | 301<br>209<br>97<br>59 | [0593ffd](https://github.com/JiahuiYu/slimmable_networks/files/2709080/s_mobilenet_v2_0.35_0.5_0.75_1.0.pt.zip) |
| S-ShuffleNet | 2.00<br>1.00<br>0.50 | 28.6<br>34.5<br>42.8 | 524<br>138<br>38 | [1427f66](https://github.com/JiahuiYu/slimmable_networks/files/2709082/s_shufflenet_0.5_1.0_2.0.pt.zip) |
| S-ResNet-50 | 1.00<br>0.75<br>0.50<br>0.25 | 24.0<br>25.1<br>27.9<br>35.0 | 4.1G<br>2.3G<br>1.1G<br>278 | [3fca9cc](https://drive.google.com/open?id=1f6q37OkZaz_0GoOAwllHlXNWuKwor2fC) |


## Technical Details

Implementing slimmable networks and slimmable training is straightforward:
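The core mechanism — allocate parameters once at the largest width, then slice them to the currently active width — can be sketched in plain Python. This is a simplified, illustrative stand-in for the repo's actual PyTorch `SlimmableConv2d`/`SlimmableLinear` modules; the class name, list-based weights, and `width_idx` attribute here are assumptions for the sketch, not the repo's API:

```python
class SlimmableLinear:
    """Toy stand-in for a slimmable layer: one full-width weight
    matrix, sliced (not copied) to the active switch's width."""

    def __init__(self, in_features_list, out_features_list):
        self.in_features_list = in_features_list
        self.out_features_list = out_features_list
        # parameters are allocated once, at the largest width
        self.weight = [[0.1] * max(in_features_list)
                       for _ in range(max(out_features_list))]
        self.width_idx = 0  # which switch (width) is currently active

    def forward(self, x):
        in_f = self.in_features_list[self.width_idx]
        out_f = self.out_features_list[self.width_idx]
        # slicing shares parameters across all switches
        return [sum(w * v for w, v in zip(row[:in_f], x[:in_f]))
                for row in self.weight[:out_f]]


layer = SlimmableLinear([2, 4], [3, 6])
layer.width_idx = 0
print(len(layer.forward([1.0] * 4)))  # 3 outputs at the narrow switch
layer.width_idx = 1
print(len(layer.forward([1.0] * 4)))  # 6 outputs at the full switch
```

In the real implementation the same slicing is applied to convolution weights, and each switch gets its own batch-norm statistics (`SwitchableBatchNorm2d`), which is why switching widths at inference is instant.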
@@ -54,4 +57,11 @@ The software is for educational and academic research purposes only.
journal={arXiv preprint arXiv:1812.08928},
year={2018}
}
@article{yu2019universally,
title={Universally Slimmable Networks and Improved Training Techniques},
author={Yu, Jiahui and Huang, Thomas},
journal={arXiv preprint arXiv:1903.05134},
year={2019}
}
```
64 changes: 64 additions & 0 deletions apps/us_mobilenet_v1_val.yml
@@ -0,0 +1,64 @@
# =========================== Basic Settings ===========================
# machine info
num_gpus_per_job: 4 # number of gpus each job need
num_cpus_per_job: 63 # number of cpus each job need
memory_per_job: 380 # memory requirement each job need
gpu_type: "nvidia-tesla-p100"

# data
dataset: imagenet1k
data_transforms: imagenet1k_basic
data_loader: imagenet1k_basic
dataset_dir: data/imagenet
data_loader_workers: 62

# info
num_classes: 1000
image_size: 224
topk: [1, 5]
num_epochs: 100

# optimizer
optimizer: sgd
momentum: 0.9
weight_decay: 0.0001
nesterov: True

# lr
lr: 0.1
lr_scheduler: multistep
multistep_lr_milestones: [30, 60, 90]
multistep_lr_gamma: 0.1

# model profiling
profiling: [gpu]

# pretrain, resume, test_only
pretrained: ''
resume: ''
test_only: False

#
random_seed: 1995
batch_size: 256
model: ''
reset_parameters: True


# =========================== Override Settings ===========================
log_dir: logs/
slimmable_training: True
model: models.us_mobilenet_v1
width_mult: 1.0
width_mult_list: [0.25, 0.275, 0.3, 0.325, 0.35, 0.375, 0.4, 0.425, 0.45, 0.475, 0.5, 0.525, 0.55, 0.575, 0.6, 0.625, 0.65, 0.675, 0.7, 0.725, 0.75, 0.775, 0.8, 0.825, 0.85, 0.875, 0.9, 0.925, 0.95, 0.975, 1.0]
width_mult_range: [0.25, 1.0]
data_transforms: imagenet1k_mobile
# num_gpus_per_job:
# lr:
# lr_scheduler:
# exp_decaying_lr_gamma:
# num_epochs:
# batch_size:
# test pretrained
test_only: True
pretrained: logs/us_mobilenet_v1_calibrated.pt
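Note that the "Override Settings" block simply repeats keys defined earlier in the same file (`test_only`, `pretrained`, `model`, `data_transforms`). This relies on the config loader letting later duplicate keys win — PyYAML's `safe_load` historically behaves this way, though the YAML spec does not guarantee it. The effective pattern is a dict merge, sketched here in plain Python with key names taken from the file above (the loader behavior is an assumption):

```python
# "Basic Settings" defaults, as in the top of the YAML file
basic = {
    "test_only": False,
    "pretrained": "",
    "model": "",
    "data_transforms": "imagenet1k_basic",
}

# "Override Settings" repeats the same keys with new values
override = {
    "test_only": True,
    "pretrained": "logs/us_mobilenet_v1_calibrated.pt",
    "model": "models.us_mobilenet_v1",
    "data_transforms": "imagenet1k_mobile",
}

# later keys win, just like later duplicates in the YAML document
cfg = {**basic, **override}
print(cfg["test_only"])  # True: an evaluation-only run of the calibrated model
```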
64 changes: 64 additions & 0 deletions apps/us_mobilenet_v2_val.yml
@@ -0,0 +1,64 @@
# =========================== Basic Settings ===========================
# machine info
num_gpus_per_job: 4 # number of gpus each job need
num_cpus_per_job: 63 # number of cpus each job need
memory_per_job: 380 # memory requirement each job need
gpu_type: "nvidia-tesla-p100"

# data
dataset: imagenet1k
data_transforms: imagenet1k_basic
data_loader: imagenet1k_basic
dataset_dir: data/imagenet
data_loader_workers: 62

# info
num_classes: 1000
image_size: 224
topk: [1, 5]
num_epochs: 100

# optimizer
optimizer: sgd
momentum: 0.9
weight_decay: 0.0001
nesterov: True

# lr
lr: 0.1
lr_scheduler: multistep
multistep_lr_milestones: [30, 60, 90]
multistep_lr_gamma: 0.1

# model profiling
profiling: [gpu]

# pretrain, resume, test_only
pretrained: ''
resume: ''
test_only: False

#
random_seed: 1995
batch_size: 256
model: ''
reset_parameters: True


# =========================== Override Settings ===========================
log_dir: logs/
slimmable_training: True
model: models.us_mobilenet_v2
width_mult: 1.0
width_mult_list: [0.35, 0.375, 0.4, 0.425, 0.45, 0.475, 0.5, 0.525, 0.55, 0.575, 0.6, 0.625, 0.65, 0.675, 0.7, 0.725, 0.75, 0.775, 0.8, 0.825, 0.85, 0.875, 0.9, 0.925, 0.95, 0.975, 1.0]
width_mult_range: [0.35, 1.0]
data_transforms: imagenet1k_mobile
# num_gpus_per_job:
# lr:
# lr_scheduler:
# exp_decaying_lr_gamma:
# num_epochs:
# batch_size:
# test pretrained
test_only: True
pretrained: logs/us_mobilenet_v2_calibrated.pt
26 changes: 5 additions & 21 deletions models/s_mobilenet_v1.py
@@ -3,26 +3,10 @@


from .slimmable_ops import SwitchableBatchNorm2d
from .slimmable_ops import SlimmableConv2d, SlimmableLinear
from .slimmable_ops import SlimmableConv2d, SlimmableLinear, make_divisible
from utils.config import FLAGS


def _make_divisible(v, divisor=8, min_value=8):
"""
forked from slim:
https://github.com/tensorflow/models/blob/\
0344c5503ee55e24f0de7f37336a6e08f10976fd/\
research/slim/nets/mobilenet/mobilenet.py#L62-L69
"""
if min_value is None:
min_value = divisor
new_v = max(min_value, int(v + divisor / 2) // divisor * divisor)
# Make sure that round down does not go down by more than 10%.
if new_v < 0.9 * v:
new_v += divisor
return new_v
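This commit deduplicates the helper into `slimmable_ops.make_divisible`. Its behavior — round a channel count to the nearest multiple of `divisor`, clamp at `min_value`, and bump up one step if rounding lost more than 10% of the value — can be exercised standalone; the sample inputs below are illustrative, not from the repo:

```python
def make_divisible(v, divisor=8, min_value=8):
    # round to the nearest multiple of `divisor`, but never below `min_value`
    if min_value is None:
        min_value = divisor
    new_v = max(min_value, int(v + divisor / 2) // divisor * divisor)
    # make sure rounding down does not lose more than ~10%
    if new_v < 0.9 * v:
        new_v += divisor
    return new_v


print(make_divisible(32 * 0.75))   # 24: already a multiple of 8
print(make_divisible(32 * 0.275))  # 8.8 rounds down and is clamped to 8
print(make_divisible(11))          # 8 would lose >10% of 11, so bump to 16
```

This rounding keeps every switch's channel count hardware-friendly (divisible by 8), which is why the width lists in the configs map to slightly uneven MFLOPs steps.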


class DepthwiseSeparableConv(nn.Module):
def __init__(self, inp, outp, stride):
super(DepthwiseSeparableConv, self).__init__()
@@ -63,10 +47,10 @@ def __init__(self, num_classes=1000, input_size=224):
# head
assert input_size % 32 == 0
channels = [
_make_divisible(32 * width_mult)
make_divisible(32 * width_mult)
for width_mult in FLAGS.width_mult_list]
self.outp = [
_make_divisible(1024 * width_mult)
make_divisible(1024 * width_mult)
for width_mult in FLAGS.width_mult_list]
first_stride = 2
self.features.append(
@@ -81,7 +65,7 @@ def __init__(self, num_classes=1000, input_size=224):
# body
for c, n, s in self.block_setting:
outp = [
_make_divisible(c * width_mult)
make_divisible(c * width_mult)
for width_mult in FLAGS.width_mult_list]
for i in range(n):
if i == 0:
@@ -92,7 +76,7 @@ def __init__(self, num_classes=1000, input_size=224):
DepthwiseSeparableConv(channels, outp, 1))
channels = outp

avg_pool_size = input_size//32
avg_pool_size = input_size // 32
self.features.append(nn.AvgPool2d(avg_pool_size))

# make it nn.Sequential
27 changes: 6 additions & 21 deletions models/s_mobilenet_v2.py
@@ -3,25 +3,10 @@


from .slimmable_ops import SwitchableBatchNorm2d, SlimmableConv2d
from .slimmable_ops import make_divisible
from utils.config import FLAGS


def _make_divisible(v, divisor=8, min_value=1):
"""
forked from slim:
https://github.com/tensorflow/models/blob/\
0344c5503ee55e24f0de7f37336a6e08f10976fd/\
research/slim/nets/mobilenet/mobilenet.py#L62-L69
"""
if min_value is None:
min_value = divisor
new_v = max(min_value, int(v + divisor / 2) // divisor * divisor)
# Make sure that round down does not go down by more than 10%.
if new_v < 0.9 * v:
new_v += divisor
return new_v


class InvertedResidual(nn.Module):
def __init__(self, inp, outp, stride, expand_ratio):
super(InvertedResidual, self).__init__()
@@ -31,7 +16,7 @@ def __init__(self, inp, outp, stride, expand_ratio):

layers = []
# expand
expand_inp = [i*expand_ratio for i in inp]
expand_inp = [i * expand_ratio for i in inp]
if expand_ratio != 1:
layers += [
SlimmableConv2d(inp, expand_inp, 1, 1, 0, bias=False),
@@ -80,9 +65,9 @@ def __init__(self, num_classes=1000, input_size=224):
# head
assert input_size % 32 == 0
channels = [
_make_divisible(32 * width_mult)
make_divisible(32 * width_mult)
for width_mult in FLAGS.width_mult_list]
self.outp = _make_divisible(
self.outp = make_divisible(
1280 * max(FLAGS.width_mult_list)) if max(
FLAGS.width_mult_list) > 1.0 else 1280
first_stride = 2
@@ -98,7 +83,7 @@ def __init__(self, num_classes=1000, input_size=224):
# body
for t, c, n, s in self.block_setting:
outp = [
_make_divisible(c * width_mult)
make_divisible(c * width_mult)
for width_mult in FLAGS.width_mult_list]
for i in range(n):
if i == 0:
@@ -120,7 +105,7 @@ def __init__(self, num_classes=1000, input_size=224):
nn.ReLU6(inplace=True),
)
)
avg_pool_size = input_size//32
avg_pool_size = input_size // 32
self.features.append(nn.AvgPool2d(avg_pool_size))

# make it nn.Sequential
6 changes: 3 additions & 3 deletions models/s_resnet.py
@@ -12,7 +12,7 @@ def __init__(self, inp, outp, stride):
super(Block, self).__init__()
assert stride in [1, 2]

midp = [i//4 for i in outp]
midp = [i // 4 for i in outp]
layers = [
SlimmableConv2d(inp, midp, 1, 1, 0, bias=False),
SwitchableBatchNorm2d(midp),
@@ -79,7 +79,7 @@ def __init__(self, num_classes=1000, input_size=224):
# body
for stage_id, n in enumerate(self.block_setting):
outp = [
int(feats[stage_id]*width_mult*4)
int(feats[stage_id] * width_mult * 4)
for width_mult in FLAGS.width_mult_list]
for i in range(n):
if i == 0 and stage_id != 0:
@@ -88,7 +88,7 @@ def __init__(self, num_classes=1000, input_size=224):
self.features.append(Block(channels, outp, 1))
channels = outp

avg_pool_size = input_size//32
avg_pool_size = input_size // 32
self.features.append(nn.AvgPool2d(avg_pool_size))

# make it nn.Sequential