
Rethinking the Value of Network Pruning

This repository contains the code for reproducing the results of the following paper, along with trained ImageNet models:

Rethinking the Value of Network Pruning. [arXiv] [OpenReview]

Zhuang Liu*, Mingjie Sun*, Tinghui Zhou, Gao Huang, Trevor Darrell (* equal contribution).

ICLR 2019. Also Best Paper Award at NIPS 2018 Workshop on Compact Deep Neural Networks.

The implementations of several pruning methods in this repo can also be readily reused for other research purposes.

Paper Summary

Fig 1: A typical three-stage network pruning pipeline.

Our paper shows that for structured pruning, training the pruned model from scratch can almost always achieve a comparable or higher level of accuracy than the model obtained from the typical "training, pruning and fine-tuning" (Fig. 1) procedure. We conclude that for those pruning methods:

  1. Training a large, over-parameterized model is often not necessary to obtain an efficient final model.
  2. Learned “important” weights of the large model are typically not useful for the small pruned model.
  3. The pruned architecture itself, rather than a set of inherited “important” weights, is more crucial to the efficiency of the final model, which suggests that in some cases pruning can be useful as an architecture search paradigm.

Our results suggest the need for more careful baseline evaluations in future research on structured pruning methods.
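The contrast between fine-tuning and training from scratch can be illustrated with a toy sketch. It is not the repository's code: the nested-list "model", the helper names (`prune_channels`, `architecture_of`, `random_init`) and the fixed `fan_in` are all assumptions made for illustration, and no training is performed. It only shows what each approach starts from: fine-tuning inherits the surviving weights, while training from scratch keeps only the pruned architecture.

```python
import random

def prune_channels(layers, keep_counts):
    """Structured pruning: keep the highest-L1-norm channels in each layer."""
    pruned = []
    for channels, keep in zip(layers, keep_counts):
        norms = [sum(abs(v) for v in ch) for ch in channels]
        top = sorted(range(len(channels)), key=lambda i: norms[i], reverse=True)[:keep]
        pruned.append([channels[i] for i in sorted(top)])  # preserve channel order
    return pruned

def architecture_of(layers):
    """In this toy setting the architecture is the channel count per layer."""
    return [len(channels) for channels in layers]

def random_init(arch, fan_in, rng):
    """Fresh random weights for the given per-layer channel counts.
    Simplification (assumption): every channel has the same fan_in."""
    return [[[rng.gauss(0.0, 0.1) for _ in range(fan_in)] for _ in range(n)]
            for n in arch]

rng = random.Random(0)
large = random_init([8, 16], fan_in=4, rng=rng)     # stands in for a trained large model
pruned = prune_channels(large, keep_counts=[4, 8])  # the pruning stage

fine_tune_start = pruned                            # inherits the "important" weights
scratch_start = random_init(architecture_of(pruned), fan_in=4, rng=rng)  # keeps only the shape

print(architecture_of(fine_tune_start))  # [4, 8]
print(architecture_of(scratch_start))    # [4, 8]
```

Both starting points have the same architecture; they differ only in whether the weights come from the large model or from a new random initialization.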

Fig 2: Difference between predefined and automatically discovered target architectures, in channel pruning. The pruning ratio x is user-specified, while a, b, c, d are determined by the pruning algorithm. Unstructured sparse pruning can also be viewed as automatic. Our finding has different implications for predefined and automatic methods: for a predefined method, it is possible to skip the traditional "training, pruning and fine-tuning" pipeline and directly train the pruned model; for automatic methods, the pruning can be seen as a form of architecture learning.
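The distinction in Fig. 2 can be sketched with a small example. The importance scores below are made-up numbers, and the two helper functions are hypothetical names rather than anything from this repository. The predefined variant removes the same fraction x of channels in every layer; the automatic variant applies one network-wide threshold, so the per-layer channel counts fall out of the scores.

```python
def predefined_keep(layer_scores, ratio):
    """Prune the same user-specified fraction `ratio` of channels in every layer."""
    return [len(scores) - int(len(scores) * ratio) for scores in layer_scores]

def automatic_keep(layer_scores, ratio):
    """Prune the fraction `ratio` of channels network-wide with a single global
    threshold; how many channels each layer keeps is decided by the scores."""
    all_scores = sorted(s for scores in layer_scores for s in scores)
    n_prune = int(len(all_scores) * ratio)
    threshold = all_scores[n_prune]  # channels scoring below this are pruned
    return [sum(1 for s in scores if s >= threshold) for scores in layer_scores]

# Hypothetical per-channel importance scores for a three-layer network.
scores = [
    [0.9, 0.8, 0.7, 0.45],    # layer 1
    [0.5, 0.1, 0.2, 0.95],    # layer 2
    [0.05, 0.3, 0.15, 0.65],  # layer 3
]

print(predefined_keep(scores, 0.5))  # [2, 2, 2]
print(automatic_keep(scores, 0.5))   # [3, 2, 1]
```

Both variants keep six channels in total, but only the automatic one produces a non-uniform architecture, which is why it can be read as a form of architecture learning.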


We also compare with the "Lottery Ticket Hypothesis" (Frankle & Carbin 2019), and find that with an optimal learning rate, the "winning ticket" initialization as used in Frankle & Carbin (2019) does not bring improvement over random initialization. For more details please refer to our paper.
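The two initializations being compared can be sketched as follows. This is a toy illustration under stated assumptions, not the code in cifar/lottery-ticket: the weights are a flat list, "training" is replaced by a hypothetical random perturbation, and the helper names are invented. The point is only that both candidates share the same sparsity mask and differ in where the surviving weights start.

```python
import random

def magnitude_mask(weights, sparsity):
    """Mask that removes the `sparsity` fraction of weights with the smallest magnitude."""
    n_prune = int(len(weights) * sparsity)
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = set(order[:n_prune])
    return [0 if i in pruned else 1 for i in range(len(weights))]

def apply_mask(weights, mask):
    """Zero out the masked positions."""
    return [w * m for w, m in zip(weights, mask)]

rng = random.Random(0)
original_init = [rng.gauss(0.0, 1.0) for _ in range(10)]
# Stand-in for training (assumption): a random perturbation, not a real optimizer.
trained = [w + rng.gauss(0.0, 0.5) for w in original_init]

mask = magnitude_mask(trained, sparsity=0.6)

# "Winning ticket": surviving weights are reset to their original initial values.
winning_ticket = apply_mask(original_init, mask)
# Random initialization: same mask, freshly drawn initial values.
random_reinit = apply_mask([rng.gauss(0.0, 1.0) for _ in range(10)], mask)

print(sum(mask))  # 4 surviving weights out of 10
```

Each sparse network would then be trained with the mask held fixed, and the resulting accuracies compared.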

Implementation

We evaluated the following seven pruning methods.

  1. L1-norm based channel pruning
  2. ThiNet
  3. Regression based feature reconstruction
  4. Network Slimming
  5. Sparse Structure Selection
  6. Soft filter pruning
  7. Unstructured weight-level pruning

The first six are structured while the last one is unstructured (or sparse). For CIFAR, our code is based on pytorch-classification and network-slimming. For ImageNet, we use the official PyTorch ImageNet training code. The instructions and models are in each subfolder.
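To make the structured versus unstructured distinction concrete, here is a minimal sketch of the first and last methods in the list, using plain Python lists instead of PyTorch tensors. The function names and the example weights are assumptions for illustration and do not mirror the code in the subfolders.

```python
def l1_channel_prune(filters, n_keep):
    """Structured: remove whole filters, keeping the n_keep with the largest L1 norm."""
    norms = [sum(abs(w) for w in f) for f in filters]
    top = sorted(range(len(filters)), key=lambda i: norms[i], reverse=True)[:n_keep]
    return [filters[i] for i in sorted(top)]  # preserve the original filter order

def weight_level_prune(filters, n_keep):
    """Unstructured: zero individual weights, keeping the n_keep largest magnitudes.
    Ties at the threshold are all kept, so more than n_keep weights may survive."""
    magnitudes = sorted((abs(w) for f in filters for w in f), reverse=True)
    threshold = magnitudes[n_keep - 1]
    return [[w if abs(w) >= threshold else 0.0 for w in f] for f in filters]

# Hypothetical layer with four filters of three weights each.
filters = [
    [0.9, -0.1, 0.2],
    [0.05, 0.1, -0.05],
    [-0.6, 0.7, 0.3],
    [0.4, 0.0, -0.15],
]

print(l1_channel_prune(filters, 2))
# [[0.9, -0.1, 0.2], [-0.6, 0.7, 0.3]]
print(weight_level_prune(filters, 6))
# [[0.9, 0.0, 0.2], [0.0, 0.0, 0.0], [-0.6, 0.7, 0.3], [0.4, 0.0, 0.0]]
```

The structured result is a smaller dense layer (two filters), whereas the unstructured result keeps the original shape and becomes sparse.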

For experiments on The Lottery Ticket Hypothesis, please refer to the folder cifar/lottery-ticket.

Our experiment environment is Python 3.6 & PyTorch 0.3.1.

Contact

Feel free to discuss papers/code with us through issues/emails!

sunmj15 at gmail.com
liuzhuangthu at gmail.com

Citation

If you use our code in your research, please cite:

@inproceedings{liu2018rethinking,
  title={Rethinking the Value of Network Pruning},
  author={Liu, Zhuang and Sun, Mingjie and Zhou, Tinghui and Huang, Gao and Darrell, Trevor},
  booktitle={ICLR},
  year={2019}
}
MIT License

Copyright (c) 2018 Mingjie Sun

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
