---
datasets:
- mnli
tags:
- distilbart
- distilbart-mnli
pipeline_tag: zero-shot-classification
---

# DistilBart-MNLI

distilbart-mnli is the distilled version of bart-large-mnli, created using the No Teacher Distillation technique that Hugging Face proposed for BART summarization.

We simply copy alternating layers from bart-large-mnli and fine-tune further on the same data.
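The layer-copying step can be sketched as follows. This is a minimal illustration: `pick_layers_to_copy` and the even-spacing heuristic are assumptions made for this sketch; the actual distillation code selects teacher layer indices from a fixed lookup table.

```python
def pick_layers_to_copy(n_student: int, n_teacher: int = 12) -> list[int]:
    """Choose roughly evenly spaced teacher layer indices for the student."""
    step = n_teacher / n_student
    return [round(i * step) for i in range(n_student)]

# A 12-9 student keeps the full 12-layer encoder and copies
# 9 of the teacher's 12 decoder layers.
teacher_decoder = [f"decoder.layer_{i}.weights" for i in range(12)]
student_decoder = [teacher_decoder[j] for j in pick_layers_to_copy(9)]
print(len(student_decoder))  # 9
print(student_decoder[0])    # decoder.layer_0.weights
```

The copied layers then serve as the student's initialization before fine-tuning on MNLI.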

| model | matched acc | mismatched acc |
|---|---|---|
| bart-large-mnli (baseline, 12-12) | 89.9 | 90.01 |
| distilbart-mnli-12-1 | 87.08 | 87.5 |
| distilbart-mnli-12-3 | 88.1 | 88.19 |
| distilbart-mnli-12-6 | 89.19 | 89.01 |
| distilbart-mnli-12-9 | 89.56 | 89.52 |

This is a simple yet effective technique: as the table shows, the drop in accuracy is very small.

Detailed performance trade-offs will be posted in this sheet.

## Fine-tuning

If you want to train these models yourself, clone the distillbart-mnli repo and follow the steps below.

Clone and install transformers from source:

```shell
git clone https://github.com/huggingface/transformers.git
pip install -qqq -U ./transformers
```

Download MNLI data:

```shell
python transformers/utils/download_glue_data.py --data_dir glue_data --tasks MNLI
```
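Once downloaded, the MNLI splits are tab-separated files under `glue_data/MNLI/`. A quick sanity check might look like the sketch below; the sample rows are made up, and the exact column layout of the GLUE TSVs (they contain more columns than shown) should be verified against your local files.

```python
import csv
import io

# Made-up two-row sample in the MNLI TSV shape; real files such as
# glue_data/MNLI/train.tsv use the same sentence1/sentence2/gold_label fields.
sample_tsv = (
    "sentence1\tsentence2\tgold_label\n"
    "A man is playing a guitar.\tA person plays music.\tentailment\n"
    "A man is playing a guitar.\tNobody is making any sound.\tcontradiction\n"
)

rows = list(csv.DictReader(io.StringIO(sample_tsv), delimiter="\t"))
labels = [row["gold_label"] for row in rows]
print(labels)  # ['entailment', 'contradiction']
```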

Create the student model:

```shell
python create_student.py \
  --teacher_model_name_or_path facebook/bart-large-mnli \
  --student_encoder_layers 12 \
  --student_decoder_layers 6 \
  --save_path student-bart-mnli-12-6
```

Start fine-tuning:

```shell
python run_glue.py args.json
```
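The `args.json` file is a JSON dump of the script's training arguments. A hypothetical example is shown below; the field names follow the HfArgumentParser-style arguments that `run_glue.py` accepted at the time, and the values are placeholders, so double-check both against your transformers version.

```json
{
  "model_name_or_path": "student-bart-mnli-12-6",
  "task_name": "mnli",
  "data_dir": "glue_data/MNLI",
  "output_dir": "distilbart-mnli-12-6",
  "do_train": true,
  "do_eval": true,
  "max_seq_length": 128,
  "per_device_train_batch_size": 32,
  "learning_rate": 3e-5,
  "num_train_epochs": 3
}
```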

You can find the logs of these trained models in this wandb project.
