language: en
thumbnail: https://repository-images.githubusercontent.com/401779782/c2f46be5-b74b-4620-ad64-57487be3b1ab
tags: text2sql
widget: How many singers do we have? | concert_singer | stadium : stadium_id, location, name, capacity, highest, lowest, average | singer : singer_id, name, country, song_name, song_release_year, age, is_male | concert : concert_id, concert_name, theme, stadium_id, year | singer_in_concert : concert_id, singer_id
license: apache-2.0
datasets: spider
metrics: spider

tscholak/1wnr382e

Fine-tuned weights for PICARD - Parsing Incrementally for Constrained Auto-Regressive Decoding from Language Models based on T5-Large.

Training Data

The model has been fine-tuned on the 7000 training examples in the Spider text-to-SQL dataset. It targets Spider's zero-shot text-to-SQL translation task, which requires generalizing to SQL databases that were not seen during training.

Training Objective

This model was initialized with T5-Large and fine-tuned with the text-to-text generation objective.

Questions are always grounded in a database schema, and the model is trained to predict the SQL query that would be used to answer the question. The input to the model is composed of the user's natural language question, the database identifier, and a list of tables and their columns:

[question] | [db_id] | [table] : [column] ( [content] , [content] ) , [column] ( ... ) , [...] | [table] : ... | ...
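
As an illustration of the input format above, here is a minimal Python sketch that builds such a string from a question, a database identifier, and a schema. The function name `serialize_input` and its parameters are hypothetical and not part of the official PICARD code. The separators are an assumption taken from the widget example in the model card metadata (columns joined with `, `, content values joined with ` , `); the official preprocessing may differ in details such as lowercasing or how content values are selected.

```python
from typing import Dict, List, Optional, Sequence


def serialize_input(
    question: str,
    db_id: str,
    schema: Dict[str, Sequence[str]],
    contents: Optional[Dict[str, Dict[str, List[str]]]] = None,
) -> str:
    """Build "[question] | [db_id] | [table] : [column] ( [content] ) , ... | ...".

    `schema` maps table names to column names. `contents` optionally maps
    table -> column -> matched database values (hypothetical structure).
    """
    contents = contents or {}
    table_parts = []
    for table, columns in schema.items():
        column_parts = []
        for column in columns:
            values = contents.get(table, {}).get(column, [])
            if values:
                # Column with content: "column ( value , value )"
                column_parts.append(f"{column} ( {' , '.join(values)} )")
            else:
                column_parts.append(column)
        table_parts.append(f"{table} : {', '.join(column_parts)}")
    return " | ".join([question, db_id, *table_parts])


# Schema of the concert_singer database, as listed in the widget example.
concert_singer_schema = {
    "stadium": ["stadium_id", "location", "name", "capacity",
                "highest", "lowest", "average"],
    "singer": ["singer_id", "name", "country", "song_name",
               "song_release_year", "age", "is_male"],
    "concert": ["concert_id", "concert_name", "theme", "stadium_id", "year"],
    "singer_in_concert": ["concert_id", "singer_id"],
}

model_input = serialize_input(
    "How many singers do we have?", "concert_singer", concert_singer_schema
)
print(model_input)
```

With no content values supplied, the printed string matches the widget example from the metadata.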

The model outputs the database identifier and the SQL query that will be executed on the database to answer the user's question:

[db_id] | [sql]
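
Splitting a prediction of this form back into its two parts can be sketched as follows. The helper `parse_prediction` is hypothetical. It splits on the first `|` only, on the assumption that the database identifier itself never contains that character, so that a `||` operator inside the SQL is left intact.

```python
from typing import Tuple


def parse_prediction(prediction: str) -> Tuple[str, str]:
    """Split "[db_id] | [sql]" into (db_id, sql)."""
    db_id, separator, sql = prediction.partition("|")
    if not separator:
        raise ValueError(f"expected '[db_id] | [sql]', got: {prediction!r}")
    return db_id.strip(), sql.strip()


db_id, sql = parse_prediction("concert_singer | select count(*) from singer")
print(db_id)
print(sql)
```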

Performance

Out of the box, this model achieves 65.3 % exact-set match accuracy and 67.2 % execution accuracy on the Spider development set.

Using the PICARD constrained decoding method (see the official PICARD implementation), the model's performance can be improved to 69.1 % exact-set match accuracy and 72.9 % execution accuracy on the Spider development set.
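
The two metrics measure different things: exact-set match compares the structure of the predicted query with that of the reference query, while execution accuracy compares the results the two queries return when run against the database. The following is a simplified sketch of the execution-based idea using Python's built-in `sqlite3` module. It is not the official Spider evaluation script: the function `execution_match`, the toy table, and the order-insensitive row comparison are all assumptions made for illustration.

```python
import sqlite3
from collections import Counter


def execution_match(conn: sqlite3.Connection, predicted_sql: str, gold_sql: str) -> bool:
    """Return True if both queries yield the same multiset of rows."""
    try:
        predicted_rows = conn.execute(predicted_sql).fetchall()
    except sqlite3.Error:
        # A prediction that fails to execute counts as a miss.
        return False
    gold_rows = conn.execute(gold_sql).fetchall()
    # Compare rows while ignoring their order (a simplification).
    return Counter(predicted_rows) == Counter(gold_rows)


# Toy in-memory database standing in for a Spider database.
conn = sqlite3.connect(":memory:")
conn.execute("create table singer (singer_id integer, name text)")
conn.executemany("insert into singer values (?, ?)", [(1, "A"), (2, "B")])

# Differently written queries can still agree on their results.
same = execution_match(
    conn, "select count(*) from singer", "select count(singer_id) from singer"
)
different = execution_match(
    conn, "select name from singer", "select count(*) from singer"
)
invalid = execution_match(
    conn, "select missing_column from singer", "select count(*) from singer"
)
print(same, different, invalid)
```

This also shows why the two numbers can diverge: a prediction can differ from the reference query in form and still return the same rows.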

Usage

Please see the official repository for scripts and docker images that support evaluation and serving of this model.

References

  1. PICARD - Parsing Incrementally for Constrained Auto-Regressive Decoding from Language Models

  2. Official PICARD code

Citation

@inproceedings{Scholak2021:PICARD,
  author = {Torsten Scholak and Nathan Schucher and Dzmitry Bahdanau},
  title = "{PICARD}: Parsing Incrementally for Constrained Auto-Regressive Decoding from Language Models",
  booktitle = "Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing",
  month = nov,
  year = "2021",
  publisher = "Association for Computational Linguistics",
  url = "https://aclanthology.org/2021.emnlp-main.779",
  pages = "9895--9901",
}
