1 Star 0 Fork 0

雨落辟湖 / bert4keras

加入 Gitee
与超过 1200万 开发者一起发现、参与优秀开源项目,私有仓库也完全免费 :)
免费加入
克隆/下载
贡献代码
同步代码
取消
提示: 由于 Git 不支持空文件夾,创建文件夹后会生成空的 .keep 文件
Loading...
README
Apache-2.0

bert4keras

说明

这是笔者重新实现的keras版的transformer模型库,致力于用尽可能清爽的代码来实现结合transformer和keras。

本项目的初衷是为了修改、定制上的方便,所以可能会频繁更新。

因此欢迎star,但不建议fork,因为你fork下来的版本可能很快就过期了。

功能

目前已经实现:

  • 加载bert/roberta/albert的预训练权重进行finetune;
  • 实现语言模型、seq2seq所需要的attention mask;
  • 丰富的examples
  • 从零预训练代码(支持TPU、多GPU,请看pretraining);
  • 兼容keras、tf.keras

使用

安装稳定版:

pip install bert4keras

安装最新版:

pip install git+https://www.github.com/bojone/bert4keras.git

使用例子请参考examples目录。

之前基于keras-bert给出的例子,仍适用于本项目,只需要将bert_model的加载方式换成本项目的。

理论上兼容Python2和Python3,兼容tensorflow 1.14+和tensorflow 2.x,实验环境是Python 2.7、Tesorflow 1.14+以及Keras 2.3.1(已经在2.2.4、2.3.0、2.3.1、tf.keras下测试通过)。

为了获得最好的体验,建议你使用Tensorflow 1.14 + Keras 2.3.1组合。

关于环境组合
  • 支持tf+keras和tf+tf.keras,后者需要提前传入环境变量TF_KERAS=1。

  • 当使用tf+keras时,建议2.2.4 <= keras <= 2.3.1,以及 1.14 <= tf <= 2.2,不能使用tf 2.3+。

  • keras 2.4+可以用,但事实上keras 2.4.x基本上已经完全等价于tf.keras了,因此如果你要用keras 2.4+,倒不如直接用tf.keras。

当然,乐于贡献的朋友如果发现了某些bug的话,也欢迎指出修正甚至Pull Requests~

权重

目前支持加载的权重:

注意事项

  • 注1:brightmart版albert的开源时间早于Google版albert,这导致早期brightmart版albert的权重与Google版的不完全一致,换言之两者不能直接相互替换。为了减少代码冗余,bert4keras的0.2.4及后续版本均只支持加载Google版以brightmart版中带Google字眼的权重。如果要加载早期版本的权重,请用0.2.3版本,或者考虑作者转换过的albert_zh
  • 注2:下载下来的ELECTRA权重,如果没有json配置文件的话,参考这里自己改一个(需要加上type_vocab_size字段)。

更新

  • 2021.01.30: 发布0.9.9版,完善多GPU支持,增加多GPU例子:task_seq2seq_autotitle_multigpu.py
  • 2020.12.29: 增加residual_attention_scores参数来实现RealFormer,只需要在build_transformer_model中传入参数residual_attention_scores=True启用。
  • 2020.12.04: PositionEmbedding引入层次分解,可以让BERT直接处理超长文本,在build_transformer_model中传入参数hierarchical_position=True启用。
  • 2020.11.19: 支持GPT2模型,参考CPM_LM_bert4keras项目。
  • 2020.11.14: 新增分参数学习率extend_with_parameter_wise_lr,可用于给每层设置不同的学习率。
  • 2020.10.27: 支持T5.1.1Multilingual T5
  • 2020.08.28: 支持GPT_OpenAI
  • 2020.08.22: 新增WebServing类,允许简单地将模型转换为Web接口,详情请参考该类的说明
  • 2020.07.14: Transformer类加入prefix参数;snippets.py引入to_array函数;AutoRegressiveDecoder修改rtype='logits'时的一个隐藏bug。
  • 2020.06.06: 强迫症作祟:将Tokenizer原来的max_length参数重命名为maxlen,同时保留向后兼容性,建议大家用新参数名。
  • 2020.04.29: 增加重计算(参考keras_recompute),可以通过时间换空间,通过设置环境变量RECOMPUTE=1启用。
  • 2020.04.25: 优化tf2下的表现。
  • 2020.04.16: 所有example均适配tensorflow 2.0。
  • 2020.04.06: 增加UniLM预训练模式(测试中)。
  • 2020.04.06: 完善rematch方法。
  • 2020.04.01: Tokenizer增加rematch方法,给出分词结果与原序列的映射关系。
  • 2020.03.30: 尽量统一py文件的写法。
  • 2020.03.25: 支持ELECTRA。
  • 2020.03.24: 继续加强DataGenerator,允许传入迭代器时进行局部shuffle。
  • 2020.03.23: 增加调整Attention的key_size的选项。
  • 2020.03.17: 增强DataGenerator;优化模型写法。
  • 2020.03.15: 支持GPT2_ML
  • 2020.03.10: 支持Google的T5模型。
  • 2020.03.05: 将tokenizer.py更名为tokenizers.py
  • 2020.03.05: application='seq2seq'改名为application='unilm'
  • 2020.03.05: build_bert_model更名为build_transformer_model
  • 2020.03.05: 重写models.py结构。
  • 2020.03.04: 将bert.py更名为models.py
  • 2020.03.02: 重构mask机制(用回Keras自带的mask机制),以便更好地编写更复杂的应用。
  • 2020.02.22: 新增AutoRegressiveDecoder类,统一处理Seq2Seq的解码问题。
  • 2020.02.19: transformer block的前缀改为Transformer(本来是Encoder),使得其含义局限性更少。
  • 2020.02.13: 优化load_vocab函数;将build_bert_model中的keep_words参数更名为keep_tokens,此处改动可能会对部分脚本产生影响。
  • 2020.01.18: 调整文本处理方式,去掉codecs的使用。
  • 2020.01.17: 各api日趋稳定,为了方便大家使用,打包到pypi,首个打包版本号为0.4.6。
  • 2020.01.10: 重写模型mask方案,某种程度上让代码更为简练清晰;后端优化。
  • 2019.12.27: 重构预训练代码,减少冗余;目前支持RoBERTa和GPT两种预训练方式,详见pretraining
  • 2019.12.17: 适配华为的nezha权重,只需要在build_bert_model函数里加上model='nezha';此外原来albert的加载方式albert=True改为model='albert'
  • 2019.12.16: 通过跟keras 2.3+版本类似的思路给低版本引入层中层功能,从而恢复对低于2.3.0版本的keras的支持。
  • 2019.12.14: 新增Conditional Layer Normalization及相关demo。
  • 2019.12.09: 各example的data_generator规范化;修复application='lm'时的一个错误。
  • 2019.12.05: 优化tokenizer的do_lower_case,同时微调各个example。
  • 2019.11.23: 将train.py重命名为optimizers.py,更新大量优化器实现,全面兼容keras和tf.keras。
  • 2019.11.19: 将utils.py重命名为tokenizer.py。
  • 2019.11.19: 想来想去,最后还是决定把snippets放到bert4keras.snippets下面去好了。
  • 2019.11.18: 优化预训练权重加载逻辑,增加保存模型权重至Bert的checkpoint格式方法。
  • 2019.11.17: 分离一些与Bert本身不直接相关的常用代码片段到python_snippets,供其它项目共用。
  • 2019.11.11: 添加NSP部分。
  • 2019.11.05: 适配google版albert,不再支持非Google版albert_zh
  • 2019.11.05: 以RoBERTa为例子的预训练代码开发完毕,同时支持TPU/多GPU训练,详见roberta。欢迎在此基础上构建更多的预训练代码。
  • 2019.11.01: 逐步增加预训练相关代码,详见pretraining
  • 2019.10.28: 支持使用基于sentencepiece的tokenizer。
  • 2019.10.25: 引入原生tokenizer。
  • 2019.10.22: 引入梯度累积优化器。
  • 2019.10.21: 为了简化代码结构,决定放弃keras 2.3.0之前的版本的支持,目前只支持keras 2.3.0+以及tf.keras。
  • 2019.10.20: 应网友要求,现支持直接用model.save保存模型结构,用load_model加载整个模型(只需要在load_model之前执行from bert4keras.layers import *,不需要额外写custom_objects)。
  • 2019.10.09: 已兼容tf.keras,同时在tf 1.13和tf 2.0下的tf.keras测试通过,通过设置环境变量TF_KERAS=1来切换tf.keras。
  • 2019.10.09: 已兼容Keras 2.3.x,但只是临时方案,后续可能直接移除掉2.3之前版本的支持。
  • 2019.10.02: 适配albert,能成功加载albert_zh的权重,只需要在load_pretrained_model函数里加上albert=True

背景

之前一直用CyberZHG大佬的keras-bert,如果纯粹只是为了在keras下对bert进行调用和fine tune来说,keras-bert已经足够能让人满意了。

然而,如果想要在加载官方预训练权重的基础上,对bert的内部结构进行修改,那么keras-bert就比较难满足我们的需求了,因为keras-bert为了代码的复用性,几乎将每个小模块都封装为了一个单独的库,比如keras-bert依赖于keras-transformer,而keras-transformer依赖于keras-multi-head,keras-multi-head依赖于keras-self-attention,这样一重重依赖下去,改起来就相当头疼了。

所以,我决定重新写一个keras版的bert,争取在几个文件内把它完整地实现出来,减少这些依赖性,并且保留可以加载官方预训练权重的特性。

鸣谢

感谢CyberZHG大佬实现的keras-bert,本实现有不少地方参考了keras-bert的源码,在此衷心感谢大佬的无私奉献。

引用

@misc{bert4keras,
  title={bert4keras},
  author={Jianlin Su},
  year={2020},
  howpublished={\url{https://bert4keras.spaces.ac.cn}},
}

交流

QQ交流群:808623966,微信群请加机器人微信号spaces_ac_cn

Apache License Version 2.0, January 2004 http://www.apache.org/licenses/ TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION 1. Definitions. "License" shall mean the terms and conditions for use, reproduction, and distribution as defined by Sections 1 through 9 of this document. "Licensor" shall mean the copyright owner or entity authorized by the copyright owner that is granting the License. "Legal Entity" shall mean the union of the acting entity and all other entities that control, are controlled by, or are under common control with that entity. For the purposes of this definition, "control" means (i) the power, direct or indirect, to cause the direction or management of such entity, whether by contract or otherwise, or (ii) ownership of fifty percent (50%) or more of the outstanding shares, or (iii) beneficial ownership of such entity. "You" (or "Your") shall mean an individual or Legal Entity exercising permissions granted by this License. "Source" form shall mean the preferred form for making modifications, including but not limited to software source code, documentation source, and configuration files. "Object" form shall mean any form resulting from mechanical transformation or translation of a Source form, including but not limited to compiled object code, generated documentation, and conversions to other media types. "Work" shall mean the work of authorship, whether in Source or Object form, made available under the License, as indicated by a copyright notice that is included in or attached to the work (an example is provided in the Appendix below). "Derivative Works" shall mean any work, whether in Source or Object form, that is based on (or derived from) the Work and for which the editorial revisions, annotations, elaborations, or other modifications represent, as a whole, an original work of authorship. For the purposes of this License, Derivative Works shall not include works that remain separable from, or merely link (or bind by name) to the interfaces of, the Work and Derivative Works thereof. "Contribution" shall mean any work of authorship, including the original version of the Work and any modifications or additions to that Work or Derivative Works thereof, that is intentionally submitted to Licensor for inclusion in the Work by the copyright owner or by an individual or Legal Entity authorized to submit on behalf of the copyright owner. For the purposes of this definition, "submitted" means any form of electronic, verbal, or written communication sent to the Licensor or its representatives, including but not limited to communication on electronic mailing lists, source code control systems, and issue tracking systems that are managed by, or on behalf of, the Licensor for the purpose of discussing and improving the Work, but excluding communication that is conspicuously marked or otherwise designated in writing by the copyright owner as "Not a Contribution." "Contributor" shall mean Licensor and any individual or Legal Entity on behalf of whom a Contribution has been received by Licensor and subsequently incorporated within the Work. 2. Grant of Copyright License. Subject to the terms and conditions of this License, each Contributor hereby grants to You a perpetual, worldwide, non-exclusive, no-charge, royalty-free, irrevocable copyright license to reproduce, prepare Derivative Works of, publicly display, publicly perform, sublicense, and distribute the Work and such Derivative Works in Source or Object form. 3. Grant of Patent License. Subject to the terms and conditions of this License, each Contributor hereby grants to You a perpetual, worldwide, non-exclusive, no-charge, royalty-free, irrevocable (except as stated in this section) patent license to make, have made, use, offer to sell, sell, import, and otherwise transfer the Work, where such license applies only to those patent claims licensable by such Contributor that are necessarily infringed by their Contribution(s) alone or by combination of their Contribution(s) with the Work to which such Contribution(s) was submitted. If You institute patent litigation against any entity (including a cross-claim or counterclaim in a lawsuit) alleging that the Work or a Contribution incorporated within the Work constitutes direct or contributory patent infringement, then any patent licenses granted to You under this License for that Work shall terminate as of the date such litigation is filed. 4. Redistribution. You may reproduce and distribute copies of the Work or Derivative Works thereof in any medium, with or without modifications, and in Source or Object form, provided that You meet the following conditions: (a) You must give any other recipients of the Work or Derivative Works a copy of this License; and (b) You must cause any modified files to carry prominent notices stating that You changed the files; and (c) You must retain, in the Source form of any Derivative Works that You distribute, all copyright, patent, trademark, and attribution notices from the Source form of the Work, excluding those notices that do not pertain to any part of the Derivative Works; and (d) If the Work includes a "NOTICE" text file as part of its distribution, then any Derivative Works that You distribute must include a readable copy of the attribution notices contained within such NOTICE file, excluding those notices that do not pertain to any part of the Derivative Works, in at least one of the following places: within a NOTICE text file distributed as part of the Derivative Works; within the Source form or documentation, if provided along with the Derivative Works; or, within a display generated by the Derivative Works, if and wherever such third-party notices normally appear. The contents of the NOTICE file are for informational purposes only and do not modify the License. You may add Your own attribution notices within Derivative Works that You distribute, alongside or as an addendum to the NOTICE text from the Work, provided that such additional attribution notices cannot be construed as modifying the License. You may add Your own copyright statement to Your modifications and may provide additional or different license terms and conditions for use, reproduction, or distribution of Your modifications, or for any such Derivative Works as a whole, provided Your use, reproduction, and distribution of the Work otherwise complies with the conditions stated in this License. 5. Submission of Contributions. Unless You explicitly state otherwise, any Contribution intentionally submitted for inclusion in the Work by You to the Licensor shall be under the terms and conditions of this License, without any additional terms or conditions. Notwithstanding the above, nothing herein shall supersede or modify the terms of any separate license agreement you may have executed with Licensor regarding such Contributions. 6. Trademarks. This License does not grant permission to use the trade names, trademarks, service marks, or product names of the Licensor, except as required for reasonable and customary use in describing the origin of the Work and reproducing the content of the NOTICE file. 7. Disclaimer of Warranty. Unless required by applicable law or agreed to in writing, Licensor provides the Work (and each Contributor provides its Contributions) on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied, including, without limitation, any warranties or conditions of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A PARTICULAR PURPOSE. You are solely responsible for determining the appropriateness of using or redistributing the Work and assume any risks associated with Your exercise of permissions under this License. 8. Limitation of Liability. In no event and under no legal theory, whether in tort (including negligence), contract, or otherwise, unless required by applicable law (such as deliberate and grossly negligent acts) or agreed to in writing, shall any Contributor be liable to You for damages, including any direct, indirect, special, incidental, or consequential damages of any character arising as a result of this License or out of the use or inability to use the Work (including but not limited to damages for loss of goodwill, work stoppage, computer failure or malfunction, or any and all other commercial damages or losses), even if such Contributor has been advised of the possibility of such damages. 9. Accepting Warranty or Additional Liability. While redistributing the Work or Derivative Works thereof, You may choose to offer, and charge a fee for, acceptance of support, warranty, indemnity, or other liability obligations and/or rights consistent with this License. However, in accepting such obligations, You may act only on Your own behalf and on Your sole responsibility, not on behalf of any other Contributor, and only if You agree to indemnify, defend, and hold each Contributor harmless for any liability incurred by, or claims asserted against, such Contributor by reason of your accepting any such warranty or additional liability. END OF TERMS AND CONDITIONS APPENDIX: How to apply the Apache License to your work. To apply the Apache License to your work, attach the following boilerplate notice, with the fields enclosed by brackets "[]" replaced with your own identifying information. (Don't include the brackets!) The text should be enclosed in the appropriate comment syntax for the file format. We also recommend that a file or class name and description of purpose be included on the same "printed page" as the copyright notice for easier identification within third-party archives. Copyright [yyyy] [name of copyright owner] Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0 Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

简介

keras implement of transformers for humans 展开 收起
Python
Apache-2.0
取消

发行版

暂无发行版

贡献者

全部

近期动态

加载更多
不能加载更多了
1
https://gitee.com/yuluopihu/bert4keras.git
git@gitee.com:yuluopihu/bert4keras.git
yuluopihu
bert4keras
bert4keras
master

搜索帮助