449 Star 3.5K Fork 852

PaddlePaddle / PaddleOCR

加入 Gitee
与超过 1200万 开发者一起发现、参与优秀开源项目,私有仓库也完全免费 :)
免费加入
克隆/下载
贡献代码
同步代码
取消
提示: 由于 Git 不支持空文件夾,创建文件夹后会生成空的 .keep 文件
Loading...
README
Apache-2.0

English | 简体中文 | हिन्दी | 日本語 | 한국인 | Pу́сский язы́к

简介

PaddleOCR旨在打造一套丰富、领先、且实用的OCR工具库,助力开发者训练出更好的模型,并应用落地。

📣 近期更新

  • 🔥2023.3.10 PaddleOCR集成了高性能、全场景模型部署方案FastDeploy,欢迎参考指南试用(注意使用dygraph分支)。

  • 📚2022.12 发布《OCR产业范例20讲》电子书,新增蒙古文、身份证、液晶屏缺陷等7个场景应用范例

  • 🔨2022.11 新增实现4种前沿算法:文本检测 DRRG, 文本识别 RFL, 文本超分Text Telescope,公式识别CAN

  • 2022.10 优化JS版PP-OCRv3模型:模型大小仅4.3M,预测速度提升8倍,配套web demo开箱即用

  • 💥 直播回放:PaddleOCR研发团队详解PP-StructureV2优化策略。微信扫描下方二维码,关注公众号并填写问卷后进入官方交流群,获取直播回放链接与20G重磅OCR学习大礼包(内含PDF转Word应用程序、10种垂类模型、《动手学OCR》电子书等)

  • 🔥2022.8.24 发布 PaddleOCR release/2.6

    • 发布PP-StructureV2,系统功能性能全面升级,适配中文场景,新增支持版面复原,支持一行命令完成PDF转Word
    • 版面分析模型优化:模型存储减少95%,速度提升11倍,平均CPU耗时仅需41ms;
    • 表格识别模型优化:设计3大优化策略,预测耗时不变情况下,模型精度提升6%;
    • 关键信息抽取模型优化:设计视觉无关模型结构,语义实体识别精度提升2.8%,关系抽取精度提升9.1%。
  • 2022.8 发布 OCR场景应用集合:包含数码管、液晶屏、车牌、高精度SVTR模型、手写体识别等9个垂类模型,覆盖通用,制造、金融、交通行业的主要OCR垂类应用。

  • 2022.8 新增实现8种前沿算法

  • 2022.5.9 发布 PaddleOCR release/2.5

    • 发布PP-OCRv3,速度可比情况下,中文场景效果相比于PP-OCRv2再提升5%,英文场景提升11%,80语种多语言模型平均识别准确率提升5%以上;
    • 发布半自动标注工具PPOCRLabelv2:新增表格文字图像、图像关键信息抽取任务和不规则文字图像的标注功能;
    • 发布OCR产业落地工具集:打通22种训练部署软硬件环境与方式,覆盖企业90%的训练部署环境需求;
    • 发布交互式OCR开源电子书《动手学OCR》,覆盖OCR全栈技术的前沿理论与代码实践,并配套教学视频。

更多

🌟 特性

支持多种OCR相关前沿算法,在此基础上打造产业级特色模型PP-OCRPP-Structure,并打通数据生产、模型训练、压缩、预测部署全流程。

上述内容的使用方法建议从文档教程中的快速开始体验

⚡ 快速开始

📚《动手学OCR》电子书

👫 开源社区

  • 📑项目合作: 如果您是企业开发者且有明确的OCR垂类应用需求,填写问卷后可免费与官方团队展开不同层次的合作。
  • 👫加入社区: 微信扫描二维码并填写问卷之后,加入交流群领取20G重磅OCR学习大礼包
    • 包括《动手学OCR》电子书 ,配套讲解视频和notebook项目;PaddleOCR历次发版直播课回放链接
    • OCR场景应用模型集合: 包含数码管、液晶屏、车牌、高精度SVTR模型、手写体识别等垂类模型,覆盖通用,制造、金融、交通行业的主要OCR垂类应用。
    • PDF2Word应用程序;OCR社区优秀开发者项目分享视频。
  • 🏅️社区项目社区项目文档中包含了社区用户使用PaddleOCR开发的各种工具、应用以及为PaddleOCR贡献的功能、优化的文档与代码等,是官方为社区开发者打造的荣誉墙,也是帮助优质项目宣传的广播站。
  • 🎁社区常规赛:社区常规赛是面向OCR开发者的积分赛事,覆盖文档、代码、模型和应用四大类型,以季度为单位评选并发放奖励,赛题详情与报名方法可参考链接

PaddleOCR官方交流群二维码

🛠️ PP-OCR系列模型列表(更新中)

模型简介 模型名称 推荐场景 检测模型 方向分类器 识别模型
中英文超轻量PP-OCRv3模型(16.2M) ch_PP-OCRv3_xx 移动端&服务器端 推理模型 / 训练模型 推理模型 / 训练模型 推理模型 / 训练模型
英文超轻量PP-OCRv3模型(13.4M) en_PP-OCRv3_xx 移动端&服务器端 推理模型 / 训练模型 推理模型 / 训练模型 推理模型 / 训练模型

PaddleOCR场景应用模型

行业 类别 亮点 文档说明 模型下载
制造 数码管识别 数码管数据合成、漏识别调优 光功率计数码管字符识别 下载链接
金融 通用表单识别 多模态通用表单结构化提取 多模态表单识别 下载链接
交通 车牌识别 多角度图像处理、轻量模型、端侧部署 轻量级车牌识别 下载链接
  • 更多制造、金融、交通行业的主要OCR垂类应用模型(如电表、液晶屏、高精度SVTR模型等),可参考场景应用模型下载

📖 文档教程

👀 效果展示 more

PP-OCRv3 中文模型
PP-OCRv3 英文模型
PP-OCRv3 多语言模型
PP-Structure 文档分析
  • 版面分析+表格识别
  • SER(语义实体识别)
  • RE(关系提取)

许可证书

本项目的发布受Apache 2.0 license许可认证。

Copyright (c) 2016 PaddlePaddle Authors. All Rights Reserved Apache License Version 2.0, January 2004 http://www.apache.org/licenses/ TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION 1. Definitions. "License" shall mean the terms and conditions for use, reproduction, and distribution as defined by Sections 1 through 9 of this document. "Licensor" shall mean the copyright owner or entity authorized by the copyright owner that is granting the License. "Legal Entity" shall mean the union of the acting entity and all other entities that control, are controlled by, or are under common control with that entity. For the purposes of this definition, "control" means (i) the power, direct or indirect, to cause the direction or management of such entity, whether by contract or otherwise, or (ii) ownership of fifty percent (50%) or more of the outstanding shares, or (iii) beneficial ownership of such entity. "You" (or "Your") shall mean an individual or Legal Entity exercising permissions granted by this License. "Source" form shall mean the preferred form for making modifications, including but not limited to software source code, documentation source, and configuration files. "Object" form shall mean any form resulting from mechanical transformation or translation of a Source form, including but not limited to compiled object code, generated documentation, and conversions to other media types. "Work" shall mean the work of authorship, whether in Source or Object form, made available under the License, as indicated by a copyright notice that is included in or attached to the work (an example is provided in the Appendix below). "Derivative Works" shall mean any work, whether in Source or Object form, that is based on (or derived from) the Work and for which the editorial revisions, annotations, elaborations, or other modifications represent, as a whole, an original work of authorship. For the purposes of this License, Derivative Works shall not include works that remain separable from, or merely link (or bind by name) to the interfaces of, the Work and Derivative Works thereof. "Contribution" shall mean any work of authorship, including the original version of the Work and any modifications or additions to that Work or Derivative Works thereof, that is intentionally submitted to Licensor for inclusion in the Work by the copyright owner or by an individual or Legal Entity authorized to submit on behalf of the copyright owner. For the purposes of this definition, "submitted" means any form of electronic, verbal, or written communication sent to the Licensor or its representatives, including but not limited to communication on electronic mailing lists, source code control systems, and issue tracking systems that are managed by, or on behalf of, the Licensor for the purpose of discussing and improving the Work, but excluding communication that is conspicuously marked or otherwise designated in writing by the copyright owner as "Not a Contribution." "Contributor" shall mean Licensor and any individual or Legal Entity on behalf of whom a Contribution has been received by Licensor and subsequently incorporated within the Work. 2. Grant of Copyright License. Subject to the terms and conditions of this License, each Contributor hereby grants to You a perpetual, worldwide, non-exclusive, no-charge, royalty-free, irrevocable copyright license to reproduce, prepare Derivative Works of, publicly display, publicly perform, sublicense, and distribute the Work and such Derivative Works in Source or Object form. 3. Grant of Patent License. Subject to the terms and conditions of this License, each Contributor hereby grants to You a perpetual, worldwide, non-exclusive, no-charge, royalty-free, irrevocable (except as stated in this section) patent license to make, have made, use, offer to sell, sell, import, and otherwise transfer the Work, where such license applies only to those patent claims licensable by such Contributor that are necessarily infringed by their Contribution(s) alone or by combination of their Contribution(s) with the Work to which such Contribution(s) was submitted. If You institute patent litigation against any entity (including a cross-claim or counterclaim in a lawsuit) alleging that the Work or a Contribution incorporated within the Work constitutes direct or contributory patent infringement, then any patent licenses granted to You under this License for that Work shall terminate as of the date such litigation is filed. 4. Redistribution. You may reproduce and distribute copies of the Work or Derivative Works thereof in any medium, with or without modifications, and in Source or Object form, provided that You meet the following conditions: (a) You must give any other recipients of the Work or Derivative Works a copy of this License; and (b) You must cause any modified files to carry prominent notices stating that You changed the files; and (c) You must retain, in the Source form of any Derivative Works that You distribute, all copyright, patent, trademark, and attribution notices from the Source form of the Work, excluding those notices that do not pertain to any part of the Derivative Works; and (d) If the Work includes a "NOTICE" text file as part of its distribution, then any Derivative Works that You distribute must include a readable copy of the attribution notices contained within such NOTICE file, excluding those notices that do not pertain to any part of the Derivative Works, in at least one of the following places: within a NOTICE text file distributed as part of the Derivative Works; within the Source form or documentation, if provided along with the Derivative Works; or, within a display generated by the Derivative Works, if and wherever such third-party notices normally appear. The contents of the NOTICE file are for informational purposes only and do not modify the License. You may add Your own attribution notices within Derivative Works that You distribute, alongside or as an addendum to the NOTICE text from the Work, provided that such additional attribution notices cannot be construed as modifying the License. You may add Your own copyright statement to Your modifications and may provide additional or different license terms and conditions for use, reproduction, or distribution of Your modifications, or for any such Derivative Works as a whole, provided Your use, reproduction, and distribution of the Work otherwise complies with the conditions stated in this License. 5. Submission of Contributions. Unless You explicitly state otherwise, any Contribution intentionally submitted for inclusion in the Work by You to the Licensor shall be under the terms and conditions of this License, without any additional terms or conditions. Notwithstanding the above, nothing herein shall supersede or modify the terms of any separate license agreement you may have executed with Licensor regarding such Contributions. 6. Trademarks. This License does not grant permission to use the trade names, trademarks, service marks, or product names of the Licensor, except as required for reasonable and customary use in describing the origin of the Work and reproducing the content of the NOTICE file. 7. Disclaimer of Warranty. Unless required by applicable law or agreed to in writing, Licensor provides the Work (and each Contributor provides its Contributions) on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied, including, without limitation, any warranties or conditions of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A PARTICULAR PURPOSE. You are solely responsible for determining the appropriateness of using or redistributing the Work and assume any risks associated with Your exercise of permissions under this License. 8. Limitation of Liability. In no event and under no legal theory, whether in tort (including negligence), contract, or otherwise, unless required by applicable law (such as deliberate and grossly negligent acts) or agreed to in writing, shall any Contributor be liable to You for damages, including any direct, indirect, special, incidental, or consequential damages of any character arising as a result of this License or out of the use or inability to use the Work (including but not limited to damages for loss of goodwill, work stoppage, computer failure or malfunction, or any and all other commercial damages or losses), even if such Contributor has been advised of the possibility of such damages. 9. Accepting Warranty or Additional Liability. While redistributing the Work or Derivative Works thereof, You may choose to offer, and charge a fee for, acceptance of support, warranty, indemnity, or other liability obligations and/or rights consistent with this License. However, in accepting such obligations, You may act only on Your own behalf and on Your sole responsibility, not on behalf of any other Contributor, and only if You agree to indemnify, defend, and hold each Contributor harmless for any liability incurred by, or claims asserted against, such Contributor by reason of your accepting any such warranty or additional liability. END OF TERMS AND CONDITIONS APPENDIX: How to apply the Apache License to your work. To apply the Apache License to your work, attach the following boilerplate notice, with the fields enclosed by brackets "[]" replaced with your own identifying information. (Don't include the brackets!) The text should be enclosed in the appropriate comment syntax for the file format. We also recommend that a file or class name and description of purpose be included on the same "printed page" as the copyright notice for easier identification within third-party archives. Copyright (c) 2016 PaddlePaddle Authors. All Rights Reserved. Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0 Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

简介

基于飞桨的OCR工具库,包含总模型仅8.6M的超轻量级中文OCR,单模型支持中英文数字组合识别、竖排文本识别、长文本识别。同时支持多种文本检测、文本识别的训练算法。 展开 收起
Python 等 6 种语言
Apache-2.0
取消

发行版

暂无发行版

贡献者

全部

近期动态

加载更多
不能加载更多了
Python
1
https://gitee.com/paddlepaddle/PaddleOCR.git
git@gitee.com:paddlepaddle/PaddleOCR.git
paddlepaddle
PaddleOCR
PaddleOCR
main

搜索帮助