张小亮 / scr2txt

forked from lazytech / scr2txt 

scr2txt

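Introduction

Quickly convert screenshots to text or to tables, based on Baidu PaddlePaddle's paddleocr.

Software Architecture

Implemented in Python 3.7 on top of Baidu PaddlePaddle's paddleocr platform. The main libraries used are: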

  • pyqt
  • pillow

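Direct Download

  • Version 1.0: src2txt.zip (extraction code: ui9f)
    • Supports text recognition
  • Version 1.1: src2txt_v1.1.zip (extraction code: yrmn)
    • Supports text and table recognition
    • Text recognition
    • Table recognition
  • Tested on Windows 10; other systems have not been tested yet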

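Usage

After unzipping, run src2txt.exe directly.

  1. Alt+C: select a screen region to capture as text; the result is saved directly to the clipboard
  2. Alt+T: select a screen region to capture as a table; the result is saved as an Excel file in the table directory
  3. Alt+Q: quit
  • Note: the first run downloads the recognition models online, so expect a short wait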
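
As an illustration of the text-capture step, here is a minimal sketch of turning recognized lines into the single string that would be placed on the clipboard. It is not taken from the scr2txt source. It assumes each result entry has the shape `[box, (text, confidence)]`, where `box` is a list of four `[x, y]` corner points; the function name `ocr_result_to_text` and the `min_confidence` threshold are hypothetical.

```python
def ocr_result_to_text(result, min_confidence=0.5):
    """Join recognized lines top-to-bottom, then left-to-right.

    Assumption: each entry looks like [box, (text, confidence)], with
    box given as four [x, y] corner points.
    """
    kept = []
    for box, (text, confidence) in result:
        if confidence < min_confidence:
            continue  # drop low-confidence detections (hypothetical threshold)
        top = min(point[1] for point in box)
        left = min(point[0] for point in box)
        kept.append((top, left, text))
    kept.sort()  # order by vertical position, then horizontal position
    return "\n".join(text for _, _, text in kept)


# Hand-written sample data standing in for a real recognition result.
sample = [
    [[[10, 40], [90, 40], [90, 60], [10, 60]], ("second line", 0.97)],
    [[[10, 10], [90, 10], [90, 30], [10, 30]], ("first line", 0.99)],
    [[[10, 70], [90, 70], [90, 90], [10, 90]], ("noise", 0.12)],
]
print(ocr_result_to_text(sample))
```

Sorting by the top edge first keeps the reading order stable even when the detector returns boxes in a different order.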

Installation

  1. Install the required dependencies
    pip install -r requirements.txt
    pip install packages/Shapely-1.7.1-cp37-cp37m-win_amd64.whl
    pip install packages/layoutparser-0.0.0-py3-none-any.whl
  2. Package the application
    1. Debug build
    
    SET PADDLEOCR_PATH=C:\Users\leo\anaconda3\envs\paddleocr\Lib\site-packages
    SET CODE_PATH=C:\workspaces\tools\scr2txt
    
    pyinstaller --clean -y -D --clean --exclude matplotlib -p %PADDLEOCR_PATH%\paddle\libs;%PADDLEOCR_PATH%\paddleocr;%PADDLEOCR_PATH%\paddleocr\ppocr\utils\e2e_utils;%PADDLEOCR_PATH%\paddleocr\ppstructure\table scr2txt.py -i scr2txt.ico --add-binary %PADDLEOCR_PATH%\paddle\libs;. --add-data %CODE_PATH%\scr2txt.ico;. --add-data %PADDLEOCR_PATH%\paddleocr\ppocr\utils\ppocr_keys_v1.txt;.\ppocr\utils --add-data %PADDLEOCR_PATH%\paddleocr\ppocr\utils\dict\table_structure_dict.txt;.\ppocr\utils\dict --add-data %PADDLEOCR_PATH%\layoutparser\misc\NotoSerifCJKjp-Regular.otf;.\layoutparser\misc --additional-hooks-dir=. --hidden-import extract_textpoint_slow --hidden-import tablepyxl --hidden-import tablepyxl.style
    2. Release build
    
    SET PADDLEOCR_PATH=C:\Users\leo\anaconda3\envs\paddleocr\Lib\site-packages
    SET CODE_PATH=C:\workspaces\tools\scr2txt
    
    pyinstaller --clean -y -w -F --clean --exclude matplotlib -p %PADDLEOCR_PATH%\paddle\libs;%PADDLEOCR_PATH%\paddleocr;%PADDLEOCR_PATH%\paddleocr\ppocr\utils\e2e_utils;%PADDLEOCR_PATH%\paddleocr\ppstructure\table scr2txt.py -i scr2txt.ico --add-binary %PADDLEOCR_PATH%\paddle\libs;. --add-data %CODE_PATH%\scr2txt.ico;. --add-data %PADDLEOCR_PATH%\paddleocr\ppocr\utils\ppocr_keys_v1.txt;.\ppocr\utils --add-data %PADDLEOCR_PATH%\paddleocr\ppocr\utils\dict\table_structure_dict.txt;.\ppocr\utils\dict --add-data %PADDLEOCR_PATH%\layoutparser\misc\NotoSerifCJKjp-Regular.otf;.\layoutparser\misc --additional-hooks-dir=. --hidden-import extract_textpoint_slow --hidden-import tablepyxl --hidden-import tablepyxl.style --version-file=version.txt
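
The two commands above differ only in a few flags, so the argument list can also be generated. The following is a rough sketch rather than part of the project: the helper name `build_pyinstaller_args` and its parameters are hypothetical, and the paths and options are copied from the commands above. It uses `PureWindowsPath` so the Windows-style paths are built the same way on any platform.

```python
from pathlib import PureWindowsPath


def build_pyinstaller_args(site_packages, code_path, release=False):
    """Return the PyInstaller command as a list (hypothetical helper)."""
    sp = PureWindowsPath(site_packages)
    code = PureWindowsPath(code_path)
    ocr = sp / "paddleocr"
    libs = sp / "paddle" / "libs"
    utils = ocr / "ppocr" / "utils"
    search_paths = [libs, ocr, utils / "e2e_utils", ocr / "ppstructure" / "table"]
    # (source file, destination directory inside the bundle)
    data_files = [
        (code / "scr2txt.ico", "."),
        (utils / "ppocr_keys_v1.txt", r".\ppocr\utils"),
        (utils / "dict" / "table_structure_dict.txt", r".\ppocr\utils\dict"),
        (sp / "layoutparser" / "misc" / "NotoSerifCJKjp-Regular.otf",
         r".\layoutparser\misc"),
    ]

    args = ["pyinstaller", "--clean", "-y"]
    # Release: windowed single-file build; debug: one-folder build with console.
    args += ["-w", "-F"] if release else ["-D"]
    args += ["--exclude", "matplotlib"]
    args += ["-p", ";".join(str(path) for path in search_paths)]
    args += ["scr2txt.py", "-i", "scr2txt.ico"]
    args += ["--add-binary", "{};.".format(libs)]
    for source, target in data_files:
        args += ["--add-data", "{};{}".format(source, target)]
    args.append("--additional-hooks-dir=.")
    for module in ("extract_textpoint_slow", "tablepyxl", "tablepyxl.style"):
        args += ["--hidden-import", module]
    if release:
        args.append("--version-file=version.txt")
    return args


debug_args = build_pyinstaller_args(
    r"C:\Users\leo\anaconda3\envs\paddleocr\Lib\site-packages",
    r"C:\workspaces\tools\scr2txt",
)
print(" ".join(debug_args))
```

The resulting list could be passed to `subprocess.run(debug_args, check=True)`, which avoids shell quoting problems with the `;` separators.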
    

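Other Notes


Summary of PyInstaller packaging issues

  • 1 Missing resources and matplotlib errors

The matplotlib error is avoided by excluding matplotlib with --exclude (this project does not use it). The missing resources are fixed by bundling them with --add-binary and --add-data.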

pyinstaller -D -w --clean --exclude matplotlib -p C:\Anaconda2\envs\paddleocr\Lib\site-packages\paddleocr;C:\Anaconda2\envs\paddleocr\Lib\site-packages\paddle\libs textshot.py -i textshot.ico --add-binary C:\Anaconda2\envs\paddleocr\Lib\site-packages\paddle\libs;. --add-data C:\opencode\ocr\textshot_paddle\model;.\model --additional-hooks-dir=.
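
Files bundled with --add-data also have to be located at run time. A common pattern, sketched here under assumptions, is to resolve them relative to the bundle directory. PyInstaller exposes that directory as `sys._MEIPASS` and sets `sys.frozen` in a packaged app; the helper names `bundle_dir` and `resource_path` are hypothetical and not taken from the project.

```python
import os
import sys


def bundle_dir():
    """Directory that holds bundled resources.

    Inside a PyInstaller bundle this is sys._MEIPASS; otherwise fall back
    to the directory of the script that was started.
    """
    if getattr(sys, "frozen", False) and hasattr(sys, "_MEIPASS"):
        return sys._MEIPASS
    return os.path.dirname(os.path.abspath(sys.argv[0]))


def resource_path(relative_path):
    """Absolute path of a resource that was added with --add-data."""
    return os.path.join(bundle_dir(), relative_path)


# "model" matches the destination used by --add-data ...;.\model above.
print(resource_path("model"))
```

Resolving paths this way keeps the same code working both when run from source and when run from the packaged exe.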
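  • 2 Process relaunches endlessly

  • 2.1 Analysis

After repeated trial and elimination, I found that the mere presence of the statement "from paddleocr import PaddleOCR" causes the process to keep relaunching. Running the packaged program "txt.exe" from the command line and killing it manually (Ctrl+C) shows the following error: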

c:\opencode\ocr\textshot_paddle>C:\opencode\ocr\textshot_paddle\dist\txt\txt.exe
Traceback (most recent call last):
  File "txt.py", line 200, in <module>
    out, err = import_cv2_proc.communicate()
  File "subprocess.py", line 964, in communicate
  File "subprocess.py", line 1296, in _communicate
  File "threading.py", line 1044, in join
  File "threading.py", line 1060, in _wait_for_tstate_lock
KeyboardInterrupt
[448] Failed to execute script txt

So I looked at the code in paddle\dataset\image.py and found the following at line 200:

if six.PY3:
    import subprocess
    import sys
    import_cv2_proc = subprocess.Popen(
        [sys.executable, "-c", "import cv2"],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE)
    out, err = import_cv2_proc.communicate()
    retcode = import_cv2_proc.poll()
    if retcode != 0:
        cv2 = None
    else:
        import cv2
else:
    try:
        import cv2
    except ImportError:
        cv2 = None

Then, based on the PyInstaller issue thread 40674110, I suspected that subprocess.Popen was causing the problem, so I wrote a test program and packaged it to check:


import subprocess
import sys

# Same probe as in paddle\dataset\image.py: try "import cv2" in a child process.
import_cv2_proc = subprocess.Popen(
    [sys.executable, "-c", "import cv2"],
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE)
out, err = import_cv2_proc.communicate()
retcode = import_cv2_proc.poll()
if retcode != 0:
    cv2 = None
else:
    import cv2

# from paddleocr import PaddleOCR

if __name__ == "__main__":
    print("is ok!!!!!!!!!!!!!!!!!")
    args = input('input where you think:')
    print(args)

Sure enough, the problem was reproduced: new processes were launched endlessly.
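
The likely mechanism is that in a PyInstaller bundle `sys.executable` points to the packaged exe rather than to a Python interpreter, so the probe command starts the application again instead of running `import cv2`, and each new copy repeats the probe. The small diagnostic below is only a sketch with hypothetical helper names; it relies on the `sys.frozen` attribute that PyInstaller sets in packaged apps.

```python
import sys


def probe_command(module_name):
    """The command used to test an import in a child process."""
    return [sys.executable, "-c", "import " + module_name]


def probe_relaunches_app():
    """True inside a PyInstaller bundle, where sys.executable is the app itself."""
    return bool(getattr(sys, "frozen", False))


print("executable:", sys.executable)
print("probe command:", probe_command("cv2"))
print("probe would relaunch the app:", probe_relaunches_app())
```

Run from source, the last line reports False; built into an exe, it reports True, which matches the behaviour observed above.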

  • 2.2 Solution

The fix is simple and blunt: modify image.py starting at line 39 and disable the subprocess call.

# if six.PY3:
#     import subprocess
#     import sys
#     import_cv2_proc = subprocess.Popen(
#         [sys.executable, "-c", "import cv2"],
#         stdout=subprocess.PIPE,
#         stderr=subprocess.PIPE)
#     out, err = import_cv2_proc.communicate()
#     retcode = import_cv2_proc.poll()
#     if retcode != 0:
#         cv2 = None
#     else:
#         import cv2
# else:
#     try:
#         import cv2
#     except ImportError:
#         cv2 = None
try:
    import cv2
except ImportError:
    cv2 = None

Problem solved.
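
A possible variant, shown only as a sketch and not as what paddle or this project does, keeps the child-process probe when running from source and skips it inside a bundle. The helper name `import_optional` is hypothetical; the check relies on the `sys.frozen` attribute set by PyInstaller.

```python
import importlib
import subprocess
import sys


def import_optional(module_name):
    """Return the imported module, or None if it cannot be imported."""
    frozen = bool(getattr(sys, "frozen", False))
    if not frozen:
        # Only safe outside a bundle: here sys.executable is a real Python
        # interpreter, so "-c" runs the probe instead of relaunching the app.
        probe = subprocess.run(
            [sys.executable, "-c", "import " + module_name],
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
        )
        if probe.returncode != 0:
            return None
    try:
        return importlib.import_module(module_name)
    except ImportError:
        return None


cv2 = import_optional("cv2")
print("cv2 available:", cv2 is not None)
```

Compared with the plain try/except above, this keeps the protection against an import that crashes the interpreter when running from source, at the cost of a little more code.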

If this software is useful to you, please show your support. It will give me more motivation to keep improving it. Thank you!


MIT License Copyright (c) 2020 Ian Zhao Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
