text-detection-ctpn

Text detection based mainly on CTPN (Connectionist Text Proposal Network), implemented in TensorFlow. ID card detection is used as an example to demonstrate the results, but note that this model can be used for almost any horizontal scene text detection task. The original paper can be found here. The original Caffe repo can be found here. For more detail about the paper and code, see this blog. If you have any questions, check the existing issues first; if the problem persists, open a new issue.


roadmap

  • freeze the graph for convenient inference
  • pure python, cython nms and cuda nms
  • loss function as described in the paper
  • oriented text connector
  • BLSTM
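
As an illustration of the "pure python nms" item above, here is a minimal sketch of greedy non-maximum suppression. It is not the code shipped in this repo: the function names `iou` and `nms`, the `(x1, y1, x2, y2)` box layout, and the default threshold of 0.5 are assumptions made for the example, and it treats coordinates as continuous (no "+1" pixel convention).

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    # Overlap is zero when the boxes do not intersect.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.5):
    """Return the indices of the kept boxes, highest score first."""
    # Visit boxes from the highest score to the lowest.
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap too much with any kept box.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
print(nms(boxes, scores))  # [0, 2]: the second box overlaps the first too much
```

This version is quadratic in the number of boxes, which is the usual reason for moving the same logic to cython or cuda.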

demo

  • for a quick demo, you don't have to build the library; simply use demo_pb.py for inference.
  • first, git clone git@github.com:eragonruan/text-detection-ctpn.git --depth=1
  • then, download the pb file from release
  • put ctpn.pb in data/
  • put your images in data/demo; the results will be saved in data/results. Then run the demo from the repository root
python ./ctpn/demo_pb.py
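
Before running the demo, it can help to confirm that the files from the steps above are where the demo expects them. The following is a small sketch of such a check, assuming only the paths named in this README (data/ctpn.pb and data/demo); the helper name `missing_demo_inputs` is hypothetical and is not part of the repo.

```python
import os


def missing_demo_inputs(root="."):
    """Return the demo paths named in the README that are missing under root."""
    expected = [
        os.path.join("data", "ctpn.pb"),  # frozen graph downloaded from the release
        os.path.join("data", "demo"),     # folder holding the input images
    ]
    return [p for p in expected if not os.path.exists(os.path.join(root, p))]
```

Calling `missing_demo_inputs()` from the repository root returns an empty list when both paths are present, and otherwise lists what still needs to be put in place.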

parameters

There are some parameters you may need to modify according to your requirements. You can find them in ctpn/text.yml:

  • USE_GPU_NMS # whether to use nms implemented in cuda or not
  • DETECT_MODE # H represents horizontal mode, O represents oriented mode, default is H
  • checkpoints_path # the model I provided is in checkpoints/; if you train the model yourself, it will be saved in output/
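
A quick sanity check of these three values can catch typos before a long run. The sketch below assumes the settings have already been loaded into a flat Python dict keyed by the names above; the actual layout of ctpn/text.yml may be nested differently, and `check_settings` is a hypothetical helper, not a function from the repo.

```python
def check_settings(settings):
    """Return a list of human-readable problems found in the settings dict."""
    problems = []
    # USE_GPU_NMS toggles the cuda nms, so it should be a boolean.
    if not isinstance(settings.get("USE_GPU_NMS"), bool):
        problems.append("USE_GPU_NMS should be true or false")
    # DETECT_MODE is H (horizontal) or O (oriented); H is the default.
    if settings.get("DETECT_MODE", "H") not in ("H", "O"):
        problems.append("DETECT_MODE should be 'H' or 'O'")
    # checkpoints_path points at checkpoints/ or output/ per the README.
    if not settings.get("checkpoints_path"):
        problems.append("checkpoints_path should be a non-empty path")
    return problems


example = {"USE_GPU_NMS": True, "DETECT_MODE": "H", "checkpoints_path": "checkpoints/"}
print(check_settings(example))  # []
```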

training

setup

  • requirements: python2.7, tensorflow1.3, cython0.24, opencv-python, easydict (installing Anaconda is recommended)
  • if you do not have a gpu device, follow here to set up
  • if you have a gpu device, build the library by
cd lib/utils
chmod +x make.sh
./make.sh

prepare data

  • First, download the pre-trained VGG net model and put it in data/pretrain/VGG_imagenet.npy. You can download it from google drive or baidu yun.
  • Second, prepare the training data as described in the paper, or download the data I prepared from google drive or baidu yun. Alternatively, prepare your own data with the following steps.
  • Modify path and gt_path in prepare_training_data/split_label.py according to your dataset, then run
cd lib/prepare_training_data
python split_label.py
  • this generates the prepared data in the current folder. Then run
python ToVoc.py
  • to convert the prepared training data into VOC format. This generates a folder named TEXTVOC. Move this folder to data/ and then run
cd ../../data
ln -s TEXTVOC VOCdevkit2007
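
To give a feel for what the label-splitting step is for, here is a sketch of cutting one horizontal ground-truth box into fixed-width strips. It assumes the 16-pixel-wide proposals described in the CTPN paper and integer pixel coordinates; the function name `split_into_strips` and the choice to clip the first and last strip to the box are assumptions for this example, and split_label.py itself may handle edges and image resizing differently.

```python
def split_into_strips(x1, y1, x2, y2, stride=16):
    """Split a horizontal box into strips aligned to a stride-pixel grid."""
    strips = []
    # Left edge of the grid cell that contains x1.
    start = (x1 // stride) * stride
    while start < x2:
        # Clip each strip to the original box so edge strips may be narrower.
        left = max(start, x1)
        right = min(start + stride, x2)
        strips.append((left, y1, right, y2))
        start += stride
    return strips


for strip in split_into_strips(10, 5, 50, 20):
    print(strip)
# (10, 5, 16, 20)
# (16, 5, 32, 20)
# (32, 5, 48, 20)
# (48, 5, 50, 20)
```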

train

Simply run

python ./ctpn/train_net.py
  • you can modify some hyperparameters in ctpn/text.yml, or just use the parameters I set.
  • The model I provided in checkpoints/ was trained on a GTX1070 for 50k iterations.
  • If you are using cuda nms, it takes about 0.2s per iteration, so it will take about 2.5 hours to finish 50k iterations.
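
The time estimate is simple arithmetic, sketched below so it can be redone for other iteration counts or hardware. The helper name `training_hours` is made up for this example. At exactly 0.2 s per iteration the result is about 2.8 hours, so the 2.5-hour figure above corresponds to a slightly faster average (roughly 0.18 s per iteration).

```python
def training_hours(iterations, seconds_per_iteration):
    """Estimate wall-clock training time in hours."""
    return iterations * seconds_per_iteration / 3600.0


print(round(training_hours(50000, 0.2), 2))  # 2.78
```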

some results

NOTICE: all the photos used below are collected from the internet. If it affects you, please contact me to delete them.


oriented text connector

  • the oriented text connector has been implemented. It works, but still needs further improvement.
  • left figure is the result for DETECT_MODE H, right figure for DETECT_MODE O

MIT License

Copyright (c) 2017 shaohui ruan

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
