暮光(rayping) / es_data_export

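Planned for the next release

  • 1. Resumable export (continue from the point where an export stopped).
  • 2. Dynamic input of configuration parameters.
  • 3. Import from one index into another index, and from one ES cluster into another.
  • 4. A scheduled script for running the export. PS: many users have said they need to export data on a schedule.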

About

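This tool exports data from Elasticsearch (ES) and allows some customization of the exported data format and output files. It mainly uses the ES scroll API with multiple threads to export data.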

Design

Base

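  • The project is built with Java.
  • Communication with ES uses the official RestClient.
  • Data is exported with the scroll API. In multi-threaded mode, the ES data is partitioned with slice.
  • The thread pool uses a BlockingQueue as its queue. When the queue is full, the threads fetching data from ES block until space becomes available.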
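
The blocking behavior described in the last bullet can be illustrated with a minimal sketch. It does not reproduce the tool's own classes: the class name, the fake "fetcher" threads, and the DONE sentinel are all hypothetical, and no real ES request is made. It only shows how a bounded BlockingQueue throttles producers when the consumer falls behind.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

/** Hypothetical sketch: per-slice fetcher threads feed one writer through a bounded queue. */
final class BoundedExportPipeline {
    // Reserved marker telling the writer that one fetcher has finished (an assumption of this sketch).
    private static final String DONE = "<done>";

    /** Runs `slices` fake fetchers producing `docsPerSlice` docs each; returns what the writer consumed. */
    static List<String> run(int slices, int docsPerSlice, int queueCapacity) {
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(queueCapacity);
        List<Thread> fetchers = new ArrayList<>();
        for (int s = 0; s < slices; s++) {
            final int sliceId = s;
            Thread t = new Thread(() -> {
                try {
                    for (int i = 0; i < docsPerSlice; i++) {
                        // put() blocks while the queue is full, which throttles the fetcher.
                        queue.put("slice" + sliceId + "-doc" + i);
                    }
                    queue.put(DONE);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            fetchers.add(t);
            t.start();
        }
        List<String> written = new ArrayList<>();
        try {
            int finished = 0;
            while (finished < slices) {
                String item = queue.take(); // blocks while the queue is empty
                if (item.equals(DONE)) {
                    finished++;
                } else {
                    written.add(item); // a real writer would append to a file or a DB batch here
                }
            }
            for (Thread t : fetchers) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while draining the queue", e);
        }
        return written;
    }

    public static void main(String[] args) {
        System.out.println(run(3, 5, 2).size() + " documents written"); // prints "15 documents written"
    }
}
```

A small queue capacity keeps memory bounded at the cost of fetchers waiting more often; a larger one smooths out bursts.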

TODO LIST

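  • Export to files, with txt, json and sql formats.
  • Split exported files: once a file exceeds a given size, continue writing in a new file.
  • Export to a database, supporting all mainstream databases, with druid as the connection pool.
  • After the program stops and restarts, continue exporting from the point where it stopped.
  • Dynamic input of configuration parameters.
  • Import from one index into another index, and from one ES cluster into another.
  • A scheduled script for running the program.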

Version

Version number format: major version.new features.commit count

V1.3.5

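  • 1. Added export of ES data into a database, supporting most mainstream databases.
  • 2. Added support for custom SQL when exporting SQL statements.
  • 3. Renamed configuration keys for readability.
  • 4. Changed the file splitting strategy: files are now split by size instead of by number of files.
  • 5. Code optimizations.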

V1.2.4

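  • 1. Added thread pool monitoring so that the program stops correctly once the export has finished.
  • 2. Added configuration validation before startup and default values for settings.
  • 3. Improved exception logging to make troubleshooting easier.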

V1.2.3

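  • 1. Optimized file writing by buffering through a BlockingQueue.
  • 2. Added support for splitting a file once it reaches a given size.
  • 3. Added support for fetching data over SSL.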

V1.2.2

  • 1. Refactored the code: replaced the self-written HTTP utility with the official RestClient.
  • 2. Added support for fetching ES data with multiple threads.

V1.0.1

  • 1. Implemented single-threaded data export.

Supported

Elasticsearch version    Supported
>= 6.0.0                 yes
>= 5.0.0                 not tested
>= 2.0.0                 not tested
<= 1                     not tested

Running

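You can use the prebuilt package in the build folder directly, or run the following commands to compile it yourself:

$ git clone git://github.com/760515805/es_data_export.git
$ cd es_data_export

If both Ant and Maven are installed, use the following steps:

$ ant
$ cd build
$ vim export.properties
$ ./run.sh

If only Maven is installed, use the following steps:

$ mvn clean package
$ cp export.properties run.sh stop.sh logback.xml target/
$ cd target
$ vim export.properties
$ ./run.sh

Remember to edit the export.properties file before running.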

Development

1. Environment

  • IDE: IntelliJ IDEA or Eclipse
  • Build tool: Maven

2. Initializing the project

  • Open IntelliJ IDEA and import the project
  • Edit the settings in export.properties
  • Run App.java

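Configuration file reference

common.thread_size

Number of threads used to fetch data. It cannot exceed the number of shards of the index or the number of CPU cores. Default: 1

elasticsearch.index

The index to export data from

elasticsearch.document_type

The index type; leave empty if there is none. Types were removed in ES versions after 7.0

elasticsearch.query

The query DSL. It must be a valid ES query and may be left empty. Default: match all documents, with a size of 1000

elasticsearch.includes

The fields to fetch, separated by commas. Leave empty to fetch all fields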
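
As a rough illustration of how the query, includes and slice settings could combine into one scroll search body, here is a minimal sketch. It is not taken from the tool's source: the class and method names are hypothetical, the sketch assumes elasticsearch.query holds the content of the "query" clause, and it reads the default "1000" as the per-request size. The "slice", "size", "query" and "_source" keys are standard ES search body fields.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: builds the JSON body for one slice of a sliced scroll search. */
final class ScrollRequestBodies {
    static String build(int sliceId, int maxSlices, String queryDsl, String includes) {
        // An empty query setting falls back to match_all, as the reference above describes.
        String query = (queryDsl == null || queryDsl.trim().isEmpty())
                ? "{\"match_all\":{}}"
                : queryDsl.trim();
        StringBuilder body = new StringBuilder("{");
        if (maxSlices > 1) {
            // ES only accepts a slice clause when max is greater than 1.
            body.append("\"slice\":{\"id\":").append(sliceId)
                .append(",\"max\":").append(maxSlices).append("},");
        }
        body.append("\"size\":1000,\"query\":").append(query);
        List<String> quoted = new ArrayList<>();
        if (includes != null) {
            for (String field : includes.split(",")) {
                if (!field.trim().isEmpty()) {
                    quoted.add("\"" + field.trim() + "\"");
                }
            }
        }
        if (!quoted.isEmpty()) {
            // Restrict the returned _source to the configured fields.
            body.append(",\"_source\":[").append(String.join(",", quoted)).append("]");
        }
        return body.append("}").toString();
    }

    public static void main(String[] args) {
        System.out.println(build(0, 2, "", "phone,msgcode"));
    }
}
```

With two threads, each thread would send a body with the same max and its own id (0 or 1), so the slices together cover the whole result set.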

elasticsearch.hosts

Addresses of the ES cluster nodes, separated by commas, e.g. 192.169.2.98:9200,192.169.2.156:9200,192.169.2.188:9200
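
A comma-separated host list like the one above can be split into host and port pairs before being handed to an HTTP client. The following is only a sketch with hypothetical names; falling back to port 9200 when no port is given is an assumption of this example, not documented behavior of the tool.

```java
import java.net.InetSocketAddress;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: parses "host:port,host:port" into unresolved socket addresses. */
final class EsHosts {
    static final int DEFAULT_PORT = 9200; // assumption: used only when an entry has no port

    static List<InetSocketAddress> parse(String hosts) {
        List<InetSocketAddress> result = new ArrayList<>();
        for (String entry : hosts.split(",")) {
            String e = entry.trim();
            if (e.isEmpty()) {
                continue; // tolerate trailing or doubled commas
            }
            int colon = e.lastIndexOf(':');
            String host = colon < 0 ? e : e.substring(0, colon);
            int port = colon < 0 ? DEFAULT_PORT : Integer.parseInt(e.substring(colon + 1));
            // createUnresolved avoids a DNS lookup while parsing.
            result.add(InetSocketAddress.createUnresolved(host, port));
        }
        return result;
    }

    public static void main(String[] args) {
        for (InetSocketAddress a : parse("192.169.2.98:9200,192.169.2.156:9200")) {
            System.out.println(a.getHostString() + " -> " + a.getPort());
        }
    }
}
```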

elasticsearch.username

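Username, if the cluster requires authentication; otherwise leave empty

elasticsearch.password

Password, if the cluster requires authentication; otherwise leave empty

elasticsearch.ssl_type

SSL type

elasticsearch.ssl_keystorepath

Path to the keystore file

elasticsearch.ssl_keystorepass

Keystore password

file.enabled

Whether to write output to files. Default: false

file.datalayout

Output data format. Currently supports json, txt and sql. With txt, fields are separated by commas. Default: json

file.field_split

Field separator when datalayout=txt. Defaults to an ASCII comma if not set

file.field_sort

Output order of the fields when datalayout=txt, separated by commas. The names must match the field names in the index; a fixed order prevents columns from being mixed up

file.need_field_name

Whether to include field names when datalayout=txt. Default: false. When set to true, the output looks like: fieldName1=fieldValue1,fieldName2=fieldValue2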
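
The three txt settings above (field_split, field_sort, need_field_name) can be pictured together with a short sketch. The class and method names are hypothetical, and writing an empty string for a missing field is an assumption of this example rather than documented behavior.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.StringJoiner;

/** Hypothetical sketch: formats one ES document as a txt line. */
final class TxtLineFormatter {
    static String format(Map<String, Object> doc, String fieldSort,
                         String fieldSplit, boolean needFieldName) {
        // An unset separator falls back to an ASCII comma.
        String separator = (fieldSplit == null || fieldSplit.isEmpty()) ? "," : fieldSplit;
        StringJoiner line = new StringJoiner(separator);
        // Iterating over field_sort (not over the document) keeps the column order stable.
        for (String raw : fieldSort.split(",")) {
            String field = raw.trim();
            if (field.isEmpty()) {
                continue;
            }
            Object value = doc.get(field);
            String text = value == null ? "" : value.toString(); // assumption: missing -> empty
            line.add(needFieldName ? field + "=" + text : text);
        }
        return line.toString();
    }

    public static void main(String[] args) {
        Map<String, Object> doc = new HashMap<>();
        doc.put("msgcode", 123);
        doc.put("phone", "13800000000");
        System.out.println(format(doc, "phone,msgcode", "", false)); // 13800000000,123
        System.out.println(format(doc, "phone,msgcode", "|", true)); // phone=13800000000|msgcode=123
    }
}
```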

file.sql_format

The SQL format to output when datalayout=sql, e.g. INSERT INTO test (phone,msgcode) VALUES (#param{phone},123); where #param{fieldName} is replaced with the value of that ES field
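
To make the #param{fieldName} substitution concrete, here is a minimal sketch of rendering one document into a SQL line. The names are hypothetical, and the quoting rules (numbers written bare, other values single-quoted with quotes doubled, missing values written as NULL) are assumptions of this example; the tool's real rules may differ.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch: fills #param{field} placeholders with values from one document. */
final class SqlTemplateRenderer {
    private static final Pattern PARAM = Pattern.compile("#param\\{([^}]+)\\}");

    static String render(String template, Map<String, Object> doc) {
        Matcher m = PARAM.matcher(template);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            Object value = doc.get(m.group(1).trim());
            // quoteReplacement keeps '$' and '\' in values from being read as group references.
            m.appendReplacement(out, Matcher.quoteReplacement(toLiteral(value)));
        }
        m.appendTail(out);
        return out.toString();
    }

    /** Assumed literal rules for this sketch only. */
    static String toLiteral(Object value) {
        if (value == null) {
            return "NULL";
        }
        if (value instanceof Number) {
            return value.toString();
        }
        return "'" + value.toString().replace("'", "''") + "'";
    }

    public static void main(String[] args) {
        Map<String, Object> doc = new HashMap<>();
        doc.put("phone", "13800000000");
        System.out.println(render(
            "INSERT INTO test (phone,msgcode) VALUES (#param{phone},123);", doc));
    }
}
```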

file.linefeed

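Whether to write a line break after each exported record. Default: true

file.filepath

Output path for the data files. Required

file.filename

Output file name. Default if not set: index

file.max_filesize

Maximum size of each file before it is split, in KB. Set this only if splitting is needed; the actual file size is approximate. Leave empty if files should not be split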
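
One way size-based splitting could work, and why the resulting sizes are only approximate, is sketched below. The class name and the "_0", "_1" file suffixes are hypothetical. The sketch checks the limit after a record has been counted, so a file can exceed the limit by up to one record.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical sketch: decides which file each record goes to, rolling over by size. */
final class SizeRollingPolicy {
    private final String baseName;
    private final long maxBytes; // 0 or less disables splitting
    private int fileIndex = 0;
    private long bytesInCurrent = 0;

    SizeRollingPolicy(String baseName, long maxKb) {
        this.baseName = baseName;
        this.maxBytes = maxKb * 1024;
    }

    /** Returns the file name for this record, then accounts for the record's size. */
    String fileFor(String record) {
        if (maxBytes <= 0) {
            return baseName;
        }
        String target = baseName + "_" + fileIndex;
        bytesInCurrent += record.getBytes(StandardCharsets.UTF_8).length;
        if (bytesInCurrent >= maxBytes) {
            // The limit is checked after counting, so this file may overshoot slightly.
            fileIndex++;
            bytesInCurrent = 0;
        }
        return target;
    }

    public static void main(String[] args) {
        SizeRollingPolicy policy = new SizeRollingPolicy("index", 1); // 1 KB per file
        String record = new String(new char[600]).replace('\0', 'a'); // 600 bytes
        System.out.println(policy.fileFor(record)); // index_0
        System.out.println(policy.fileFor(record)); // index_0 (file reaches 1200 bytes, then rolls)
        System.out.println(policy.fileFor(record)); // index_1
    }
}
```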

file.custom_field_name

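Custom field names: after a field is read from the index, it is renamed to the given name. Format: originalName:newName, with multiple pairs separated by commas, e.g. phone:telphone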
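
The renaming rule can be sketched in two steps: parse the originalName:newName pairs, then apply them to each document. The class and method names below are hypothetical and the sketch silently skips malformed pairs, which is an assumption of this example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch: renames document fields according to "old:new,old2:new2". */
final class FieldRenamer {
    static Map<String, String> parseMapping(String config) {
        Map<String, String> mapping = new LinkedHashMap<>();
        if (config == null) {
            return mapping;
        }
        for (String pair : config.split(",")) {
            String[] parts = pair.split(":", 2);
            // Assumption: pairs without both sides are ignored rather than rejected.
            if (parts.length == 2 && !parts[0].trim().isEmpty() && !parts[1].trim().isEmpty()) {
                mapping.put(parts[0].trim(), parts[1].trim());
            }
        }
        return mapping;
    }

    static Map<String, Object> rename(Map<String, Object> doc, Map<String, String> mapping) {
        Map<String, Object> renamed = new LinkedHashMap<>();
        for (Map.Entry<String, Object> e : doc.entrySet()) {
            // Fields without a mapping keep their original name.
            renamed.put(mapping.getOrDefault(e.getKey(), e.getKey()), e.getValue());
        }
        return renamed;
    }

    public static void main(String[] args) {
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("phone", "13800000000");
        doc.put("msgcode", 123);
        System.out.println(rename(doc, parseMapping("phone:telphone")));
        // prints {telphone=13800000000, msgcode=123}
    }
}
```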

db.enabled

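Whether to enable writing to a database

db.jdbc_driver_library

Path to the JDBC driver jar, e.g. lib/mysql-connector-java-5.1.47.jar. You can point this to the driver version that matches your own database

db.jdbc_connection_string

Database connection string, e.g. jdbc:sqlserver://192.169.2.203:1433;DatabaseName=db_phone_sa_center

db.jdbc_driver_class

JDBC driver class, e.g. com.mysql.jdbc.Driver

db.jdbc_user

Database username

db.jdbc_password

Database password

db.jdbc_template

Insert statement template, where #param{ES field} takes the value of that ES field, e.g. INSERT INTO test111 (name) VALUES (#param{simuid})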
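
For database writes, one plausible way to use such a template is to turn each #param{...} into a JDBC "?" placeholder and remember the field order, so values can be bound safely. This is a sketch with hypothetical names; the source does not say whether the tool works this way.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch: converts a #param{field} template into SQL with "?" plus a field list. */
final class JdbcTemplateParser {
    private static final Pattern PARAM = Pattern.compile("#param\\{([^}]+)\\}");

    final String sql;          // statement text with "?" placeholders
    final List<String> fields; // ES field names, in placeholder order

    private JdbcTemplateParser(String sql, List<String> fields) {
        this.sql = sql;
        this.fields = Collections.unmodifiableList(fields);
    }

    static JdbcTemplateParser parse(String template) {
        Matcher m = PARAM.matcher(template);
        List<String> fields = new ArrayList<>();
        StringBuffer sql = new StringBuffer();
        while (m.find()) {
            fields.add(m.group(1).trim());
            m.appendReplacement(sql, "?");
        }
        m.appendTail(sql);
        return new JdbcTemplateParser(sql.toString(), fields);
    }

    public static void main(String[] args) {
        JdbcTemplateParser parsed = parse("INSERT INTO test111 (name) VALUES (#param{simuid})");
        System.out.println(parsed.sql);    // INSERT INTO test111 (name) VALUES (?)
        System.out.println(parsed.fields); // [simuid]
    }
}
```

With standard JDBC, the parsed statement could then be prepared once, each document bound with PreparedStatement.setObject(i + 1, value) in field order, queued with addBatch(), and flushed with executeBatch() whenever the batch reaches the configured size.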

db.jdbc_size

Maximum number of rows inserted per batch. Default: 10000

db.jdbc_write_thread_size

Number of threads writing to the database concurrently. Default: 1
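
The defaults listed in this reference can be applied when the properties are loaded. The sketch below uses java.util.Properties and covers only a few keys; the class name and the rule that a blank value also falls back to the default are assumptions of this example.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

/** Hypothetical sketch: reads a few export.properties keys and applies the documented defaults. */
final class ExportConfig {
    final int threadSize;
    final boolean fileEnabled;
    final String dataLayout;
    final String fieldSplit;
    final int jdbcSize;

    private ExportConfig(Properties p) {
        threadSize = Integer.parseInt(get(p, "common.thread_size", "1"));
        fileEnabled = Boolean.parseBoolean(get(p, "file.enabled", "false"));
        dataLayout = get(p, "file.datalayout", "json");
        fieldSplit = get(p, "file.field_split", ",");
        jdbcSize = Integer.parseInt(get(p, "db.jdbc_size", "10000"));
    }

    /** Assumption: a key that is present but blank is treated like a missing key. */
    private static String get(Properties p, String key, String defaultValue) {
        String value = p.getProperty(key);
        return (value == null || value.trim().isEmpty()) ? defaultValue : value.trim();
    }

    /** Parses properties text; a real program would read the export.properties file instead. */
    static ExportConfig fromText(String text) {
        Properties p = new Properties();
        try {
            p.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new ExportConfig(p);
    }

    public static void main(String[] args) {
        ExportConfig c = fromText("elasticsearch.index=test\nfile.enabled=true\ncommon.thread_size=\n");
        System.out.println(c.threadSize + " " + c.fileEnabled + " " + c.dataLayout + " " + c.jdbcSize);
        // prints: 1 true json 10000
    }
}
```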

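Thread pool settings

How large should threadSize be? The precedence is: CPU cores > shards > configured value. In other words, the configured value cannot exceed the number of CPU cores or the number of shards of the index. For example, on an 8-core machine with 15 shards and a configured value of 20, 8 threads are used. On an 8-core machine with 15 shards and a configured value of 7, 7 threads are used.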
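
The rule above amounts to taking the smallest of the three numbers. A minimal sketch, with hypothetical names and with a lower bound of 1 added as an assumption (matching the documented default of common.thread_size):

```java
/** Hypothetical sketch: effective thread count = min(configured, shards, CPU cores). */
final class ThreadSizing {
    static int effective(int configured, int shards, int cpuCores) {
        int capped = Math.min(configured, Math.min(shards, cpuCores));
        return Math.max(1, capped); // assumption: never go below one thread
    }

    public static void main(String[] args) {
        System.out.println(effective(20, 15, 8)); // 8
        System.out.println(effective(7, 15, 8));  // 7
        // On a real machine the core count could come from the JVM:
        int cores = Runtime.getRuntime().availableProcessors();
        System.out.println(effective(20, 15, cores));
    }
}
```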

Contact the author

QQ:760515805

wx:chj-95


Introduction

ES data export tool (Elasticsearch data export tool): exports Elasticsearch data to files or to a database. All 6.x versions are currently supported; support for lower versions will follow.
Apache-2.0