王霆宇 / openGauss-tools-datachecker-performance

gs_datachecker

Description

openGauss data verification tool that supports both full and incremental data verification.

Software Architecture

Full data verification: the source and target data are extracted over JDBC, and the extraction results are temporarily stored in Kafka. The verification service obtains the extraction results for a specified table from Kafka through the extraction service and verifies them. Finally, the verification results are written to a file in the specified path.

Incremental data verification: Debezium monitors the data change records of the source database. The extraction service periodically processes these change records and compiles statistics on them, then sends the statistical results to the data verification service. The data verification service initiates incremental verification and writes the verification results to a file in the specified path.

Installation environment requirements:

JDK 11+
Kafka installed, with the ZooKeeper and Kafka services started
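
The JDK requirement can be checked before anything else is installed. The following is a minimal sketch, not part of the tool: the helper names `java_major_version` and `check_jdk` are hypothetical, and the parsing assumes the usual `java -version` output format (for example `openjdk version "11.0.2"`).

```shell
#!/bin/sh
# Hypothetical helpers (not shipped with gs_datachecker).

# Print the major version for a Java version string.
# "1.8.0_281" -> 8 (legacy scheme), "11.0.2" -> 11, "17" -> 17
java_major_version() {
    case "$1" in
        1.*) echo "$1" | cut -d. -f2 ;;
        *)   echo "$1" | cut -d. -f1 | cut -d- -f1 ;;
    esac
}

# Return 0 if the java on PATH is version 11 or later, 1 otherwise.
check_jdk() {
    command -v java >/dev/null 2>&1 || { echo "java not found on PATH" >&2; return 1; }
    # java -version writes to stderr; extract the quoted version string.
    version=$(java -version 2>&1 | sed -n 's/.*version "\([^"]*\)".*/\1/p' | head -n 1)
    major=$(java_major_version "$version")
    if [ "${major:-0}" -ge 11 ] 2>/dev/null; then
        echo "JDK $version is sufficient"
    else
        echo "JDK 11+ required, found: ${version:-unknown}" >&2
        return 1
    fi
}
```

Calling `check_jdk` at the top of a deployment script makes it stop early when an older JDK is on the PATH.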

Installation

  1. Download and start Kafka
  2. Obtain the data verification service jar packages and the configuration file templates (datachecker-check.jar/datachecker-extract.jar, application.yml, application-sink.yml, application-source.yml)
  3. Copy the jar packages and configuration files to the target directory on the server, edit the configuration files, and start the corresponding jar services.

Instructions

Start zookeeper

cd {path}/confluent-7.2.0

Start the ZooKeeper service

bin/zookeeper-server-start etc/kafka/zookeeper.properties
or
bin/zookeeper-server-start -daemon etc/kafka/zookeeper.properties

Start Kafka

Start the Kafka broker service

bin/kafka-server-start etc/kafka/server.properties
or
bin/kafka-server-start -daemon etc/kafka/server.properties
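
When the broker is started with -daemon, the command returns immediately, so later steps may run before Kafka is ready. The sketch below shows one way to wait for it. The `retry` helper is hypothetical, and the address localhost:9092 is an assumption (the usual default listener); `bin/kafka-topics` is the topic tool that ships with the Confluent distribution.

```shell
#!/bin/sh
# Hypothetical helper: run a command up to N times, sleeping between attempts.
# Usage: retry <attempts> <delay-seconds> <command> [args...]
retry() {
    attempts=$1
    delay=$2
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        # Succeed as soon as the command exits with status 0.
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        i=$((i + 1))
        if [ "$i" -le "$attempts" ]; then
            sleep "$delay"
        fi
    done
    return 1
}

# Example (assumes the broker listens on localhost:9092):
#   retry 30 2 bin/kafka-topics --bootstrap-server localhost:9092 --list \
#       || echo "Kafka broker did not become ready" >&2
```

Listing topics is used as the readiness probe because it only succeeds once the broker accepts client requests.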

Start Kafka Connect (required for incremental verification)

# Create the connector configuration
vi etc/kafka/mysql-conect.properties

name=mysql-connect-all
connector.class=io.debezium.connector.mysql.MySqlConnector
database.hostname=
database.port=3306
database.user=root
database.password=test@123
database.server.id=1
database.server.name=mysql_debezium_connect-all
database.whitelist=test
database.history.kafka.bootstrap.servers=
database.history.kafka.topic=mysql_test_topic-all
include.schema.changes=true
transforms=Reroute
transforms.Reroute.type=io.debezium.transforms.ByLogicalTableRouter
transforms.Reroute.topic.regex=(.*)test(.*)
transforms.Reroute.topic.replacement=data_check_test_all

# Start the Kafka connect service 
bin/connect-standalone -daemon etc/kafka/connect-standalone.properties etc/kafka/mysql-conect.properties
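
In the template above, database.hostname and database.history.kafka.bootstrap.servers are left empty and have to be filled in before the connector can start. A small sketch for catching unfilled values follows; the function names `empty_properties` and `check_connector_config` are hypothetical and not part of Kafka or Debezium.

```shell
#!/bin/sh
# Hypothetical helpers for a Java-style .properties file.

# Print every key whose value is empty.
empty_properties() {
    awk '
        /^[ \t]*#/ || !/=/ { next }      # skip comments and lines without "="
        {
            key = $0; sub(/=.*/, "", key)        # text before the first "="
            val = $0; sub(/^[^=]*=/, "", val)    # text after the first "="
            gsub(/^[ \t]+|[ \t]+$/, "", key)
            gsub(/^[ \t]+|[ \t]+$/, "", val)
            if (val == "") print key
        }
    ' "$1"
}

# Return 1 and list the offending keys if any value is still empty.
check_connector_config() {
    missing=$(empty_properties "$1")
    if [ -n "$missing" ]; then
        echo "Fill in these properties before starting Kafka Connect:" >&2
        echo "$missing" >&2
        return 1
    fi
}

# Example:
#   check_connector_config etc/kafka/mysql-conect.properties
```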

Verification service startup configuration

To configure the verification service, modify the application.yml file:
	server.port is the web port of the verification service; the default usually does not need to be changed
	logging.config is the absolute path of the log configuration file config/log4j2.xml
	bootstrap-servers is the address of the Kafka broker; with a default Kafka installation it does not need to be changed
	data.check.data-path is the output path of the verification results; the default usually does not need to be changed
	data.check.source-uri is the request address of the source-side service; the default usually does not need to be changed
	data.check.sink-uri is the request address of the target-side service; the default usually does not need to be changed

Source-side service startup configuration

To configure the source-side service, modify the application-source.yml file:
	server.port is the web port of the source-side extraction service; the default usually does not need to be changed
	logging.config is the absolute path of the log configuration file config/log4j2source.xml
	spring.check.server-uri is the request address of the verification service; the default usually does not need to be changed
	spring.extract.schema is the schema to be verified, that is, the name of the MySQL database
	bootstrap-servers is the address of the Kafka broker; the default usually does not need to be changed
	
	Data Source Configuration

Target-side service startup configuration

To configure the target-side service, modify the application-sink.yml file:
	server.port is the web port of the sink-side extraction service; the default usually does not need to be changed
	logging.config is the absolute path of the log configuration file config/log4j2sink.xml
	spring.check.server-uri is the request address of the verification service; the default usually does not need to be changed
	spring.extract.schema is the schema to be verified, corresponding to the name of the MySQL database
	bootstrap-servers is the address of the Kafka broker; the default usually does not need to be changed
	
	Data Source Configuration

Start datachecker performance service

Use the extract-endpoints.sh script to start the source and sink extraction services:
sh extract-endpoints.sh start|stop|restart
Use the check-endpoint.sh script to start the verification service:
sh check-endpoint.sh start|stop|restart

The extraction service must be started first, and then the verification service.
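
The required start order can be captured in a small wrapper. This is only a sketch: the function name `start_datachecker` is hypothetical, and it assumes both scripts live in the same working directory. It does not wait for the extraction services to finish loading table information.

```shell
#!/bin/sh
# Hypothetical wrapper: start the extraction services, then the verification service.
# $1: directory that contains extract-endpoints.sh and check-endpoint.sh
start_datachecker() {
    dir=$1
    # Extraction (source and sink) must come first; stop if it fails.
    ( cd "$dir" && sh extract-endpoints.sh start ) || return 1
    # The verification service is started only after the extraction script returned successfully.
    ( cd "$dir" && sh check-endpoint.sh start )
}

# Example:
#   start_datachecker /path/to/datachecker
```

Each script runs in a subshell so that the caller's working directory is left unchanged.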

Background start command

nohup java -jar datachecker-extract-0.0.1.jar --source >/dev/null 2>&1 &

nohup java -jar datachecker-extract-0.0.1.jar --sink >/dev/null 2>&1 &

nohup java -jar datachecker-check-0.0.1.jar >/dev/null 2>&1 &

After the verification service has fully started, it automatically initiates a verification request.

**Remarks:**

1. For single-instance verification, use the sh scripts to start the verification service. To run verifications in parallel, copy the current working directory, reconfigure the copy, and start it with the java background start commands.

2. After the extraction service is started, it automatically loads the table information of the database. If the data volume is large, this loading takes a long time.

3. After the verification service is started, it checks whether the extraction services have finished loading the table information. If loading is not completed within a certain period of time, the verification service exits automatically. In that case, check the table information loading progress of the source and target in the log output, or simply restart the verification service.

4. To start incremental verification, set debezium-enable: true in the source-side configuration file config/application-source.yml and configure the other Debezium-related settings, then start the services.

Starting the services locally for development

Add the following VM option to the startup configuration:

Source-side extraction service
-Dspring.config.additional-location=.\config\application-source.yml

Target-side extraction service
-Dspring.config.additional-location=.\config\application-sink.yml

Verification service
-Dspring.config.additional-location=.\config\application.yml
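
Outside an IDE, the same option can be passed to the JVM on the command line. The sketch below only prints the command for each service so it can be reviewed before running. The function name `local_start_command` is hypothetical, the jar names are taken from the background start commands, and forward-slash paths are an assumption for Linux or macOS shells.

```shell
#!/bin/sh
# Hypothetical helper: print the local start command for one service.
# $1: source | sink | check
local_start_command() {
    opt="-Dspring.config.additional-location=./config"
    case "$1" in
        source) echo "java $opt/application-source.yml -jar datachecker-extract-0.0.1.jar --source" ;;
        sink)   echo "java $opt/application-sink.yml -jar datachecker-extract-0.0.1.jar --sink" ;;
        check)  echo "java $opt/application.yml -jar datachecker-check-0.0.1.jar" ;;
        *)      echo "unknown service: $1" >&2; return 1 ;;
    esac
}

# Example:
#   local_start_command source
```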

Limits and Constraints

JDK 11 or later is required
The current version only supports verification of data from a MySQL source against an openGauss target
The current version only supports data verification, not table object verification
MySQL 5.7 or later is required
The current version does not support verification of geographic (geometry) data

Contribution

  1. Fork the repository
  2. Create Feat_xxx branch
  3. Commit your code
  4. Create Pull Request
