An ETL tool developed to address oversized tables in business-system databases, using time-based table splitting and hot/cold data separation.
/ # root directory
../csx-bsf-archive # archiving program
../csx-bsf-archive-demo # demo program
../pom.xml # parent POM file
../readme.md
<dependency>
<artifactId>csx-bsf-archive</artifactId>
<version>1.0.0-SNAPSHOT</version>
<groupId>com.yh.csx</groupId>
</dependency>
Configuration file
application.yml: data source configuration
## Scheduler: either xxljob or spring
## If spring is used, a cron expression must be configured for each table
etl.config.scheduler.name=xxljob
etl.config.datasource.basicDS.type = com.alibaba.druid.pool.DruidDataSource
etl.config.datasource.basicDS.driver-class-name = com.mysql.cj.jdbc.Driver
etl.config.datasource.basicDS.url = jdbc:mysql://10.252.193.28:3306/test?useUnicode=true&characterEncoding=UTF8&useSSL=false&allowMultiQueries=true&autoReconnect=true
etl.config.datasource.basicDS.username =
etl.config.datasource.basicDS.password =
etl.config.datasource.basicDS.filters = stat
etl.config.datasource.basicDS.max-active = 30
etl.config.datasource.basicDS.initial-size = 5
etl.config.datasource.basicDS.max-wait = 60000
etl.config.datasource.basicDS.time-between-eviction-runs-millis = 60000
etl.config.datasource.basicDS.min-evictable-idle-time-millis = 300000
etl.config.datasource.basicDS.validation-query = SELECT 'x'
etl.config.datasource.basicDS.test-while-idle = true
etl.config.datasource.basicDS.test-on-borrow = false
etl.config.datasource.basicDS.test-on-return = false
etl.config.datasource.basicDS.pool-prepared-statements = true
etl.config.datasource.basicDS.max-open-prepared-statements = 20
etl.config.datasource.main0.type = com.alibaba.druid.pool.DruidDataSource
etl.config.datasource.main0.driver-class-name = com.mysql.cj.jdbc.Driver
etl.config.datasource.main0.url = jdbc:mysql://10.252.193.28:3306/test?useUnicode=true&characterEncoding=UTF8&useSSL=false&allowMultiQueries=true&autoReconnect=true
etl.config.datasource.main0.username =
etl.config.datasource.main0.password =
etl.config.datasource.main0.filters = stat
etl.config.datasource.main0.max-active = 30
etl.config.datasource.main0.initial-size = 5
etl.config.datasource.main0.max-wait = 60000
etl.config.datasource.main0.time-between-eviction-runs-millis = 60000
etl.config.datasource.main0.min-evictable-idle-time-millis = 300000
etl.config.datasource.main0.validation-query = SELECT 'x'
etl.config.datasource.main0.test-while-idle = true
etl.config.datasource.main0.test-on-borrow = false
etl.config.datasource.main0.test-on-return = false
etl.config.datasource.main0.pool-prepared-statements = true
etl.config.datasource.main0.max-open-prepared-statements = 20
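The property names above follow the shape `etl.config.datasource.<key>.<setting>`, and the `<key>` part (`basicDS`, `main0`) is what a table configuration refers to. As a minimal sketch of that naming convention only, the following groups such properties by key. The class and method names are hypothetical and are not taken from csx-bsf-archive; how the tool itself binds these settings is not shown in this document.

```java
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;

// Illustrative sketch only; these names are assumptions, not part of csx-bsf-archive.
final class DataSourceKeysSketch {

    static final String PREFIX = "etl.config.datasource.";

    // Groups "etl.config.datasource.<key>.<setting>" entries by <key>.
    static Map<String, Map<String, String>> groupByKey(Properties props) {
        Map<String, Map<String, String>> grouped = new TreeMap<>();
        for (String name : props.stringPropertyNames()) {
            if (!name.startsWith(PREFIX)) {
                continue; // e.g. etl.config.scheduler.name
            }
            String rest = name.substring(PREFIX.length());
            int dot = rest.indexOf('.');
            if (dot < 0) {
                continue; // no setting name after the key
            }
            grouped.computeIfAbsent(rest.substring(0, dot), k -> new TreeMap<>())
                   .put(rest.substring(dot + 1), props.getProperty(name));
        }
        return grouped;
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("etl.config.scheduler.name", "xxljob");
        props.setProperty("etl.config.datasource.basicDS.max-active", "30");
        props.setProperty("etl.config.datasource.main0.max-active", "30");
        // Prints the data source keys that a table configuration may reference.
        System.out.println(groupByKey(props).keySet());
    }
}
```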
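Table configuration
Add a table configuration file ({table name}.yml) under the etl directory in resources, for example md_product_region.yml. To synchronize multiple tables, create one configuration file per table.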
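dataSourceKey: basicDS # source data source; a key from the data source configuration above
groupId: csx_basic_data_3-md_product_region # key value
outerSourceKey: main0 # target data source; a key from the data source configuration above
concurrent: false # whether concurrent processing is supported
cron: "*/20 * * * * ?" # required only when the spring scheduler is used (it does not support distributed scheduling); the default is the distributed xxljob scheduler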
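dbMapping:
  database: csx_basic_data_3 # source database name
  table: md_product_region # source table name (if the source is already split into several tables with the same structure, list the physical table names separated by commas and leave targetTable empty)
  relationLayer: 0 # relation level: the main table gets the smaller number and related child tables the larger number; child tables are archived before the main table
  targetTable: md_product_region # target table name
  targetPk:
    id: id # primary key column
  mapAll: true # whether to map source columns to identically named target columns
  keepPeriod: 30day # how long data is kept in the source table, e.g. 1day, 1year, 2month
  etlPeriod: 1day # length of the time window processed per run, e.g. 1day, 2hour
  splitType: monthly # table split strategy: solid, monthly, yearly, quartly
  splitSuffixFormat: YYYY_MM # suffix format of the split tables, e.g. generates the table md_product_region_2020_03
  targetColumns:
    id: id
    product_code: product_code
    region_code: region_code
    regionalized_trade_names: regionalized_trade_names
    delivery_type: delivery_type
    big_piece_number: big_piece_number
    small_piece_number: small_piece_number
    break_number: break_number
    must_sale_flag: must_sale_flag
    origin_descript: origin_descript
    product_attribute: product_attribute
    package_num: package_num
    create_time: create_time
    update_time: update_time
    create_by: create_by
    update_by: update_by
  etlCondition: " where update_time<={} and update_time>={}" # filter condition for the rows to process
  commitBatch: 300 # batch commit size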
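To make the interplay of splitSuffixFormat, keepPeriod and etlCondition more concrete, here is a minimal Java sketch. It is not the tool's implementation: the class and method names are hypothetical, it assumes YYYY_MM means calendar year and month, it assumes keepPeriod is counted back from the current date, and it assumes the two `{}` placeholders are filled in order (upper bound first, then lower bound). Only the day, month and year units are handled.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative sketch only; none of these names come from csx-bsf-archive.
final class ArchiveConfigSketch {

    private static final Pattern PERIOD = Pattern.compile("(\\d+)(day|month|year)");

    // Builds a monthly split-table name such as md_product_region_2020_03.
    // Assumption: YYYY_MM in the config means calendar year and month, which
    // in java.time is the pattern "yyyy_MM" ('Y' would be the week-based year).
    static String monthlyTable(String targetTable, LocalDate rowDate) {
        return targetTable + "_" + rowDate.format(DateTimeFormatter.ofPattern("yyyy_MM"));
    }

    // Parses values such as "30day", "2month", "1year" and returns the date
    // before which rows would be eligible for archiving (assumed semantics).
    static LocalDate keepCutoff(String keepPeriod, LocalDate today) {
        Matcher m = PERIOD.matcher(keepPeriod.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("Unsupported period: " + keepPeriod);
        }
        long amount = Long.parseLong(m.group(1));
        switch (m.group(2)) {
            case "day":
                return today.minusDays(amount);
            case "month":
                return today.minusMonths(amount);
            default:
                return today.minusYears(amount);
        }
    }

    // Turns the {} placeholders into JDBC parameters so that the bounds can be
    // bound in order: first the upper bound, then the lower bound.
    static String toPreparedSql(String table, String etlCondition) {
        return "select * from " + table + etlCondition.replace("{}", "?");
    }

    public static void main(String[] args) {
        System.out.println(monthlyTable("md_product_region", LocalDate.of(2020, 3, 15)));
        System.out.println(keepCutoff("30day", LocalDate.of(2020, 4, 30)));
        System.out.println(toPreparedSql("md_product_region",
                " where update_time<={} and update_time>={}"));
    }
}
```

Under these assumptions, a row dated 2020-03-15 maps to md_product_region_2020_03, matching the example in the splitSuffixFormat comment.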
4. Configure XXL-JOB: A. Create an executor (with the same name as the executor of the existing task); B. Configure the job handler bean: etlHandler
5. For BSF usage, see https://gitee.com/yhcsx/csx-bsf-all