title | weight | type | aliases |
---|---|---|---|
Data Pipeline | 1 | docs | |

Since events in Flink CDC flow from upstream to downstream in a pipeline manner, the whole ETL task is referred to as a Data Pipeline.
A pipeline corresponds to a chain of operators in Flink.
To describe a Data Pipeline, the following parts are required:
- source
- sink
- pipeline

The following parts are optional:
- route
The following YAML file defines a concise Data Pipeline that synchronizes all tables in the MySQL app_db database to Doris:
source:
  type: mysql
  hostname: localhost
  port: 3306
  username: root
  password: 123456
  tables: app_db.\.*

sink:
  type: doris
  fenodes: 127.0.0.1:8030
  username: root
  password: ""

pipeline:
  name: Sync MySQL Database to Doris
  parallelism: 2
The following YAML file defines a more complex Data Pipeline that synchronizes all tables in the MySQL app_db database to Doris, writing them to the target database ods_db with the target table name prefix ods_:
source:
  type: mysql
  hostname: localhost
  port: 3306
  username: root
  password: 123456
  tables: app_db.\.*

sink:
  type: doris
  fenodes: 127.0.0.1:8030
  username: root
  password: ""

route:
  - source-table: app_db.orders
    sink-table: ods_db.ods_orders
  - source-table: app_db.shipments
    sink-table: ods_db.ods_shipments
  - source-table: app_db.products
    sink-table: ods_db.ods_products

pipeline:
  name: Sync MySQL Database to Doris
  parallelism: 2
The following pipeline-level configuration options are supported:
parameter | meaning | optional/required |
---|---|---|
name | The name of the pipeline, which will be submitted to the Flink cluster as the job name. | optional |
parallelism | The global parallelism of the pipeline. | required |
local-time-zone | The time zone ID of the current session. | optional |
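
As an illustration of the options in the table, a `pipeline` block that sets all three might look like the sketch below. The job name and parallelism are taken from the examples above. The time zone value `Asia/Shanghai` is an assumption chosen only for illustration, so check the accepted ID format against the Flink CDC version in use.

```yaml
pipeline:
  # Submitted to the Flink cluster as the job name (optional)
  name: Sync MySQL Database to Doris
  # Global parallelism of the pipeline
  parallelism: 2
  # Session time zone ID (optional); illustrative value, not taken from the original examples
  local-time-zone: Asia/Shanghai
```

This block replaces the `pipeline` section of either definition above; the `source`, `sink`, and `route` sections stay unchanged.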