SeaTunnel was formerly named Waterdrop and was renamed SeaTunnel on October 12, 2021.
SeaTunnel is a next-generation, high-performance, distributed tool for massive data integration. It can stably and efficiently synchronize tens of billions of records every day, and it is used in production at many companies.
SeaTunnel focuses on data integration and data synchronization, and is mainly designed to solve common problems in the field of data integration.
In addition, SeaTunnel provides a Connector API that does not depend on a specific execution engine. Connectors (Source, Transform, Sink) developed against this API can run on several different engines; SeaTunnel Zeta Engine, Flink, and Spark are currently supported.
The runtime process of SeaTunnel is shown in the figure above.
The user configures the job information and selects the execution engine to submit the job.
The Source Connector is responsible for parallelizing the data and sending it downstream to a Transform or directly to the Sink, and the Sink writes the data to the destination. Note that Source, Transform, and Sink connectors can all be easily developed and extended by users.
The default engine used by SeaTunnel is SeaTunnel Engine. If you choose the Flink or Spark engine, SeaTunnel packages the Connector into a Flink or Spark program and submits it to Flink or Spark to run.
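To make the Source-to-Sink flow described above concrete, here is a minimal sketch of a job configuration. It is modeled on the quick-start template and is illustrative only: the `FakeSource` and `Console` connectors and the option names shown (`parallelism`, `job.mode`, `row.num`, `schema`) are assumptions that may differ between SeaTunnel versions, so check the documentation for the release you run.

```hocon
# Illustrative job sketch, not a definitive configuration.
# Option names are assumed from the quick-start template and may vary by version.
env {
  parallelism = 1        # default parallelism for the job (assumed value)
  job.mode = "BATCH"     # run once over a bounded input
}

source {
  # FakeSource generates synthetic rows, so no external system is needed.
  FakeSource {
    row.num = 16
    schema = {
      fields {
        name = "string"
        age = "int"
      }
    }
  }
}

# A transform block could sit here; it is omitted because the
# Source can send data directly to the Sink.

sink {
  # Console prints each row it receives to standard output.
  Console {}
}
```

The same file is intended to work regardless of which engine runs the job, since the connectors are written against the engine-independent Connector API. The quick-start guides linked in this README describe how to submit a job to each engine.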
Source Connectors supported check out
Sink Connectors supported check out
Transform supported check out
Download the ready-to-run software package from: https://seatunnel.apache.org/download
By default, SeaTunnel uses SeaTunnel Zeta Engine as the runtime execution engine for data synchronization. We highly recommend using the Zeta Engine as the runtime engine, as it offers superior functionality and performance. SeaTunnel also supports using Flink or Spark as the execution engine.
SeaTunnel Zeta Engine: https://seatunnel.apache.org/docs/start-v2/locally/quick-start-seatunnel-engine/
Spark: https://seatunnel.apache.org/docs/start-v2/locally/quick-start-spark
Flink: https://seatunnel.apache.org/docs/start-v2/locally/quick-start-flink
Weibo's business teams use an internally customized version of SeaTunnel, together with its sub-project Guardian for monitoring SeaTunnel-on-YARN tasks, to run hundreds of real-time streaming computing tasks.
Various logs from business services are collected into Apache Kafka; some of the data in Apache Kafka is consumed and extracted through SeaTunnel and then stored in ClickHouse.
Sina Data Operation Analysis Platform uses SeaTunnel to perform real-time and offline analysis of data operation and maintenance for Sina News, CDN, and other services, and writes the results into ClickHouse.
Sogou Qiqian System uses SeaTunnel as an ETL tool to help build a real-time data warehouse system.
SeaTunnel provides real-time streaming and offline SQL computing of e-commerce user behavior data for Yonghui Life, a new retail brand of Yonghui Yunchuang Technology.
For more use cases, please refer to: https://seatunnel.apache.org/blog
This project adheres to the Contributor Covenant code of conduct. By participating, you are expected to uphold this code. Please follow the REPORTING GUIDELINES to report unacceptable behavior.
Thanks to all developers!
Please follow this document.
Send an email to dev-subscribe@seatunnel.apache.org, then follow the reply to subscribe to the mailing list.
SeaTunnel enriches the CNCF CLOUD NATIVE Landscape.
Various companies and organizations use SeaTunnel for research, production and commercial products. Visit our website to find the user page.